respec / HSPsquared

Hydrologic Simulation Program Python (HSPsquared)
GNU Affero General Public License v3.0
43 stars 17 forks source link

Merge `setup_pip` in `develop` after testing #73

Closed aufdenkampe closed 2 years ago

aufdenkampe commented 2 years ago

To address

@timcera created PR:

Since then, we've added number of important commits, including those related to:

Therefore, we've pulled PR #65 into a new setup_pip branch based on the latest commits in develop.

@rburghol, feel free to continue your testing with this new branch, which would resolve the issue you described in https://github.com/respec/HSPsquared/issues/69#issuecomment-1006914185.

aufdenkampe commented 2 years ago

@sjordan29, please use the setup_pip branch to test the capabilities described in https://github.com/respec/HSPsquared/pull/65#issue-1066615371, and report back here.

aufdenkampe commented 2 years ago

@sjordan29, I just added some updates from develop, including some slightly modified environment files. For now, we really only need to focus on the main environment.yml, as the new environment_dev.yml introduces Python 3.9 and we don't have time to really test that for this release.

@timcera, I noticed that mando hasn't been updated since 2017 or tested on Python 3.9. We'll be wanting to migrate to Python 3.9 in the next round of work. Is there an alternative to Mando that is being actively maintained? We don't need to figure this out now, but its something that we'll need to figure out in coming months if we want to maintain your CLI functionality.

sjordan29 commented 2 years ago

A few errors:

  1. In a fresh environment, reading the pull version number from "__version__" variable in "HSP2/__init__.py" using from HSP2 import __version__ throws ModuleNotFoundError since importing HSP2 imports modules (e.g. numpy, pandas) that have not yet been installed. A couple thoughts on resolving this:
def get_version(fname):
    with open(fname) as fp:
        for line in fp.readlines():
            if "__version__" in line:
                __version__ = line.split(" ")[2].rstrip()
    return __version__

__version__ = get_version('HSP2/__init__.py')
  1. I added HSP2IO to packages in setup.py: packages=["HSP2", "HSP2tools", "HSP2IO"],
  2. After making these changes, I was able to install from current local directory using pip: pip install . and it appears to successfully build. I was then able to run all of the hsp2 --help examples from https://github.com/respec/HSPsquared/pull/65#issue-1066615371. However, I run into the following error when I actually try to run:
    
    (hsp_testing) C:\Users\smjor\OneDrive\Desktop\HSP2>hsp2 import_uci test10.uci test10.h5
    c:\users\smjor\hsp_testing\lib\site-packages\HSP2tools\readUCI.py:340: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead.  To get a de-fragmented frame, use `newframe = frame.copy()`
    df[par_name] = def_val
    39 reading from wdm
    41 reading from wdm
    42 reading from wdm
    46 reading from wdm
    113 reading from wdm
    119 reading from wdm
    121 reading from wdm
    122 reading from wdm
    123 reading from wdm
    124 reading from wdm
    125 reading from wdm
    126 reading from wdm
    127 reading from wdm
    131 reading from wdm
    132 reading from wdm
    134 reading from wdm
    135 reading from wdm
    136 reading from wdm
    140 reading from wdm

(hsp_testing) C:\Users\smjor\OneDrive\Desktop\HSP2>dir Volume in drive C is Local Disk Volume Serial Number is F282-23A3

Directory of C:\Users\smjor\OneDrive\Desktop\HSP2

01/11/2022 11:36 AM

. 01/11/2022 11:36 AM .. 01/11/2022 11:37 AM 27,513,338 test10.h5 01/11/2022 11:08 AM 26,354,622 test10.hdf 01/11/2022 11:08 AM 37,858 test10.uci 01/11/2022 11:08 AM 163,840 test10.wdm 01/11/2022 11:08 AM 1,999,238 TEST10_hsp2_compare.ipynb 5 File(s) 56,068,896 bytes 2 Dir(s) 24,172,380,160 bytes free

(hsp_testing) C:\Users\smjor\OneDrive\Desktop\HSP2>hsp2 run test10.h5 2022-01-11 11:41:04.11 Processing started for file ./; saveall=False Traceback (most recent call last): File "C:\Users\smjor\hsp_testing\Scripts\hsp2-script.py", line 33, in sys.exit(load_entry_point('HSPsquared===-0.9.3-', 'console_scripts', 'hsp2')()) File "c:\users\smjor\hsp_testing\lib\site-packages\HSP2tools\HSP2_CLI.py", line 56, in main mando.main() File "c:\users\smjor\hsp_testing\lib\site-packages\mando\core.py", line 208, in call return self.execute(sys.argv[1:]) File "c:\users\smjor\hsp_testing\lib\site-packages\mando\core.py", line 204, in execute return command(*a) File "c:\users\smjor\hsp_testing\lib\site-packages\HSP2tools\HSP2_CLI.py", line 22, in run hsp2main(hdfname, saveall=saveall, jupyterlab=jupyterlab) File "c:\users\smjor\hsp_testing\lib\site-packages\HSP2\main.py", line 41, in main uci_obj = io_manager.read_uci() AttributeError: 'str' object has no attribute 'read_uci'


@ptomasula - do you know what's going on with this error? it seems to be related to the new IO code. 
aufdenkampe commented 2 years ago

@sjordan29, thanks for that testing, update, suggestion, and questions to @ptomasula.

I like your idea to move the version number to a single file. Could you do that?

Rather than HSP2/__version__.py, however, could you put it in the repo's root directory, because it applies to all the files in the repo and also so it is easy to find: _version__.py

ptomasula commented 2 years ago

Thanks @sjordan29

I second your suggestion of using a version.py file. That seems simple to implement, hard for something to go wrong, and simple to change in the future should we come up with a better best practice.

With regards to the other error you posted, It is partially my fault for not having main return a more usefully error code (something for a future update), but the issue resulting in the error is whatever code is invoking the 'main' function from main.py (when you execute 'hsp2 run test10.h5') is doing so incorrectly. The attribute error at the bottom of the stack trace is because 'main' received a 'str' instance for the first argument instead of an instance of the 'IOManager' class. Passing a 'str' argument for the first argument was how 'main' functioned prior to the IO Abstraction, so my guess is that this piece of code never got updated. This post in the PR for IO Abstraction documents how to run the model with the new IO Abstraction approach.

Maybe hunt around a bit more to confirm this, but I'm pretty sure that these are the lines of code you'd be looking to update.

aufdenkampe commented 2 years ago

@ptomasula, thanks for pointing us to your instructions for the new I/O abstracted capabilities at https://github.com/respec/HSPsquared/pull/68#issuecomment-998252873.

Is that approach a presently parallel approach to the old readUCI & readWDM methods? So both work right now?

ptomasula commented 2 years ago

@aufdenkampe that is correct. The functions readUCI and readWDM were not impacted by the current implementation of IO abstraction. So no need to update code calling either of those functions at this point.

Though a longer term goal should be to replace those methods with IO abstracted versions that would allow users to convert UCI and WDM files to formats other than HDF5.

timcera commented 2 years ago

@sjordan29, thanks for that testing, update, suggestion, and questions to @ptomasula.

I like your idea to move the version number to a single file. Could you do that?

Rather than HSP2/__version__.py, however, could you put it in the repo's root directory, because it applies to all the files in the repo and also so it is easy to find: _version__.py

I am not proposing this as best practice, but for my projects I keep a "VERSION" text file at the top level of the project and read it into the "setup.py" with something like:

version = open("VERSION").readline().strip()
setup(...
      version=version,
      ...)

No import needed.

As an example: https://github.com/timcera/tstoolbox/blob/master/setup.py

timcera commented 2 years ago

Maybe hunt around a bit more to confirm this, but I'm pretty sure that these are the lines of code you'd be looking to update.

Actually I think need to create an IOManager instance using 'hdfname' string here

sjordan29 commented 2 years ago

Thanks, @ptomasula, @aufdenkampe , @timcera for these suggestions.

  1. For now, _version.py is within the HSP2 directory. Doing so allowed for easy import in __init__.py to avoid an error (it seems that __version__ is referenced somewhere within the code). I can move it to the root directory once we resolve other issues.
  2. @aufdenkampe, I re-pip installed with the updates to the environment.yml file and there were some errors:
    • setup's install_requires parameter requires that specific versions are specified with a double "==" rather than a single "=".
    • The following packages could not be installed: - hdf5 ==1.10.6 and # - nb_conda. I removed these from the .yml file. Will this impact the tutorial notebooks?
  3. I updated the code where @timcera suggested. This resolved the AttributeError.
  4. Now there is a new issue:
    (hsp_testing_2) C:\Users\smjor\OneDrive\Desktop\HSP2>hsp2 run test10.h5
    2022-01-13 11:33:27.14   Processing started for file ./; saveall=False
    2022-01-13 11:33:40.88   Simulation Start: 1976-01-01 00:00:00, Stop: 1977-01-01 00:00:00
    2022-01-13 11:33:40.88      PERLND P001 DELT(minutes): 60
    2022-01-13 11:33:49.86         SNOW
    2022-01-13 11:34:16.28         PWATER
    2022-01-13 11:34:44.20         PSTEMP
    Save DataFrame Empty for RESULTS/PERLND_P001/PSTEMP
    2022-01-13 11:34:49.52         PWTGAS
    Save DataFrame Empty for RESULTS/PERLND_P001/PWTGAS
    2022-01-13 11:35:01.10      RCHRES R001 DELT(minutes): 60
    Traceback (most recent call last):
    File "C:\Users\smjor\hsp_testing_2\Scripts\hsp2-script.py", line 33, in <module>
    sys.exit(load_entry_point('HSPsquared==0.9.3', 'console_scripts', 'hsp2')())
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\HSP2tools\HSP2_CLI.py", line 60, in main
    mando.main()
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\mando\core.py", line 208, in __call__
    return self.execute(sys.argv[1:])
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\mando\core.py", line 204, in execute
    return command(*a)
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\HSP2tools\HSP2_CLI.py", line 26, in run
    hsp2main(io_manager, saveall=saveall, jupyterlab=jupyterlab)
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\HSP2\main.py", line 87, in main
    get_flows(io_manager, ts, flags, uci, segment, ddlinks, ddmasslinks, siminfo['steps'], msg)
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\HSP2\main.py", line 333, in get_flows
    data_frame = io_manager.read_ts(Category.RESULTS,x.SVOL,x.SVOLNO, sgrpn)
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\HSP2IO\io.py", line 84, in read_ts
    return self._output.read_ts(category, operation, segment, activity)
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\HSP2IO\hdf.py", line 84, in read_ts
    return read_hdf(self._store, path)
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\pandas\io\pytables.py", line 425, in read_hdf
    return store.select(
    File "c:\users\smjor\hsp_testing_2\lib\site-packages\pandas\io\pytables.py", line 820, in select
    raise KeyError(f"No object named {key} in the file")
    KeyError: 'No object named RESULTS/PERLND_P001/PWTGAS in the file'
    Closing remaining open files:test10.h5...done

    I can continue to dig into this, but wanted to flag the error in case there is a simple fix.

sjordan29 commented 2 years ago

KeyError resolved -- we had to change the default value for saveall this line to True. Thanks @ptomasula!

I am now able to pip install, then set up and run the model from the command line.

sjordan29 commented 2 years ago

@timcera that is good to know. The error with the install_requires parameter started when @aufdenkampe added more specific version numbers to the environment.yml file, which appear to require the == to be properly parsed; the function worked perfectly with the prior version of the environment.yml file.

One thought - I can try to update the process_env_yaml function to add that additional equals sign when necessary.

aufdenkampe commented 2 years ago

@timcera and @sjordan29, thanks for all your help with the PR! I think it is ready to merge!