pysat / pysatNASA

pysat support for NASA Instruments
BSD 3-Clause "New" or "Revised" License
21 stars 6 forks source link

ENH: Allows pysatCDF to be preferentially used when present. #102

Closed rstoneback closed 2 years ago

rstoneback commented 2 years ago

Description

Addresses # (issue)

cdflib performance is not what I'd prefer. Loading C/NOFS takes about 60s with cdflib through pysat while loading via pysatCDF through pysat only takes about 1s. Performance difference matters for science. Working on a slightly tweaked pysatCDF that actually compiles on my modern system.

Type of change

Please delete options that are not relevant.

How Has This Been Tested?

I'm developing examples for pysatSeasons, or rather, fixing the existing example. Either way, I've been able to load a 67 day season of C/NOFS IVM data with this branch. Will run pysatNASA unit tests.

Test Configuration

Checklist:

rstoneback commented 2 years ago

This goes along with this https://github.com/pysat/pysatCDF/pull/33

rstoneback commented 2 years ago

While trying out the pysatModels example over in https://github.com/pysat/pysatModels/pull/102 turns out that metadata isn't being handled as it should for CINDI, and likely the other NASA instruments.

rstoneback commented 2 years ago

Pushed a commit to pysatCDF and now the match example works fine. Metalabel loading working fine now as well.

rstoneback commented 2 years ago

Eeek!

Collecting pysatCDF

          import numpy as np

[363](https://github.com/pysat/pysatNASA/runs/6412744056?check_suite_focus=true#step:4:363)
      ModuleNotFoundError: No module named 'numpy'
[364](https://github.com/pysat/pysatNASA/runs/6412744056?check_suite_focus=true#step:4:364)
      [end of output]
rstoneback commented 2 years ago

Good news is tests for this branch locally pass from a pip install of pysatCDF.

rstoneback commented 2 years ago

Eeek!

Collecting pysatCDF

          import numpy as np

[363](https://github.com/pysat/pysatNASA/runs/6412744056?check_suite_focus=true#step:4:363)
      ModuleNotFoundError: No module named 'numpy'
[364](https://github.com/pysat/pysatNASA/runs/6412744056?check_suite_focus=true#step:4:364)
      [end of output]

@jklenzing I think this is the same error we had with OMMBV. pysatCDF has moved to setup.cfg but since it has actual code in setup.py for the NASA build process there is an import numpy as np at the top. Not sure we can do anything about that in this case. Wasn't there a -no-binary option or something like that somewhere....

jklenzing commented 2 years ago

I need to think about how the extra tests will be merged in the new test structure. @rstoneback, is this still on the roadmap?

rstoneback commented 2 years ago

Yep!

I need to think about how the extra tests will be merged in the new test structure. @rstoneback https://github.com/rstoneback, is this still on the roadmap? —

jklenzing commented 2 years ago

@rstoneback, the pysatCDF install fails on three of the five CI environments. It doesn't work for windows or for the old versions of numpy (this may be due to the env definition, we install pysatCDF then update numpy to an older version).

The current setup should be fine and at least test pysatCDF in the two environments. For the others, we basically run cdflib load tests twice (once while forced, once during nominal runs when pysatCDF fails).

I've pulled pysatCDF out of the requirements lists since this is an optional install, but we should probably have some info on the readme for users who want to employ this option. This means that there will always be a working version for a quick install, and users looking for speed can grab the optional package.

rstoneback commented 2 years ago

It doesn't work for windows

I think that is the whole fortran on windows general issue. At least I hope so. At one point I believe I iterated with Barbara Emery on getting pysatCDF on windows installed. It was awhile ago but nevertheless I’m hoping it still carries on.

old versions of numpy (this may be due to the env definition, we install pysatCDF then update numpy to an older version).

Hmmm….. Thanks for the heads up.

I've pulled pysatCDF out of the requirements lists since this is an optional install, but we should probably have some info on the readme for users who want to employ this option.

Fair to pull it out of requirements. Perhaps it should be in test_requirements.txt . I was principally trying to get it on GitHub Actions to test outside my own environment.

This means that there will always be a working version for a quick install, and users looking for speed can grab the optional package.

Perfect. That is the intention.

jklenzing commented 2 years ago

There's a separate install line in the main action, so it will get installed, just not always correctly.

rstoneback commented 2 years ago

It sounds like we are as good as it is currently going to get then.