ioos / compliance-checker

Python tool to check your datasets against compliance standards
http://ioos.github.io/compliance-checker/
Apache License 2.0
108 stars 58 forks source link

Should fix more but now all "Error in sys.excepthook:" #1069

Closed ocefpaf closed 4 months ago

ocefpaf commented 4 months ago

Solves some of the Error in sys.excepthook: reported in #1067

In this PR:

@benjwadams if you agree with these changes please merge and I'll chase the other Error in sys.excepthook in another PR, as far as I can see they are not related to caching but I may be wrong.

ocefpaf commented 4 months ago

pre-commit.ci autofix

codecov[bot] commented 4 months ago

Codecov Report

Attention: Patch coverage is 86.04651% with 18 lines in your changes are missing coverage. Please review.

Project coverage is 81.91%. Comparing base (064aac9) to head (6b98a9d). Report is 20 commits behind head on develop.

:exclamation: Current head 6b98a9d differs from pull request most recent head b137db3. Consider uploading reports for the commit b137db3 to get more accurate results

Files Patch % Lines
compliance_checker/cf/cf_base.py 76.47% 8 Missing :warning:
compliance_checker/cfutil.py 93.25% 0 Missing and 6 partials :warning:
compliance_checker/cf/util.py 25.00% 3 Missing :warning:
compliance_checker/runner.py 0.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## develop #1069 +/- ## =========================================== - Coverage 82.18% 81.91% -0.28% =========================================== Files 24 24 Lines 5171 5171 Branches 1242 1237 -5 =========================================== - Hits 4250 4236 -14 - Misses 622 634 +12 - Partials 299 301 +2 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

ocefpaf commented 4 months ago

I would ignore that pre-commit failure for now. I cannot reproduce that one locally. Seems to be a bug.

jcermauwedu commented 4 months ago

I still see sys.excepthook errors with this PR.

pytest --cache-clear
================================================================================= test session starts ==================================================================================
platform linux -- Python 3.11.9, pytest-8.2.0, pluggy-1.5.0
rootdir: /home/portal/src/test/compliance-checker
configfile: pyproject.toml
plugins: flake8-1.1.1, requests-mock-1.12.1, time-machine-2.14.1
collected 235 items

compliance_checker/tests/test_acdd.py ................                                                                                                                           [  6%]
compliance_checker/tests/test_base.py ........                                                                                                                                   [ 10%]
compliance_checker/tests/test_cf.py ...........................................................................................                                                  [ 48%]
compliance_checker/tests/test_cf_integration.py .................                                                                                                                [ 56%]
compliance_checker/tests/test_cli.py .........                                                                                                                                   [ 60%]
compliance_checker/tests/test_feature_detection.py ...............................                                                                                               [ 73%]
compliance_checker/tests/test_ioos_profile.py ..........................................                                                                                         [ 91%]
compliance_checker/tests/test_ioos_sos.py ..                                                                                                                                     [ 91%]
compliance_checker/tests/test_protocols.py ....s                                                                                                                                 [ 94%]
compliance_checker/tests/test_suite.py .............                                                                                                                             [ 99%]
compliance_checker/tests/test_util.py .                                                                                                                                          [100%]

====================================================================== 234 passed, 1 skipped in 63.22s (0:01:03) =======================================================================
Error in sys.excepthook:

Original exception was:
Error in sys.excepthook:

Original exception was:
Error in sys.excepthook:

Original exception was:
Error in sys.excepthook:

Original exception was:
Error in sys.excepthook:

Original exception was:
Error in sys.excepthook:

Original exception was:
Error in sys.excepthook:

Original exception was:
Error in sys.excepthook:

Original exception was:
ocefpaf commented 4 months ago

I still see sys.excepthook errors with this PR.

Yep. Those are the ones related to the mocked netCDF. I still could not get rid of them... At least we reduced them a bit 😬

ocefpaf commented 4 months ago

@jcermauwedu latest commit squashed the remaining ones. Do you mind taking another look?

BTW, is the segfault gone? I'm concerned about that b/c it can lead to corrupted files!

jcermauwedu commented 4 months ago

I will exercise it a bit to see if the segfault appears.

jcermauwedu commented 4 months ago

After 100 straight iterations, no segfault.

benjwadams commented 4 months ago

@ocefpaf, LRU caching was used because some functions repeatedly used get_variables_by_attributes, considerably slowing down certain datasets, see: https://github.com/ioos/compliance-checker/issues/70

In theory, since the datasets are read-only, only one set of attribute fetches should be needed.

I'm OK with regressing performance since the cache implementation appears to be causing issues.

ocefpaf commented 4 months ago

@ocefpaf, LRU caching was used because some functions repeatedly used get_variables_by_attributes, considerably slowing down certain datasets, see: https://github.com/ioos/compliance-checker/issues/70

@benjwadams, like I mentioned above, get_variables_by_attributes caching was upstreamed. If that is the only source of slow downs, then there will be no performance regressions. See https://github.com/Unidata/netcdf4-python/pull/1253

ocefpaf commented 4 months ago

@benjwadams I'd love to get this one in before the code sprint to avoid code conflicts and to help folks get a dev environment that doesn't segfault.