NCAS-CMS / PyActiveStorage

Python implementation of Active Storage

[DRAFT] Test optimal kerchunk #186

Open valeriupredoi opened 9 months ago

valeriupredoi commented 9 months ago

Description

This is a sandbox for testing a slightly modified Kerchunk that kerchunks only a desired Dataset. The good folk at Kerchunk are already investigating means of implementing such an option in https://github.com/fsspec/kerchunk/pull/424, and some timing test results can be seen in https://github.com/fsspec/kerchunk/pull/424#issuecomment-1961482253 (after one sets default_cache_type to first, as Martin recommended).
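To make "only a desired Dataset" concrete, here is a minimal sketch of what a single-variable reference set looks like. It assumes the Kerchunk version 1 reference layout, where keys are Zarr paths and chunk entries are `[url, offset, length]` triples. The helper `select_variable_refs`, the variable names, the URL, and the byte ranges are all hypothetical. Note that this filters an already-built reference set, which is not what the Kerchunk PR does: the speed-up discussed here comes from never scanning the other datasets in the first place.

```python
def select_variable_refs(refs: dict, variable: str) -> dict:
    """Keep root-level metadata plus every key that belongs to `variable`.

    Hypothetical helper for illustration only; it is not part of Kerchunk
    or PyActiveStorage.
    """
    prefix = f"{variable}/"
    return {
        key: value
        for key, value in refs.items()
        # Root-level keys (".zgroup", ".zattrs") contain no "/".
        if "/" not in key or key.startswith(prefix)
    }


# Made-up reference set: the URL, offsets, and sizes are placeholders.
all_refs = {
    ".zgroup": '{"zarr_format": 2}',
    ".zattrs": "{}",
    "tas/.zarray": "{}",
    "tas/0.0": ["s3://bucket/file.nc", 4096, 1024],
    "tas/0.1": ["s3://bucket/file.nc", 5120, 1024],
    "pr/.zarray": "{}",
    "pr/0.0": ["s3://bucket/file.nc", 6144, 2048],
}

tas_refs = select_variable_refs(all_refs, "tas")
print(sorted(tas_refs))
```

Only the `tas` chunk references and the root metadata survive, so a consumer has far fewer keys to load and parse.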

Label: needs new Kerchunk

Update as of February 29

With the Kerchunk PR https://github.com/fsspec/kerchunk/pull/424 now merged and the issues related to the newer Kerchunk functionality ironed out, this is ready to merge once Kerchunk releases a version that contains that PR. Test results are very promising:

Test time / time it spends before going to remote Reductionist
--------------------------------------------------------------
4.87        (0.71)
5.05        (1.06)
7.02        (3.03)
4.84        (0.81)
4.72        (0.74)
4.77        (0.73)
4.75        (0.78)

So the time spent in PyActiveStorage is sub-1s in five of the seven runs (the other two took 1.06 and 3.03)! This is for the tests/test_compression_remote_reductionist.py::test_compression_and_filters_cmip6_forced_s3_from_local_bigger_file_v1 test.
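For a quick summary of the timings above, here is a small sketch using Python's standard `statistics` module. It assumes the figures are in seconds (as the "sub-1s" remark suggests) and simply copies them from the table. The variable names are illustrative.

```python
from statistics import mean, median

# (total test time, time spent before going to remote Reductionist),
# copied from the table above; units assumed to be seconds.
runs = [
    (4.87, 0.71),
    (5.05, 1.06),
    (7.02, 3.03),
    (4.84, 0.81),
    (4.72, 0.74),
    (4.77, 0.73),
    (4.75, 0.78),
]

totals = [total for total, _ in runs]
local = [before for _, before in runs]

summary = {
    "median_total": median(totals),
    "median_local": median(local),
    "mean_local": round(mean(local), 2),
    # Number of runs where the local (pre-Reductionist) part took under 1 s.
    "runs_under_1s": sum(t < 1.0 for t in local),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The median is less sensitive than the mean to the single slow run (7.02 total, 3.03 local), so it is probably the better figure to quote for typical behaviour.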

valeriupredoi commented 8 months ago

It appears the numcodecs pin has not been fixed yet in kerchunk=0.2.4; going to go chase the feedstock.

codecov[bot] commented 8 months ago

Codecov Report

Attention: Patch coverage is 84.61538% with 14 lines in your changes missing coverage. Please review.

Project coverage is 88.52%. Comparing base (00d4b4a) to head (730f6ec). Report is 38 commits behind head on main.

:exclamation: Current head 730f6ec differs from pull request most recent head d82b916

Please upload reports for the commit d82b916 to get more accurate results.

| Files | Patch % | Lines |
|---|---|---|
| activestorage/netcdf_to_zarr.py | 80.82% | 14 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #186      +/-   ##
==========================================
+ Coverage   87.78%   88.52%   +0.74%
==========================================
  Files           8        8
  Lines         696      741      +45
==========================================
+ Hits          611      656      +45
  Misses         85       85
```


valeriupredoi commented 8 months ago

OK new Kerchunk=0.2.4 works very well, and I also fixed the dreaded SegFault that was plaguing us until today :partying_face: