pytroll / pytroll-collectors

Collector modules for Pytroll
GNU General Public License v3.0
3 stars 18 forks source link

Refactor the fsspec to message functions #104

Closed mraspaud closed 2 years ago

mraspaud commented 2 years ago

This PR refactors the fsspec to message functions to make them more easily reusable.

@paulovcmedeiros you might be interested in this.

This needs https://github.com/fsspec/filesystem_spec/pull/950 to work fully

codecov[bot] commented 2 years ago

Codecov Report

Merging #104 (b19fd2b) into main (d85bcf9) will increase coverage by 1.75%. The diff coverage is 99.28%.

@@            Coverage Diff             @@
##             main     #104      +/-   ##
==========================================
+ Coverage   85.82%   87.57%   +1.75%     
==========================================
  Files          23       24       +1     
  Lines        3210     3357     +147     
==========================================
+ Hits         2755     2940     +185     
+ Misses        455      417      -38     
Impacted Files Coverage Δ
pytroll_collectors/fsspec_to_message.py 97.93% <97.93%> (ø)
pytroll_collectors/s3stalker.py 73.01% <100.00%> (-10.80%) :arrow_down:
pytroll_collectors/tests/test_fsspec_to_message.py 100.00% <100.00%> (ø)
pytroll_collectors/tests/test_s3stalker.py 100.00% <100.00%> (+0.44%) :arrow_up:
pytroll_collectors/triggers/__init__.py 57.14% <0.00%> (-12.86%) :arrow_down:
pytroll_collectors/helper_functions.py 67.30% <0.00%> (ø)
...troll_collectors/tests/test_geographic_gatherer.py 100.00% <0.00%> (ø)
pytroll_collectors/file_notifiers.py
pytroll_collectors/geographic_gatherer.py 86.42% <0.00%> (+0.93%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update d85bcf9...b19fd2b. Read the comment docs.

mraspaud commented 2 years ago

@pnuu Example messages generated with the new functions:

pytroll://1b/hrit-segment/0deg file a001673@c21748.ad.smhi.se 2022-04-28T14:36:25.364426 v1.01 application/json {"filesystem": {"cls": "fsspec.implementations.sftp.SFTPFileSystem", "protocol": "ssh", "args": [], "host": "c21748.ad.smhi.se"}, "uri": "ssh://c21748.ad.smhi.se/tmp/pytest-of-a001673/pytest-57/test_simple_publishing0/source/H-000-202204281436-__", "uid": "ssh://c21748.ad.smhi.se/tmp/pytest-of-a001673/pytest-57/test_simple_publishing0/source/H-000-202204281436-__", "sensor": "seviri", "variant": "0DEG"}

pytroll://1b/hrit-segment/0deg dataset a001673@c21748.ad.smhi.se 2022-04-28T14:36:28.429881 v1.01 application/json {"dataset": [{"filesystem": {"cls": "fsspec.implementations.tar.TarFileSystem", "protocol": "tar", "args": [], "fo": "/tmp/pytest-of-a001673/pytest-57/test_simple_publishing_with_un0/source/H-000-202204281436-__.tar", "target_protocol": "ssh", "target_options": {"host": "c21748.ad.smhi.se", "protocol": "ssh"}}, "uri": "tar://tmp/pytest-of-a001673/pytest-57/test_simple_publishing_with_un0/source/H-000-202204281436-__::ssh://c21748.ad.smhi.se/tmp/pytest-of-a001673/pytest-57/test_simple_publishing_with_un0/source/H-000-202204281436-__.tar", "uid": "tar://tmp/pytest-of-a001673/pytest-57/test_simple_publishing_with_un0/source/H-000-202204281436-__"}], "sensor": "seviri", "variant": "0DEG"}
pnuu commented 2 years ago

Please remind me, what was the use for the filesystem part of the message dictionary? As far as I can see it isn't needed by fsspec:

uri = "tar://tmp/bar.txt::ssh://127.0.0.1/tmp/test.tar"
fsfiles = fsspec.open_files(uri)
print(fsfiles[0])
# <OpenFile 'tmp/bar.txt'>

And what does fo stand for? If "output file", then isn't the tar file actually the input where the data is coming from and then stored locally as uid?

mraspaud commented 2 years ago

the filesystem item is just here to make sure we have everything. Indeed, if we were to include eg filecache, the specific options for it can't be added to the uri, but need to be provided as storage options inside filesystem. Now, it is true that the implemented functions do not add filecache yet...

Regarding fo, from what I gather it means file object.

mraspaud commented 2 years ago

@pnuu there was a significant change in the last commit, maybe you want to have a look at it again.