wfondrie / ppx

A Python interface to proteomics data repositories
https://ppx.readthedocs.io
MIT License
29 stars 4 forks source link

Add support for cloud storage backends #26

Closed sooheon closed 2 years ago

sooheon commented 2 years ago

Resolves #24

Adds one dependency: cloudpathlib

cloudpathlib is API compatible with pathlib. Very few lines needed to be changed: refraining from calling Path directly in ftp.py, and dispatching to either Path or CloudPath in project.py when local is received.

No tests, not sure how to test without leaking credentials. Any feedback?

wfondrie commented 2 years ago

Thanks @sooheon! I'll take a close look at the code soon. Cloudpathlib seems like a reasonably popular and well-maintained package, so I have no problem adding it as a standard dependency.

As for testing - you're right that credentials do pose a problem. I'll look into what other folks do (such as cloudpathlib itself) for testing.

wfondrie commented 2 years ago

Also, I took a look at how we can add tests. Fortunately cloudpathlib provides ways to setup tests! https://cloudpathlib.drivendata.org/stable/testing_mocked_cloudpathlib/

sooheon commented 2 years ago

Thanks for the comments, looks good to me.

Do you have further pointers / opinions on tests? I can see the docs but still a little lost how to structure them and what functionality should be tested.

codecov-commenter commented 2 years ago

Codecov Report

Merging #26 (ba1722a) into master (7f26c14) will increase coverage by 0.00%. The diff coverage is 97.56%.

@@           Coverage Diff           @@
##           master      #26   +/-   ##
=======================================
  Coverage   93.90%   93.90%           
=======================================
  Files           9        9           
  Lines         574      591   +17     
=======================================
+ Hits          539      555   +16     
- Misses         35       36    +1     
Impacted Files Coverage Δ
ppx/factory.py 88.60% <ø> (ø)
ppx/massive.py 97.36% <ø> (ø)
ppx/pride.py 94.73% <ø> (-0.20%) :arrow_down:
ppx/ftp.py 97.01% <94.11%> (-0.57%) :arrow_down:
ppx/config.py 100.00% <100.00%> (ø)
ppx/project.py 95.28% <100.00%> (-0.22%) :arrow_down:
ppx/utils.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 7f26c14...ba1722a. Read the comment docs.

wfondrie commented 2 years ago

Hi @sooheon - sorry it took me so long to get around to this. I ended up writing tests and reorganizing things a bit, but I think it's ready to go!

For the tests, I had to create a new pytest fixture using the local module within CloudPathLib to mock up a real CloudPath: https://github.com/wfondrie/ppx/pull/26/files#diff-e52e4ddd58b7ef887ab03c04116e676f6280b824ab7469d5d3080e5cba4f2128R35-R52

I then ended up using this fixture to re-run some of the download tests using a cloud URI in tests/unit_tests/test_cloud.py. I also added tests to verify that using ppx.set_data_dir() or setting PPX_DATA_DIR to a cloud path worked correct in tests/unit_tests/test_config.py.

sooheon commented 2 years ago

Thanks, learned something new.

wfondrie commented 2 years ago

Thank you for taking the issue and running with it!

I'm going to mention the new feature on Twitter soon - do you have a handle I could mention to highlight it was your contribution? Of course, it's also ok if you have a handle, but don't want to be mentioned.

sooheon commented 2 years ago

Thanks, it's @de__cide