CenterForOpenScience / waterbutler

WaterButler is a Python web application for interacting with various file storage services via a single RESTful API, developed at Center for Open Science.
Apache License 2.0

[ENG-4624] [S3 Improvements] Project PR - Waterbutler Part #406

Closed cslzchen closed 1 year ago

cslzchen commented 1 year ago

Purpose

Enables WB to support both bucket-root and subfolder-root configuration for S3.

Credit @Johnetordoff for all the work 👍

Project notion: https://www.notion.so/cos/S3-improvements-476a5b07cb7e4f458e9cd4c77cfa03ec
OSF Part: https://github.com/CenterForOpenScience/osf.io/pull/10416

Changes

DevOps Notes

Dev Notes

Here are all child PRs that have been merged into this feature branch.

QA Notes

See the QA docs on the project Notion page.

Documentation

Update our developers doc for S3 (if any)

Side Effects

Ticket

https://openscience.atlassian.net/browse/ENG-4624

coveralls commented 1 year ago

Coverage Status

coverage: 88.925% (-0.09%) from 89.014% when pulling 7683994054dd906c7c82afb0159a9be650cfde3c on feature/s3-improvements into 27457802ac41619599db5b66c69e758611c11582 on develop.

felliott commented 1 year ago

I think the thing we need to be looking at for this is the prepend kwarg to WaterButlerPath. When building a WBPath object, the first arg is the relative path (/bar/baz.txt) and the prepend kwarg is the storage provider root (/foo/). Such a WBPath would represent the file /foo/bar/baz.txt on the storage provider, but its path metadata will be reported as /bar/baz.txt, since that is the part relative to the project root. See the owncloud provider for a reference. owncloud is both path-based and supports subfolder roots, and is probably the closest thing to what you want to do here.
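To make the relative-path versus storage-root distinction concrete, here is a minimal standalone sketch. It does not import WaterButler; RootedPath and its attribute names are hypothetical stand-ins used only to illustrate the idea behind the prepend kwarg, not the actual WaterButlerPath implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootedPath:
    """Hypothetical illustration of a path with a storage-root prefix."""

    path: str            # path relative to the project root, e.g. "/bar/baz.txt"
    prepend: str = "/"   # storage provider root, e.g. "/foo/"

    @property
    def reported_path(self) -> str:
        # What metadata would report: only the part relative to the project root.
        return self.path

    @property
    def full_path(self) -> str:
        # Where the file would actually live on the storage provider.
        return self.prepend.rstrip("/") + "/" + self.path.lstrip("/")


subfolder = RootedPath("/bar/baz.txt", prepend="/foo/")
bucket_root = RootedPath("/bar/baz.txt")
```

Under these assumptions, a bucket-root configuration is just the case where the prefix is "/", so the same code path can serve both configurations.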

The other thing I would recommend is to go ahead and do the {bucket}:{storage_root} split in the init method. You can save the results in attributes and refer to them throughout the code.
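A rough sketch of that split, assuming the setting arrives as a single "bucket:storage_root" string under a "bucket" key. The helper split_bucket_setting, the class S3LikeProvider, and the attribute names are all hypothetical and not taken from the WaterButler code base.

```python
from typing import Tuple


def split_bucket_setting(value: str) -> Tuple[str, str]:
    """Split "bucket:storage_root" into (bucket, normalized root prefix).

    A missing or empty root yields "/", i.e. the bucket root.
    """
    bucket, _sep, root = value.partition(":")
    root = root.strip("/")
    prefix = f"/{root}/" if root else "/"
    return bucket, prefix


class S3LikeProvider:
    """Hypothetical stand-in, not the real WaterButler S3 provider."""

    def __init__(self, settings: dict):
        # Do the split once here and keep the results as attributes,
        # so the rest of the provider never re-parses the raw setting.
        self.bucket_name, self.storage_root = split_bucket_setting(settings["bucket"])


provider = S3LikeProvider({"bucket": "my-bucket:foo/bar"})
```

Normalizing the root to a leading and trailing slash at construction time means later code can concatenate it with relative paths without checking for separators again.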

cslzchen commented 1 year ago

fyi, Johnny's response to CR https://github.com/CenterForOpenScience/waterbutler/pull/409 has been merged as https://github.com/CenterForOpenScience/waterbutler/pull/406/commits/7683994054dd906c7c82afb0159a9be650cfde3c