whole-tale / girder_wholetale

Girder plugin providing basic Whole Tale functionality
BSD 3-Clause "New" or "Revised" License

Minor updates to BDBag/Deriva #519

Closed Xarthisius closed 2 years ago

Xarthisius commented 2 years ago

This PR adds the following enhancements to the BDBag and Deriva providers:

  1. manifest-<alg>.txt files are parsed, and the checksums are stored on the imported Girder objects (see eb2a284d73719ca98107bd06eff3e0f90d8e6c01).
  2. manifest.json is parsed to get additional metadata (see https://github.com/whole-tale/girder_wholetale/commit/d8364faf6359fe3a1a62ca24bf63198b4f2d4a5c). Most of it is stored raw on the Girder objects, with two exceptions: mimeType is now properly set on imported items, and item identifiers are taken from the bundledAs.uri section (see https://github.com/whole-tale/girder_wholetale/commit/a1db59412cc29535329b63d925dd612417d95623).
  3. The main identifier is set on the root of the dataset, and a method for retrieving it was added (see https://github.com/whole-tale/girder_wholetale/commit/c4c9cbb40e7896c188b123097ec24998a1268c1d). This makes WT bag export "just work"™.
  4. A proper unique object identifier is added to the registered dataset (see https://github.com/whole-tale/girder_wholetale/pull/519/commits/cc18930e1cbc1c9dc04e7d1c1ca381deee99803e).
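
For readers unfamiliar with BagIt manifests, the checksum parsing in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names `parse_manifest_text` and `collect_checksums` are hypothetical, and the only thing assumed about the input is the standard BagIt line layout of a checksum, whitespace, then a payload path.

```python
import re
from pathlib import Path

# Matches manifest-md5.txt, manifest-sha256.txt, ... but not tagmanifest-*.txt
MANIFEST_RE = re.compile(r"^manifest-(?P<alg>[a-z0-9]+)\.txt$")


def parse_manifest_text(text):
    """Map payload path -> checksum for the body of one manifest-<alg>.txt.

    Hypothetical helper, not part of girder_wholetale.
    """
    checksums = {}
    for line in text.splitlines():
        # Each line is "<checksum><whitespace><path>"; the path may contain spaces
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, path = parts
        checksums[path] = digest.lower()
    return checksums


def collect_checksums(bag_root):
    """Return {path: {alg: checksum}} for every manifest-<alg>.txt in bag_root.

    Hypothetical helper; assumes the bag is already extracted to a directory.
    """
    result = {}
    for manifest in sorted(Path(bag_root).iterdir()):
        match = MANIFEST_RE.match(manifest.name)
        if not match or not manifest.is_file():
            continue
        entries = parse_manifest_text(manifest.read_text(encoding="utf-8"))
        for path, digest in entries.items():
            result.setdefault(path, {})[match.group("alg")] = digest
    return result
```

A mapping of this shape could then be attached to each imported object keyed by its payload path; how the provider actually stores it is defined in the commit referenced in item 1.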

TODO

How to test?

  1. Click on https://girder.local.wholetale.org/api/v1/integration/deriva?url=https%3A%2F%2Fpbcconsortium.s3.amazonaws.com%2Fwholetale%2F5ad7cdf55b0d5007601015b7ff1ea8d6%2F2021-11-09_21.47.58%2FDataset_1-882P.zip&force=false
  2. After importing a Tale, export it as a WT Bag (Tale > (tale menu ellipsis) > export Tale).
  3. Confirm that the bag is mostly empty (in terms of files), but that manifest-md5.txt, fetch.txt and manifest.json contain many entries (remote files). NOTE: Bags are not necessarily complete or useful at this stage. Further enhancements will surely be needed.
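
The check in step 3 can also be done with a short script instead of by eye. The sketch below is only an illustration under stated assumptions: the exported bag has already been unzipped to a local directory, payload files live under `data/`, and `fetch.txt` uses the standard BagIt layout of URL, length, and path per line. The helper names `read_fetch_entries` and `summarize_bag` are hypothetical and not part of this PR.

```python
from pathlib import Path


def read_fetch_entries(fetch_text):
    """Parse fetch.txt lines of the form "<url> <length> <path>".

    Hypothetical helper; length may be "-" when the size is unknown.
    """
    entries = []
    for line in fetch_text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        entries.append(tuple(parts))
    return entries


def summarize_bag(bag_root):
    """Count local payload files versus fetch.txt and manifest-md5.txt entries."""
    root = Path(bag_root)
    data_dir = root / "data"
    fetch_file = root / "fetch.txt"
    manifest_file = root / "manifest-md5.txt"

    payload = [p for p in data_dir.rglob("*") if p.is_file()] if data_dir.is_dir() else []
    fetch = (
        read_fetch_entries(fetch_file.read_text(encoding="utf-8"))
        if fetch_file.is_file()
        else []
    )
    manifest = (
        [ln for ln in manifest_file.read_text(encoding="utf-8").splitlines() if ln.strip()]
        if manifest_file.is_file()
        else []
    )
    return {
        "payload_files": len(payload),
        "fetch_entries": len(fetch),
        "manifest_entries": len(manifest),
    }
```

For a bag exported as described above, one would expect `payload_files` to be small while `fetch_entries` and `manifest_entries` are large.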
codecov[bot] commented 2 years ago

Codecov Report

Merging #519 (74ac6fd) into master (a251646) will increase coverage by 0.77%. The diff coverage is 95.83%.


@@            Coverage Diff             @@
##           master     #519      +/-   ##
==========================================
+ Coverage   92.15%   92.92%   +0.77%     
==========================================
  Files          58       58              
  Lines        4460     4508      +48     
==========================================
+ Hits         4110     4189      +79     
+ Misses        350      319      -31     
| Impacted Files | Coverage Δ |
| --- | --- |
| server/lib/deriva/provider.py | 96.66% <92.30%> (+28.66%) |
| server/lib/bdbag/bdbag_provider.py | 93.90% <96.15%> (+2.99%) |
| server/lib/resolvers.py | 94.02% <100.00%> (+11.94%) |
| server/lib/deriva/integration.py | 93.54% <0.00%> (+48.38%) |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update a251646...74ac6fd.