cheapandfair / cheapandfair-gateways-2024


notebook to copy dataset and create a manifest #1

Closed rpwagner closed 4 days ago

rpwagner commented 2 weeks ago

This notebook shows how to use Globus to copy a dataset from one guest collection to another. It also generates the manifest and uploads it to the destination.

We can save the manifest locally to help build the dataset landing pages.

Needs more narrative text and maybe some diagrams, unless those end up in the presentation.

zonca commented 5 days ago

@rpwagner ahah creating the manifests like this is magical!

zonca commented 5 days ago

@rpwagner what about making copydatasets more general and moving it to the template repo? Also, I think we should keep two ways of creating the manifests, manual and automatic, so both scripts should live in the template repo and people can choose what they need.

zonca commented 5 days ago

> @rpwagner what about making copydatasets more general and moving it to the template repo?

I can make a first pass at this then ask for you to review

rpwagner commented 5 days ago

> @rpwagner what about making copydatasets more general and moving it to the template repo?
>
> I can make a first pass at this then ask for you to review

This is all good by me. To make it fully general, a user should fully specify both the source and the destination, like <source UUID>:/<source path> and <destination UUID>:/<destination path>. Then we just have a special case for when we already know where a source dataset is.
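
As a rough sketch of how such a `<UUID>:/<path>` spec could be parsed, using only the standard library: the names `CollectionPath` and `parse_collection_path` are made up for illustration and are not part of the notebook or of any Globus package.

```python
import uuid
from typing import NamedTuple


class CollectionPath(NamedTuple):
    """A collection UUID plus an absolute path on that collection (hypothetical type)."""
    collection_id: str
    path: str


def parse_collection_path(spec: str) -> CollectionPath:
    """Split '<collection UUID>:/<path>' into its two parts and validate them."""
    collection, sep, path = spec.partition(":")
    if not sep:
        raise ValueError(f"expected '<UUID>:/<path>', got {spec!r}")
    try:
        # uuid.UUID rejects malformed IDs; str() gives the canonical lowercase form.
        collection_id = str(uuid.UUID(collection))
    except ValueError:
        raise ValueError(f"not a valid collection UUID: {collection!r}") from None
    if not path.startswith("/"):
        raise ValueError(f"path must be absolute, got {path!r}")
    return CollectionPath(collection_id, path)
```

The special case for a known source dataset could then fill in the source spec from a lookup and reuse the same parser for the destination.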

The other thing it will need to do is handle mapped collections. The tokens need extra scopes for those. I can add that to the login section of the code.
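
For reference, my understanding of the Globus documentation is that a mapped collection needs a `data_access` scope attached as a dependency of the transfer scope. The sketch below only builds that scope string and makes no SDK calls. The helper name is hypothetical, and the string format is an assumption that should be checked against the current Globus docs before use.

```python
from typing import Iterable

# Base scope for the Globus Transfer service.
TRANSFER_SCOPE = "urn:globus:auth:scope:transfer.api.globus.org:all"


def transfer_scope_with_data_access(mapped_collection_ids: Iterable[str]) -> str:
    """Return the transfer scope with a data_access dependency per mapped collection.

    Assumed format: dependencies go in square brackets after the base scope,
    separated by spaces, each prefixed with '*' to mark it optional.
    """
    deps = " ".join(
        f"*https://auth.globus.org/scopes/{cid}/data_access"
        for cid in mapped_collection_ids
    )
    return f"{TRANSFER_SCOPE}[{deps}]" if deps else TRANSFER_SCOPE
```

Guest collections would pass an empty list and get the plain transfer scope back, so the login code could use one path for both cases.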

I would also split the part that generates the manifests into a separate module that can be called as a script. That way, you could create a manifest for any transfer, even if you submitted the transfer through the Globus web app or the CLI.
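
A minimal sketch of what that separate module might look like. The manifest layout here is invented for illustration and is not the format the notebook writes. The input is assumed to be a JSON list of per-file records with `source_path` and `destination_path` keys, which is roughly what I believe the Transfer API reports for a task's successful transfers.

```python
import argparse
import json
from pathlib import Path


def build_manifest(records, destination):
    """Build a manifest dict from per-file transfer records (hypothetical layout)."""
    files = [
        {"source_path": r["source_path"], "destination_path": r["destination_path"]}
        for r in records
    ]
    # Sort so the manifest is stable regardless of transfer order.
    files.sort(key=lambda f: f["destination_path"])
    return {"destination": destination, "file_count": len(files), "files": files}


def main(argv=None):
    """Command-line entry point: read records from a JSON file, write the manifest."""
    parser = argparse.ArgumentParser(description="Create a manifest for a transfer.")
    parser.add_argument("records", type=Path, help="JSON file with a list of transfer records")
    parser.add_argument("destination", help="<destination UUID>:/<destination path>")
    parser.add_argument("-o", "--output", type=Path, default=Path("manifest.json"))
    args = parser.parse_args(argv)

    records = json.loads(args.records.read_text())
    manifest = build_manifest(records, args.destination)
    args.output.write_text(json.dumps(manifest, indent=2) + "\n")
    return 0
```

Ending the module with the usual `if __name__ == "__main__":` guard that calls `main()` would make it runnable as a script, while the notebook could still import `build_manifest` directly. Because the records only need those two keys, they could come from a transfer submitted through the web app or the CLI as well.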