globus / globus-flows-trigger-examples

Sample code for triggering Globus Flows using the Python watchdog library.
Apache License 2.0

Tar/Untar in a Flow? #19

Open ax3l opened 4 months ago

ax3l commented 4 months ago

Hi,

A common workflow I want to express in Globus is to:

And the opposite:

Is there a flow for this or could this be made an example?

Currently, for really large data sets I am still doing a combination of Globus + screen for untar, and there must be a better way.

ada-globus commented 4 months ago

Hi!

The cleanest way to solve this today would be with Compute endpoints on both sides (in other words, with direct access to the files on each collection). The flow could then invoke Compute functions on each endpoint to tar the data before the transfer and untar it afterwards.
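As a rough illustration of what those two functions might look like, here is a minimal sketch using only the Python standard library. The names `tar_directory` and `untar_archive` and their parameters are hypothetical and not taken from any Globus example. Registering them with a Compute endpoint and wiring them into a flow is not shown. The imports sit inside the function bodies on the assumption that remotely executed functions are serialized without module-level state.

```python
def tar_directory(src_dir, tar_path):
    """Pack src_dir into a gzip-compressed tar archive at tar_path."""
    # Imports live inside the function (assumption: the function is
    # shipped to a remote endpoint without its enclosing module).
    import os
    import tarfile

    src_dir = os.path.abspath(src_dir)
    with tarfile.open(tar_path, "w:gz") as archive:
        # Store entries relative to the directory name, not the full path.
        archive.add(src_dir, arcname=os.path.basename(src_dir))
    return tar_path


def untar_archive(tar_path, dest_dir):
    """Extract the archive at tar_path into dest_dir."""
    import os
    import tarfile

    os.makedirs(dest_dir, exist_ok=True)
    # "r:*" lets tarfile detect the compression automatically.
    with tarfile.open(tar_path, "r:*") as archive:
        if hasattr(tarfile, "data_filter"):
            # Safer extraction on Python versions that support filters.
            archive.extractall(dest_dir, filter="data")
        else:
            archive.extractall(dest_dir)
    return dest_dir
```

In a flow, the first function would run on the source side before the transfer step and the second on the destination side after it. Both return the path they wrote to, so a later step can pick it up.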

We're not quite ready to document a general purpose approach to this yet, but I do think this is a very good idea, so I'm adding a story to our internal tracker to create a tutorial for the docs site on how to do this. I can't give an ETA on when that will get tackled, but it will be in the hopper.

If you could use assistance with your particular configuration in the meantime, opening a ticket via support@globus.org will help us make sure that we can provide support that is specific to your scenario.

vasv commented 4 months ago

This repo includes a basic example of the flow that Ada mentions; see https://github.com/globus/globus-flows-trigger-examples/tree/main/tar_transfer and its associated trigger script. A simple wrapper function for the "tar" action is in https://github.com/globus/globus-flows-trigger-examples/blob/main/functions/tar_function.py. Let us know if you end up using/extending this in the interim.