NOAA-GSL / ExascaleWorkflowSandbox

Automatic multi-hop data staging #5

Open christopherwharrop-noaa opened 1 year ago

christopherwharrop-noaa commented 1 year ago

NOAA's security posture prevents direct copying of data from external "untrusted" sources in the usual fashion. This means moving data from certain places will require performing the transfer in multiple "hops".

For example, to move data from an untrusted source to Hera:

  1. Copy the data from the untrusted host to niagara:///collab1/data_untrusted/...
  2. Copy the data from niagara:///collab1/data_untrusted/... to hera:///scratch2/...

or

  1. Copy the data from the untrusted host to hera:///scratch2/data_untrusted/...
  2. Copy the data from hera:///scratch2/data_untrusted/... to hera:///scratch2/...
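In both variants the pattern is the same: stage the data in a designated "untrusted" area, then copy it on to its final location only after the first copy has finished. A minimal Python sketch of that ordering is below. It is not Parsl or Globus code; `plan_hops`, `run_hops`, and the placeholder URLs are hypothetical names used only for illustration, and the actual copy mechanism is passed in by the caller.

```python
from typing import Callable, List, Tuple

# A hop is a (source URL, destination URL) pair.
Hop = Tuple[str, str]


def plan_hops(source: str, staging: str, destination: str) -> List[Hop]:
    """Return the ordered hops: source -> staging, then staging -> destination."""
    return [(source, staging), (staging, destination)]


def run_hops(hops: List[Hop], copy: Callable[[str, str], None]) -> int:
    """Run each hop in order with the caller-supplied `copy` function.

    A hop starts only after the previous one has returned without raising,
    so the second copy never begins while the staging copy is unfinished.
    Any exception from `copy` propagates and stops the remaining hops.
    Returns the number of hops completed.
    """
    completed = 0
    for src, dst in hops:
        copy(src, dst)
        completed += 1
    return completed


if __name__ == "__main__":
    # Placeholder URLs (assumptions), not real endpoints or paths.
    hops = plan_hops(
        "untrusted://host/path/file",
        "staging:///untrusted_area/file",
        "trusted:///final_area/file",
    )
    # A stand-in copy function that only reports what it would do.
    run_hops(hops, lambda src, dst: print(f"copy {src} -> {dst}"))
```

Keeping the copy function as a parameter means the same ordering logic could wrap whatever transfer tool is actually permitted for each hop.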

It is not clear how to tell Parsl to perform the multi-hop transfer automatically.

This capability is now available as a Globus Flow (two-stage-transfer), but it still needs testing. We also need to understand how best to serve this Flow to users building NWP workflows.