als-computing / splash_flows_globus

Add option for ALCF #24

Open dylanmcreynolds opened 3 months ago

dylanmcreynolds commented 3 months ago

We want to add an option to the 832 flows code that lets us move data to ALCF and launch reconstructions via Globus Flows. We have code that does this in another repo, but want to add it to the production flows.

I envision the general flow looking like this: a new JSON block will be created in Prefect that lets beamline staff turn this on (default off). If it is on, we copy to both ALCF and NERSC, and run the reconstruction at ALS.
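As a rough illustration of the toggle described above, here is a minimal sketch of how a flow might read such a JSON block and pick copy destinations. The key name `alcf_enabled`, the helper names, and the assumption that NERSC stays a destination when the option is off are all hypothetical; none of them come from the existing code.

```python
import json

# Hypothetical defaults: the key name "alcf_enabled" is an assumption,
# chosen only to illustrate a "default off" option.
DEFAULT_OPTIONS = {"alcf_enabled": False}


def load_options(raw_json):
    """Merge a JSON string over the defaults so a missing key means 'off'."""
    options = dict(DEFAULT_OPTIONS)
    if raw_json:
        options.update(json.loads(raw_json))
    return options


def copy_destinations(options):
    """Return the sites to copy to.

    Assumption: NERSC is always a destination, and ALCF is added
    only when the option is switched on.
    """
    destinations = ["NERSC"]
    if options.get("alcf_enabled", False):
        destinations.append("ALCF")
    return destinations
```

With no block configured, `copy_destinations(load_options(None))` yields only the NERSC destination; passing `'{"alcf_enabled": true}'` adds ALCF.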

There is a bit of code for working with Globus Flows that we will want to add as generic utility code to: https://github.com/als-computing/splash_flows_globus/blob/main/orchestration/globus.py, and we will create a new task in https://github.com/als-computing/splash_flows_globus/blob/main/orchestration/flows/bl832/move.py that can be called when the ALCF option is enabled. The goal is to reuse the existing confidential client setup that we have in production, so that the flows share the Globus authentication configuration we already use for data movement.

davramov commented 2 months ago

File Flow

We met with Dula to confirm the following folder/file paths and naming conventions for the reconstruction data generated on ALCF.

Additionally, we should update reconstruction.py on ALCF so that it no longer listens for a .txt file; instead, its main function should accept the raw filename and the input/output folders on ALCF. We also need to update the reconstruction_wrapper() function from the Globus Flow to take the raw file path as input.
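One possible shape for the updated entry point, sketched with `argparse`. The argument names (`raw_filename`, `input_folder`, `output_folder`) and the `parse_args` helper are assumptions for illustration; the actual reconstruction.py may be organized differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the raw filename and ALCF folders from the command line.

    All argument names here are hypothetical placeholders.
    """
    parser = argparse.ArgumentParser(
        description="Run a reconstruction for one raw file (sketch)."
    )
    parser.add_argument("raw_filename", help="name of the raw file to reconstruct")
    parser.add_argument("input_folder", help="folder on ALCF holding the raw file")
    parser.add_argument("output_folder", help="folder on ALCF for reconstruction output")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # The reconstruction call itself is omitted; this sketch only shows
    # explicit arguments replacing the .txt file listener.
    return args


if __name__ == "__main__":
    main()
```

Passing the paths explicitly would let reconstruction_wrapper() forward the raw file path straight through, rather than relying on a marker file appearing on disk.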

Naming conventions

ALCF Raw Destination

ALCF Recon Destination

NERSC Destination

ALS Destination (data832)