als-computing / splash_flows_globus

Configuration and code for Prefect workflows to move data and run computing tasks

Globus Flow: Tomography Reconstruction at ALCF #25

Closed by davramov 1 month ago

davramov commented 4 months ago

Test environment

I isolated the file tree for bl832 and dependencies, specifically this Globus reconstruction flow.

File tree

ALCF_compute_test
│   ├── config.yml  # added a few endpoints
│   └── orchestration
│       ├── __init__.py
│       ├── alcf_transfer.py    # transfer to ALCF in a module (not currently using this)
│       ├── config.py   # stayed the same
│       ├── flows
│       │   └── bl832
│       │       ├── ALCF_tomopy_reconstruction.py   # equivalent to move.py
│       │       ├── config.py   # added a few endpoints
│       │       └── test
│       │           ├── alcf_compute_endpoint_test.py   # test compute endpoint
│       │           ├── alcf_transfer_endpoint_test.py  # test transfer endpoint
│       │           ├── inputOneSliceOfEach.txt # used for reconstruction function input
│       │           ├── reconstruction.py   # copy of reconstruction code on ALCF
│       │           ├── test_cc_auth.py # test endpoint read/write for service client
│       │           └── transfer_test.ipynb 
│       ├── globus.py   # stayed the same
│       ├── globus_flows_utils.py   # copied from the notebook, modified "get_specific_flow_client()"
│       └── globus_tomopy_flow_init.py  # copied globus flow definition from the notebook into a module, but the notebook feels more user-friendly for defining a new flow.

Updates to orchestration/flows/bl832/move.py

The main changes to the production code are the following functions:

[new] transfer_data_to_alcf()

Follows the structure of transfer_data_to_nersc()
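For reference, a minimal sketch of what such a transfer function might look like, assuming an authenticated globus_sdk.TransferClient and the endpoint IDs from config.yml. The alcf_destination_path helper and the /eagle/IRIbeta/als root are my assumptions (the root is guessed from the Polaris path mentioned later in this thread), not the actual code:

```python
import os

def alcf_destination_path(file_path: str, alcf_root: str = "/eagle/IRIbeta/als") -> str:
    # Hypothetical helper: place the file under the ALCF project root,
    # keeping only its basename.
    return f"{alcf_root}/{os.path.basename(file_path)}"

def transfer_data_to_alcf(file_path, transfer_client, source_endpoint, dest_endpoint):
    # Sketch mirroring the structure of transfer_data_to_nersc():
    # build a TransferData document, submit it, and block until it settles.
    import globus_sdk  # deferred import so the path helper works without the SDK

    tdata = globus_sdk.TransferData(transfer_client, source_endpoint, dest_endpoint)
    tdata.add_item(file_path, alcf_destination_path(file_path))
    task = transfer_client.submit_transfer(tdata)
    # task_wait() returns True if the task completed before the timeout
    return transfer_client.task_wait(task["task_id"], timeout=600, polling_interval=10)
```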

[new] alcf_tomopy_reconstruction_flow()

  1. Read data from ALCF
  2. Run reconstruction on ALCF compute endpoint
  3. Save reconstruction to ALCF
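The three steps above could be strung together as below. This is only a shape sketch: the step names and True-on-success return convention are my assumptions, and the actual Globus transfer/compute calls are stubbed out as callables:

```python
def run_alcf_reconstruction(read_step, recon_step, save_step):
    # Run the three flow steps in order, stopping at the first failure.
    # Each step is a zero-argument callable returning True on success;
    # in the real flow these would be Globus transfer and compute tasks.
    for name, step in (("read", read_step),
                       ("reconstruct", recon_step),
                       ("save", save_step)):
        if not step():
            return f"failed at: {name}"
    return "success"
```

A run with all steps succeeding returns "success"; the first failing step short-circuits the rest, which matches how a failed transfer should stop the reconstruction.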

Note: this function still needs to be updated to take explicit inputs; see the later updates in this thread.

[modified] process_new_832_file_flow(..., send_to_alcf=False)

Snippet:

   ...
   # Send data from NERSC to ALCF (default is False), process it using Tomopy, and send it back to NERSC
   if not is_export_control and send_to_alcf:
       # Transfer data from NERSC to ALCF
       transfer_success = transfer_data_to_alcf(fp, config.tc, config.nersc_alsdev, config.alcf_iribeta_cgs)
       ...
       # Run the Tomopy reconstruction flow
       alcf_tomopy_reconstruction_flow()
       ...
       # Send reconstructed data to NERSC
       transfer_success = transfer_data_to_nersc(file_path, config.tc, config.alcf_iribeta_cgs, config.nersc_alsdev)
   ...

davramov commented 4 months ago

Update

As discussed with Dylan, I removed the ALCF_compute_test directory and copied my useful test scripts to the /examples folder. Additionally, instead of merging the ALCF transfer/compute flow into bl832/move.py, I have added bl832/ALCF_compute_reconstruction.py as a separate module.

Note: Data transfer in the following function still needs inputs defined (file source/destination and the required .txt file for reconstruction), which we will determine in Monday's (6/17) meeting with Dula.

@flow(name="alcf_tomopy_reconstruction_flow")
def alcf_tomopy_reconstruction_flow():
    ...

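One way the stub above could pick up parameters once the inputs are settled. The parameter names here are hypothetical placeholders (the real ones were to be decided in the 6/17 meeting), and the import fallback just lets the sketch run without Prefect installed:

```python
try:
    from prefect import flow
except ImportError:
    # Fallback no-op decorator so the sketch runs without Prefect installed
    def flow(name=None):
        def wrap(fn):
            return fn
        return wrap

@flow(name="alcf_tomopy_reconstruction_flow")
def alcf_tomopy_reconstruction_flow(source_path: str,
                                    destination_path: str,
                                    input_txt: str = "inputOneSliceOfEach.txt"):
    # Hypothetical signature: file source/destination plus the .txt file
    # driving the reconstruction function. For now, just echo the inputs back.
    return {"source": source_path,
            "destination": destination_path,
            "input_txt": input_txt}
```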
davramov commented 3 months ago

June 28 / July 1st Update

I have updated the ALCF_tomopy_reconstruction.py script and its supporting files (reconstruction.py, .env) to accept a file name and folder name when calling the reconstruction Globus Flow. I have also added a dedicated README, README_ALCF_tomopy_reconstruction.md, which outlines the entire flow, documents the folder and file naming, and gives step-by-step instructions for configuring the Globus environments, the NERSC and ALCF endpoints (transfer and compute), setting up and logging into the confidential client, and running the script.
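A sketch of how the folder and file name might be threaded into the flow run. The key names and body schema here are guesses, not the registered schema (which lives in the registered flow definition), and the "rec_" output prefix is purely illustrative:

```python
def build_flow_input(folder_name: str, file_name: str,
                     source_collection: str, destination_collection: str):
    # Assemble an input body for the registered reconstruction flow.
    # All key names are illustrative placeholders.
    return {
        "input": {
            "source": {
                "id": source_collection,
                "path": f"/{folder_name}/{file_name}",
            },
            "destination": {
                "id": destination_collection,
                "path": f"/{folder_name}/rec_{file_name}/",
            },
        }
    }

# With an authenticated flow client (via the get_specific_flow_client() helper
# mentioned earlier in this thread), the run would look roughly like:
#   flow_client.run_flow(body=build_flow_input(...), label="832 reconstruction")
```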

davramov commented 3 months ago

July 26 Update

Major changes:

- Incorporated tiff_to_zarr.py into the Globus Flow on ALCF Polaris
- File pruning with Prefect worker scheduling: updated orchestration/flows/bl832/prune.py

Other changes:

- config.yml and orchestration/flows/bl832/config.py: cleaned up some endpoint names for consistency
- docs/README_alcf832.md: updated to reflect recent progress

New files:

- create_deployments_832_alcf.sh: shell script that builds and deploys the pruning code for this workflow
- examples/Tomopy_for_ALS.ipynb: this notebook also lives in another repository, but this version matches the changes made to the registered flow function and confidential client for this specific workflow
- examples/tiff_to_zarr.py: reflects the code on Polaris (/eagle/IRIbeta/als/examples/tiff_to_zarr.py) used in the Globus Flow

Moved and renamed files:

- examples/test_cc_auth.py
- scripts/globus_tomopy_flow_init.py (todo: update this script so we can move away from using a Jupyter Notebook to initialize the steps)
- Globus helper code:
    - orchestration/globus/flows.py (was globus_flows_utils.py)
    - orchestration/globus/transfer.py (was orchestration/globus.py)
- orchestration/flows/bl832/alcf.py (was ALCF_tomopy_reconstruction.py)
- docs/README_alcf832.md (was orchestration/flows/bl832/README_ALCF_tomopy_reconstruction.md)

Files with modified imports due to the moved and renamed Globus helper code, but otherwise unchanged:

- orchestration/flows/bl7012/config.py
- orchestration/flows/bl7012/move.py
- orchestration/flows/bl7012/move_recon.py
- orchestration/flows/bl832/move.py
- orchestration/prefect.py