DARPA-CRITICALMAAS / uiuc-pipeline

HPC pipeline for candidate model inference.

CriticalMAAS Pipeline

This is the internal UIUC git repository for the DARPA CMAAS inference pipeline. The pipeline is designed to be run on NCSA's hydro cluster.

Quickstart

Installing
For Users

For model inference, you will need to pull the container image and run the model using Apptainer.

```bash
apptainer pull -F criticalmaas-pipeline_latest.sif docker://ncsa/criticalmaas-pipeline:latest
apptainer run --nv -B /projects/bbym/saxton/MockValData:/data -B ./feedback:/feedback -B ./logs:/logs -B ./output:/output ./criticalmaas-pipeline_latest.sif -v --data /data/validation --output /output --legends /data/validation --log /logs/log.log --model flat_iceberg --validation /data/validation_labels --output_types raster_masks
```

*Note that `latest` can be replaced with `pr-#` to pin a specific version. Make sure to change `./criticalmaas-pipeline_latest.sif` in the apptainer run command accordingly.*

```bash
# Here, instead of latest, we are using pr-6
apptainer pull -F criticalmaas-pipeline_pr-6.sif docker://ncsa/criticalmaas-pipeline:pr-6
```
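The `-B` flags in the run command above bind host directories into the container, and the host side must exist before the run starts. A minimal prep sketch, using the same relative directory names as the command above:

```bash
# Create the host-side directories that the -B flags bind into the container
mkdir -p feedback logs output
```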
For Developers

To get started with this pipeline, you will need to clone the repository. We recommend using a Python venv here to keep the working environment clean.

```bash
git clone https://github.com/DARPA-CRITICALMAAS/uiuc-pipeline.git
cd uiuc-pipeline
```

If you are on hydro for the first time, you will need to load the anaconda3_gpu module and initialize conda. If conda is already set up on your system, you can skip these two lines.

```bash
module load anaconda3_gpu
conda init
```

This repository also makes use of submodules, which need to be initialized and updated.

```bash
git submodule init
git submodule update
```

We now create new conda and venv environments and install the [requirements.txt](https://github.com/DARPA-CRITICALMAAS/uiuc-pipeline/blob/readme-update/requirements.txt).

```bash
conda create --name CMAAS_py_3.10 python=3.10
conda activate CMAAS_py_3.10
python3 -m venv venv
source venv/bin/activate
# Submodules must be updated before installing the requirements
pip install -r requirements.txt
```
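Before a full run, it can help to sanity-check the environment. This sketch uses only standard tools: `nvidia-smi` confirms a GPU is visible (pipeline.py requires one, as noted below), and `pip check` verifies the installed requirements have no broken dependencies.

```bash
# Confirm a GPU is visible (pipeline.py requires one)
nvidia-smi
# Verify installed packages have no conflicting dependencies
pip check
```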
Understanding Pipeline Inputs

To perform inference with our pipeline, only one data input is required: the map that you want to perform inference on. There are other optional data inputs that can speed up the pipeline or enable optional steps. Each of these optional inputs needs to be named consistently with the input map, e.g. if you have `CA_Sage.tif`, the legend will need to be named `CA_Sage.json`. This is a visualization of what that structure looks like.

```bash
data
├── Map_1.tif
├── Map_2.tif
├── ...
└── Map_N.tif

legends # Optional
├── Map_1.json
├── Map_2.json
├── ...
└── Map_N.json

layouts # Optional
├── Map_1.json
├── Map_2.json
├── ...
└── Map_N.json

validation # Optional
├── Map_1_lgd_1_poly.tif
├── Map_1_lgd_2_poly.tif
├── ...
├── Map_1_lgd_N_poly.tif
├── ...
├── Map_N_lgd_1_poly.tif
├── Map_N_lgd_2_poly.tif
├── ...
└── Map_N_lgd_N_poly.tif
```

It is also important to note that if you specify `--legends` and there is no corresponding legend for a map file, that is completely fine. The pipeline will simply fall back to generating a legend for that specific map. The same is true for `--layouts` and `--validation`.
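Since missing optional files are tolerated per map, a quick shell check can tell you which maps will fall back to a generated legend. This is a hypothetical helper, not part of the pipeline; it only assumes the directory layout shown above.

```bash
# List maps in data/ that have no matching legend JSON in legends/
# (the pipeline will generate legends for these at runtime)
for map in data/*.tif; do
    base=$(basename "$map" .tif)
    [ -f "legends/$base.json" ] || echo "No legend for $base"
done
```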
Performing Inference

To perform inference with one of our models, we will need to run pipeline.py. Pipeline.py has 3 core required arguments:

* --model : The model to use for inference.
* --data : Directory containing data to perform inference on.
* --output : Directory to save the output data of the pipeline to.

The list of available models can be found [below](#available-models), with the release tag being the value to pass to `--model`.

*Note that you must have a GPU available to run pipeline.py.*

```bash
# Example call to pipeline.py
python pipeline.py --model "golden_muscat" --data mydata/images/ --output mydata/output/
```

Running this will have "golden_muscat" run inference on every `.tif` file in the directory specified by `--data`. The output rasters of this inference will then be saved as `.tif`s to the directory specified by `--output`, along with a geopackage file for each map. The geopackage file contains vector data for each legend item in the map. Output is saved as the pipeline runs, so even if the pipeline were to crash in the middle of running, all maps that ran before the crash will have been saved.

By default the pipeline will save logging information to `logs/Latest.log`; this can be useful if you have any problems or want to see a detailed view of what the pipeline is doing. You can also change the log file location with `--log`. For further documentation on all the pipeline options, see [below](#pipeline-parameters).
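The optional inputs from the previous section are supplied in the same call. A sketch combining them; the paths are illustrative, and every flag here appears elsewhere in this README:

```bash
# Inference with the optional legends, layouts, and validation inputs
python pipeline.py \
    --model golden_muscat \
    --data mydata/images/ \
    --output mydata/output/ \
    --legends mydata/legends/ \
    --layouts mydata/layouts/ \
    --validation mydata/validation/ \
    --log logs/mylog.log
```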
Running on Hydro

For running the pipeline on hydro there are two options: you can manually run the pipeline in an interactive srun session, or you can submit an automatic job using sbatch. You can learn how to manually run with srun in the [hydro docs](https://docs.ncsa.illinois.edu/systems/hydro/en/latest/user-guide/running-jobs.html#srun). Make sure to srun with the `--partition=a100` flag, as these are the only nodes with GPUs on hydro; a sketch of such a session is shown at the end of this section.

For running with sbatch we have two scripts, `submit.sh` and `start_pipeline.sh`. When we run `submit.sh`, that script will automatically start `start_pipeline.sh` on an a100 node. First, set the parameters for pipeline.py in `start_pipeline.sh`. Then, once we are ready to run, all we have to do is call

```bash
sbatch submit.sh
```

and that will start the job. We can view our pipeline's progress by looking at `logs/job_%yourjobid%.log`. The slurm logs can also be found at `logs/slurm/%yourjobid%.e` if you have any errors.

*Hint: `tail -f logs/job_%yourjobid%.log` can be very useful for viewing these logs. You can also use `nvitop` on the node that is running the job to view GPU statistics in real time.*

*Please note that our job script assumes that you are using venv to set up your environment. If you are using another Python environment manager, e.g. conda or virtualenvwrapper, you will need to adapt the start_pipeline.sh script to your setup.*
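For reference, a minimal interactive session might look like the following. The `--partition=a100` flag comes from this README; the GPU and time flags are standard Slurm options shown as illustrative assumptions, so check the hydro docs for the values your allocation needs.

```bash
# Request an interactive shell on an a100 node (resource flags are illustrative)
srun --partition=a100 --gres=gpu:1 --time=01:00:00 --pty bash

# Then, inside the session:
source venv/bin/activate
python pipeline.py --model golden_muscat --data mydata/images/ --output mydata/output/
```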
Understanding Pipeline Outputs

The pipeline can produce quite a few output files, so it is important to understand what each is. The key arguments here are `--output` and `--feedback`, as they control what files the pipeline will output and where. `--output` controls where the results of inference get saved: a raster `.tif` for each legend and a geopackage for each map containing the vectorized legend data for every legend. `--feedback` controls whether the pipeline will output files that are intended for debugging. When feedback is enabled, the pipeline will save any legend data that was generated by the pipeline, save the image of the legend label, create a visualization image for each legend analyzed in the validation step, and save the validation score csv for each individual map. This results in the following output structure.

```bash
output
├── full_dataset_scores.csv # If validation was enabled and feedback was not
├── Map1_lgd1.tif
├── ...
├── Map1_lgdN.tif
├── Map1.gpkg
├── ...
├── MapN_lgd1.tif
├── ...
├── MapN_lgdN.tif
└── MapN.gpkg

feedback
├── full_dataset_scores.csv # If validation was enabled
├── Map1
│   ├── Map1.json         # If a map legend was generated by pipeline
│   ├── Map1_Scores.csv   # If validation was enabled
│   ├── lgd_map1_lgd1.tif # Legend label image
│   ├── ...               # ''
│   ├── lgd_map1_lgdN.tif # ''
│   ├── val_map1_lgd1.tif # Legend validation image
│   ├── ...               # ''
│   └── val_map1_lgdN.tif # ''
├── ...
└── MapN
    ├── MapN.json         # If a map legend was generated by pipeline
    ├── MapN_Scores.csv   # If validation was enabled
    ├── lgd_mapN_lgd1.tif # Legend label image
    ├── ...               # ''
    ├── lgd_mapN_lgdN.tif # ''
    ├── val_mapN_lgd1.tif # Legend validation image
    ├── ...               # ''
    └── val_mapN_lgdN.tif # ''
```

Note that if feedback is not turned on and validation is, the pipeline will still save all the scores in the output directory to `#%data%_results.csv`.
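Since each geopackage bundles the vectorized legends for one map, any GDAL installation can inspect it; `ogrinfo` is a standard GDAL tool, not part of this pipeline, and the file and layer names below are illustrative.

```bash
# List the vector layers stored in a produced geopackage
ogrinfo output/Map1.gpkg
# Summarize one layer without dumping every feature (-so = summary only)
ogrinfo -so output/Map1.gpkg layer_name  # replace layer_name with one from the listing
```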

FAQ

Q. Where is data on hydro?

A. `/projects/bbym/shared/data`

Q. I've updated to the latest pipeline commit and it doesn't work.

A. New requirements could have been added or submodules could have been updated. It's always a good idea to run the following commands if you are having issues after updating to the most recent commit.

```bash
git submodule init
git submodule update
pip install -r requirements.txt
```

Q. I'm having a problem with the pipeline that I couldn't find help for.

A. Use the Issues tab (underneath the Plan tab) to submit an issue describing your problem.

Documentation

Pipeline Parameters

* --config : optional. The config file to use for the pipeline. ## Not implemented yet ##

Pipeline Config (not implemented yet)

Pipeline config will likely contain model-specific config options. Some planned options are below.

Available Models

Point Models

Flat-Iceberg
Git Repository : https://github.com/Dongjiahua/DARPA_torch
Lead Developer : [![Git profile image](https://wsrv.nl/?url=https://avatars.githubusercontent.com/u/48640121?v=4&w=28&h=28&mask=circle) Dong Jiahua](https://github.com/Dongjiahua)
Description : Legend-image-promptable point detection model
Release Tags :
* flat_iceberg
Drab-Volcano
Git Repository : https://github.com/Dongjiahua/DARPA_torch/tree/drab_volcano
Lead Developer : [![Git profile image](https://wsrv.nl/?url=https://avatars.githubusercontent.com/u/48640121?v=4&w=28&h=28&mask=circle) Dong Jiahua](https://github.com/Dongjiahua)
Description : Pre-trained point detection model that works on a predefined set of point types.
Release Tags :
* drab_volcano

Polygon Models

Golden-Muscat
Git Repository : https://github.com/xiyuez2/Darpa_Unet_Release
Lead Developer : [![Git profile image](https://wsrv.nl/?url=https://avatars.githubusercontent.com/u/125916796?v=4&w=28&h=28&mask=circle) xiyuez2](https://github.com/xiyuez2)
Description : U-net model
Release Tags :
* golden_muscat
Rigid-Wasabi
Git Repository : https://github.com/xiyuez2/Darpa_Unet_Release
Lead Developer : [![Git profile image](https://wsrv.nl/?url=https://avatars.githubusercontent.com/u/125916796?v=4&w=28&h=28&mask=circle) xiyuez2](https://github.com/xiyuez2)
Description : U-net model
Release Tags :
* rigid_wasabi
Blaring-Foundry
Git Repository : https://github.com/xiyuez2/Darpa_Unet_Release
Lead Developer : [![Git profile image](https://wsrv.nl/?url=https://avatars.githubusercontent.com/u/73417516?v=4&w=28&h=28&mask=circle) ziruiwang409](https://github.com/ziruiwang409)
Description : Superpixel U-net model
Release Tags :
* blaring_foundry

Appendix

USGS Json format

The current version of the USGS json format closely follows the format used for the CMA competition, but the shape_type field now indicates whether the bounding box is described by 2 points (rectangle) or 4 points (quad).

USGS json format diagram

Current as of Jan 2024

Uncharted Json format

The Uncharted Json format is based on the TA1 schema (i.e., a list of PageExtraction schema objects) and has three segmentation classes, each of which contains a bounding contour:

Uncharted json format diagram

Current as of Jan 2024