bailey-lab / MIPTools

A suite of computational tools used for molecular inversion probe design, data processing, and analysis.
https://miptools.readthedocs.io
MIT License

Use basespace CLI to download run data #13

Closed. arisp99 closed this 2 years ago

arisp99 commented 2 years ago

This PR replaces the old Python script for downloading data from the Illumina BaseSpace Sequence Hub with the BaseSpace CLI.

Resolves #8

arisp99 commented 2 years ago

The old download method created a <run_id> directory inside the run directory, whereas the new method places the downloaded files directly in the run directory. We need to ensure that we can still demux the data, which may involve updating the demux scripts for the new directory structure. The contents of the directories are the same.

New download app:

$ tree new_method -L 1
new_method
├── <run_id>.json
├── Config
├── Data
├── InstrumentAnalyticsLogs
├── InterOp
├── Logs
├── Recipe
├── RTAComplete.txt
├── RTAConfiguration.xml
├── RTALogs
├── RTARead1Complete.txt
├── RTARead2Complete.txt
├── RTARead3Complete.txt
├── RTARead4Complete.txt
├── RunCompletionStatus.xml
├── RunInfo.xml
├── RunParameters.xml
└── SampleSheet.csv

7 directories, 11 files

Old download app:

$ tree old_method -L 2
old_method
├── <run_id>
│   ├── Config
│   ├── Data
│   ├── InstrumentAnalyticsLogs
│   ├── InterOp
│   ├── Logs
│   ├── Recipe
│   ├── RTAComplete.txt
│   ├── RTAConfiguration.xml
│   ├── RTALogs
│   ├── RTARead1Complete.txt
│   ├── RTARead2Complete.txt
│   ├── RTARead3Complete.txt
│   ├── RTARead4Complete.txt
│   ├── RunCompletionStatus.xml
│   ├── RunInfo.xml
│   ├── RunParameters.xml
│   └── SampleSheet.csv
└── nohup.out

8 directories, 11 files
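
The difference between the two layouts can be detected from where RunInfo.xml sits. The following is only a sketch: find_bcl_dir is a hypothetical helper, not part of MIPTools, and it assumes that RunInfo.xml is a reliable marker for the directory holding the run files, as it appears in both trees above.

```shell
# Hypothetical helper (not part of MIPTools): print the directory that holds
# the raw run files, detected by the presence of RunInfo.xml.
find_bcl_dir() {
  local run_dir="$1"
  # New layout: the run files are placed directly in run_dir.
  if [ -f "$run_dir/RunInfo.xml" ]; then
    printf '%s\n' "$run_dir"
    return 0
  fi
  # Old layout: the run files sit one level down, in a <run_id> subdirectory.
  local sub
  for sub in "$run_dir"/*/; do
    if [ -f "${sub}RunInfo.xml" ]; then
      printf '%s\n' "${sub%/}"
      return 0
    fi
  done
  # Neither layout matched: report the problem and signal failure.
  echo "No RunInfo.xml found in $run_dir" >&2
  return 1
}
```

Calling find_bcl_dir new_method would print new_method, while find_bcl_dir old_method would print old_method/<run_id>.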

arisp99 commented 2 years ago

After some more investigation, the new directory structure does not cause any issues with the demux app. Previously, we ran the following:

singularity run \
  -B $resource_dir:/opt/resources -B $run_dir:/opt/analysis -B $bcl_dir:/opt/data \
  miptools.sif --app demux -s $sample_list -p $platform

where $run_dir is the project directory and bcl_dir=$run_dir"_<run id>". With the new download method, we must change $bcl_dir. Because the files from BaseSpace are now held directly in $run_dir, we can simply replace $bcl_dir with $run_dir:

singularity run \
  -B $resource_dir:/opt/resources -B $run_dir:/opt/analysis -B $run_dir:/opt/data \
  miptools.sif --app demux -s $sample_list -p $platform

This produces the same output. We may also want to change the demux script slightly so that we no longer need to bind $run_dir to both /opt/analysis and /opt/data.
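
Until the demux script is changed, the double bind can at least be hidden behind a small wrapper that takes the run directory once. This is a sketch only: demux_cmd is a hypothetical function, not part of MIPTools. It prints the command rather than running it, and it assumes the paths contain no whitespace.

```shell
# Hypothetical wrapper (not part of MIPTools): print the demux command for the
# new layout, binding the single run directory to both mount points.
# Assumes none of the arguments contain whitespace.
demux_cmd() {
  local resource_dir="$1" run_dir="$2" sample_list="$3" platform="$4"
  printf 'singularity run -B %s:/opt/resources -B %s:/opt/analysis -B %s:/opt/data miptools.sif --app demux -s %s -p %s\n' \
    "$resource_dir" "$run_dir" "$run_dir" "$sample_list" "$platform"
}
```

Running demux_cmd "$resource_dir" "$run_dir" "$sample_list" "$platform" prints the same command as above on one line, so it can be checked before it is executed.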