This Nextflow pipeline pulls samples from iRODS and converts them to FASTQ files.
main.nf
- the Nextflow pipeline that runs all workflows
modules/metatable.nf
- a collection of processes that retrieve iRODS metadata for the samples listed in the --findmeta <samples.csv> file
modules/getfiles.nf
- a collection of processes that load the data (.cram or .bam files) from iRODS and convert them to .fastq.gz files
modules/upload2ftp.nf
- a collection of processes that upload a list of .fastq.gz files to the FTP server (specified in nextflow.config)
nextflow.config
- the configuration script that controls the cluster scheduler, process and container settings
bin/parser.py
- script that parses metadata from imeta ls output and saves it in .json format
bin/combine_meta.py
- script that combines all metadata in .json format and saves it to a .tsv file
examples/samples.csv
- an example samples.csv file; it contains one column with sample names (no header is required)
examples/run.sh
- an example run script that executes the pipeline.

--findmeta
: specify a .csv file with sample names to run a metadata search
--cram2fastq
: if specified, the pipeline converts the CRAM files found in the findmeta step
--meta
: specifies the .tsv file with CRAM files (potentially from the findmeta step) to run the cram2fastq conversion on
--publish_dir
: path for the pipeline's output files (default 'results')
--index_format
: index-format formula for samtools; change it only if you really know what you're doing (default "i*i*")
--toftp
: upload the resulting files to the ArrayExpress FTP server (default false). See also --ftp_credenials, --ftp_host and --ftp_path
--fastqfiles
: specifies the .fastq.gz files (potentially from the cram2fastq step) to upload to the ArrayExpress FTP server

Run a metadata search for a specified list of samples:
nextflow run main.nf --findmeta ./examples/samples.csv
Download CRAM files (specified in metadata.tsv) from iRODS and convert them to FASTQ:
nextflow run main.nf --cram2fastq --meta metadata/metadata.tsv
Upload FASTQ files to the FTP server (you need to set up the FTP server in nextflow.config):
nextflow run main.nf --toftp --fastqfiles ./results/
Combine several steps to run them together:
nextflow run main.nf --findmeta ./examples/samples.csv --cram2fastq --toftp
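
To give a feel for what the metadata search behind --findmeta has to do, here is a minimal Python sketch of turning imeta ls output into JSON. It is not the code in bin/parser.py: the function name parse_imeta, the example path and the attribute names are hypothetical, and the "attribute:" / "value:" / "units:" layout with "----" separators is an assumption about how imeta ls prints its records.

```python
import json


def parse_imeta(text):
    """Collect attribute/value pairs from imeta-ls-style text into a dict."""
    meta = {}
    attribute = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("attribute:"):
            # Remember the attribute name until its value line arrives.
            attribute = line.split(":", 1)[1].strip()
        elif line.startswith("value:") and attribute is not None:
            value = line.split(":", 1)[1].strip()
            # An attribute may occur more than once, so keep every value.
            meta.setdefault(attribute, []).append(value)
            attribute = None
    return meta


# Hypothetical input, written in the assumed imeta ls layout.
example = """AVUs defined for dataObj /example/path/sample1.cram:
attribute: sample
value: sample1
units:
----
attribute: library_type
value: example library
units:
"""

parsed = parse_imeta(example)
print(json.dumps(parsed, indent=2))
```

Values are kept in lists because the same attribute can be attached to an object more than once; a real parser may prefer to flatten single-value attributes.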
---
title: Nextflow pipeline for retrieving CRAM files stored in iRODS and converting them to FASTQ
---
flowchart TB
subgraph findmeta["Find CRAM metadata"]
direction LR
v0([findCrams])
v1([getMetadata])
v2([parseMetadata])
v3([combineMetadata])
end
subgraph downloadcrams["Convert CRAMs --> FASTQ"]
direction LR
v4([downloadCram])
v5([cramToFastq])
v6([calculateReadLength])
v7([checkATAC])
v8([renameATAC])
v9([saveMetaToJson])
v10([updateMetadata])
end
subgraph uploadtoftp["Upload data to FTP"]
direction LR
v11([concatFastqs])
v12([uploadFTP])
end
v0 --> v1 --> v2 --> v3
v4 --> v5 --> v6 --> v7{10X ATAC}
v11 --> v12
v7 --YES--> v8
v8 --> v9
v7 --NO--> v9
v9 --> v10
findmeta -.-> downloadcrams -.-> uploadtoftp
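
The combineMetadata step at the end of the first subgraph merges per-file JSON metadata into a single .tsv table. The sketch below illustrates one way such a merge could work; it is not the code in bin/combine_meta.py, and the function name combine_meta and the column names (sample, cram_path, read_count) are hypothetical.

```python
import csv
import io
import json


def combine_meta(records, out):
    """Write a list of metadata dicts as one TSV table with a shared header."""
    # Build the union of keys in first-seen order so no record loses a column.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    writer = csv.DictWriter(
        out, fieldnames=columns, delimiter="\t", restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)  # missing keys are filled with restval ("")


# Hypothetical JSON documents standing in for the per-file metadata.
json_docs = [
    '{"sample": "sample1", "cram_path": "/example/sample1.cram"}',
    '{"sample": "sample2", "read_count": "100"}',
]
records = [json.loads(doc) for doc in json_docs]

buffer = io.StringIO()
combine_meta(records, buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
```

Taking the union of keys means files with different metadata fields still land in one table, with empty cells where a field is absent.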