nrminor / oneroof

Base-, Variant-, and Consensus-calling under One Proverbial Roof. Work in progress!
MIT License

Add `--help` parameter to list pipeline-specific configuration settings in the command line #13

Closed by nrminor 2 weeks ago

nrminor commented 4 weeks ago

See a great example here: https://github.com/isugifNF/blast/blob/master/main.nf.

nrminor commented 2 weeks ago

Also done! Now, when you run `nextflow run . --help`, `nextflow run main.nf --help`, or `nextflow run nrminor/oneroof --help`, you get this:


 N E X T F L O W   ~  version 24.08.0-edge

Launching `./main.nf` [curious_gates] DSL2 - revision: bc848acf05

                                                        .8888b
                                                        88   "
.d8888b. 88d888b. .d8888b. 88d888b. .d8888b. .d8888b. 88aaa
88'  `88 88'  `88 88ooood8 88'  `88 88'  `88 88'  `88 88
88.  .88 88    88 88.  ... 88       88.  .88 88.  .88 88
`88888P' dP    dP `88888P' dP       `88888P' `88888P' dP
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

oneroof: Base-, Variant-, and Consensus-calling under One Proverbial Roof
=========================================================================
`oneroof` is a bespoke bioinformatic pipeline that can handle Oxford
Nanopore POD5 basecalling, Illumina paired-end read-merging, read
alignment and variant-calling, variant-effect annotation, consensus
sequence calling, quality reporting, and phylogenetic tree-building, all
under "one roof."
(version 0.1.0)
=========================================================================

Usage:

The typical command for running `oneroof` is as follows:
nextflow run . --prepped_data inputs/ --primer_bed primers.bed --refseq GENOME.fasta --ref_gbk GENOME.gbk -profile containerless

Mandatory arguments:
--refseq                       The reference sequence to be used for mapping in FASTA format.

Optional arguments:
--primer_bed                   A bed file of primer coordinates relative to the reference provided with the parameters 'refseq' and 'ref_gbk'.
--ref_gbk                      The reference sequence to be used for variant annotation in Genbank format.
--fwd_suffix                   Suffix in the primer bed file denoting whether a primer is forward. Default: '_LEFT'
--rev_suffix                   Suffix in the primer bed file denoting whether a primer is reverse. Default: '_RIGHT'
--remote_pod5_location         A remote location to use with an SSH client to watch for pod5 files in real-time as they are generated. Default: ''
--file_watcher_config          Configuration file for remote file monitoring. Default: ''
--pod5_staging                 Directory where pod5 files are cached as they arrive from the remote location. Default: 'pod5_cache'
--pod5_dir                     Directory where pod5 files are manually transferred if no remote pod5 location is given. Default: ''
--precalled_staging            Directory to watch for Nanopore FASTQs or BAMs as they become available. Default: ''
--prepped_data                 Location of prepped data if pod5 files are already basecalled and demultiplexed. Default: ''
--illumina_fastq_dir           Location of paired-end Illumina FASTQ files to be processed. Default: ''
--model                        Nanopore basecalling model to apply to the provided pod5 data. Default: 'sup@latest'
--model_cache                  Directory to cache basecalling models locally. Default: 'work/basecalling_models'
--kit                          Nanopore barcoding kit used to prepare sequencing libraries. Default: null
--pod5_batch_size              How many pod5 files to basecall at once. Default: null
--basecall_max                 Number of parallel instances of the basecaller to run at once. Default: 1
--max_len                      Maximum acceptable read length. Default: 12345678910
--min_len                      Minimum acceptable read length. Default: 1
--min_qual                     Minimum acceptable average quality for a given read. Default: 20
--secondary                    Enable secondary alignments for each amplicon. Default: null
--max_mismatch                 Maximum number of mismatches allowed when finding primers. Default: 0
--downsample_to                Desired coverage to downsample to. Default: 0 (no downsampling)
--min_consensus_freq           Minimum frequency of a variant base to be included in a consensus sequence. Default: 0.5
--min_haplo_reads              Minimum read support to report an amplicon-haplotype. Default: 2
--snpeff_cache                 Directory to cache a custom snpEff database. Default: 'work/snpEff_cache'
--min_depth_coverage           Minimum depth of coverage. Default: 20
--nextclade_dataset            Nextclade dataset location. Default: null
--nextclade_cache              Directory to cache Nextclade datasets. Default: 'work/nextclade_datasets'
--results                      Where to place the results. Default: 'results'
--cleanup                      Whether to clean up the work directory after a successful run. Default: null

Advanced parameters:
--snpEff_config                Configuration file for snpEff. Default: 'conf/snpeff.config'
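
For anyone curious about the general pattern behind a listing like the one above, here is a minimal Python sketch of the idea: keep parameters, descriptions, and defaults in one table, render them into aligned lines, and return early when `--help` is present. This is not how `oneroof` implements it (the pipeline is written in Nextflow, and its actual code is not shown in this thread). The names `PARAMS`, `NAME_WIDTH`, `render_param`, `render_help`, and `main` are hypothetical, and the column width is an assumption chosen to resemble the output above.

```python
# Hypothetical parameter table: (name, description, default).
# The three entries are copied from the help output above; the table
# structure itself is an assumption made for illustration.
PARAMS = [
    ("refseq", "The reference sequence to be used for mapping in FASTA format.", None),
    ("min_len", "Minimum acceptable read length.", 1),
    ("results", "Where to place the results.", "results"),
]

# Assumed column at which descriptions start.
NAME_WIDTH = 31


def render_param(name, description, default=None):
    """Format one parameter as an aligned '--name  description Default: x' line.

    In this sketch, a default of None means "print no default at all".
    """
    flag = f"--{name}".ljust(NAME_WIDTH)
    # repr() quotes strings ('results') and leaves numbers bare (1).
    suffix = "" if default is None else f" Default: {default!r}"
    return f"{flag}{description}{suffix}"


def render_help(params):
    """Join all parameter lines under a single heading."""
    lines = ["Parameters:"]
    lines += [render_param(*param) for param in params]
    return "\n".join(lines)


def main(argv):
    """Print the help text and return True if --help was requested.

    Returning early mirrors the behavior described above: the help text
    is shown and no pipeline work is started.
    """
    if "--help" in argv:
        print(render_help(PARAMS))
        return True
    return False
```

Calling `main(["--help"])` prints the rendered listing and returns `True`; any argument list without `--help` returns `False`, which is where a real tool would go on to do its work. Keeping descriptions and defaults in a single table means the help text cannot drift out of sync with the defaults it reports.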