I found the `sniffles --help` output difficult to read. I've refactored it so that the text is narrower, and I've separated `--help` from `--example` (detailed below). This is technically a breaking change: every parameter that used `tobool` as its type now uses `action="store_false"`. Any pipeline with hard-coded calls such as `sniffles --qc-stdev false` would therefore need to be updated to `sniffles --qc-stdev`.
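To make the breaking change concrete, here is a minimal sketch of the two argparse styles side by side. The `tobool` function below is a hypothetical stand-in, not the project's actual helper, and the parsers contain only the one option needed for illustration.

```python
import argparse
import contextlib
import io


def tobool(value):
    """Hypothetical stand-in for the project's tobool helper (assumption)."""
    return str(value).strip().lower() in ("1", "true", "yes", "y")


# Old style: the option takes an explicit value that is converted to a bool.
old = argparse.ArgumentParser(prog="sniffles")
old.add_argument("--qc-stdev", type=tobool, default=True)

# New style: the option is a bare flag. With store_false the default is True
# and passing the flag sets the value to False.
new = argparse.ArgumentParser(prog="sniffles")
new.add_argument("--qc-stdev", action="store_false")

old_args = old.parse_args(["--qc-stdev", "false"])
new_args = new.parse_args(["--qc-stdev"])
print(old_args.qc_stdev, new_args.qc_stdev)

# The old call syntax is rejected by the new parser, because "false" is now
# an unrecognized positional argument. argparse reports this via SystemExit.
try:
    with contextlib.redirect_stderr(io.StringIO()):
        new.parse_args(["--qc-stdev", "false"])
    old_syntax_rejected = False
except SystemExit:
    old_syntax_rejected = True
print(old_syntax_rejected)
```

In this sketch both invocations yield the same parsed value, which is why existing pipelines only need the trailing `false` removed.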
Note that the formatting in the examples below differs from what a terminal would show, because GitHub applies its own formatting.
default `sniffles` output
usage: sniffles --input SORTED_INPUT.bam [--vcf OUTPUT.vcf] [--snf MERGEABLE_OUTPUT.snf] [--threads 4] [--mosaic]
Sniffles2: A fast structural variant (SV) caller for long-read sequencing data
Version 2.4
Contact: sniffles@romanek.at
Use --help for full parameter information
Use --example for detailed usage information
sniffles: error: the following arguments are required: -i/--input
`sniffles --help` output
usage: sniffles --input SORTED_INPUT.bam [--vcf OUTPUT.vcf] [--snf MERGEABLE_OUTPUT.snf] [--threads 4] [--mosaic]
Sniffles2: A fast structural variant (SV) caller for long-read sequencing data
Version 2.4
Contact: sniffles@romanek.at
Use --help for full parameter information
Use --example for detailed usage information
options:
-h, --help show this help message and exit
--example Show example usage and exit
--version show program's version number and exit
Common parameters:
-i IN [IN ...], --input IN [IN ...]
For single-sample calling: A coordinate-sorted and indexed .bam/.cram
(BAM/CRAM format) file containing aligned reads. - OR - For multi-sample
calling: Multiple .snf files (generated before by running Sniffles2 for
individual samples with --snf)
-v OUT.vcf, --vcf OUT.vcf
VCF output filename to write the called and refined SVs to. If the given
filename ends with .gz, the VCF file will be automatically bgzipped and a
.tbi index built for it.
--snf OUT.snf Sniffles2 file (.snf) output filename to store candidates for later multi-
sample calling
--reference REF.fa (Optional) Reference sequence the reads were aligned against. To enable
output of deletion SV sequences, this parameter must be set.
--tandem-repeats IN.bed
(Optional) Input .bed file containing tandem repeat annotations for the
reference genome.
--regions REG.bed (Optional) Only process the specified regions.
-c, --contig (Optional) Only process the specified contigs. May be given more than once.
--phase Determine phase for SV calls (requires the input alignments to be phased)
-t, --threads Number of parallel threads to use (4)
SV Filtering parameters:
--minsupport Min number of supporting reads for a SV to be reported (auto)
--minsupport-auto-mult
Coverage based auto-minsupport multiplier for germline mode (0.1/0.025)
--minsvlen Min SV length in bp (50)
--minsvlen-screen-ratio
Min length for SV candidates as fraction of --minsvlen (0.9)
--mapq Alignments with mapping quality lower than this value will be ignored
--no-qc, --qc-output-all
Output all SV candidates, disregarding quality control steps
--qc-stdev Apply filtering based on SV start position and length standard deviation
--qc-stdev-abs-max Max standard deviation for SV length and size in bp (500)
--qc-strand Apply filtering based on strand support of SV calls
--qc-coverage Min surrounding region coverage of SV calls (1)
--long-ins-length Insertion SVs longer than this are subjected to more sensitive filtering
(2500)
--long-del-length Deletion SVs longer than this are subjected to central coverage drop-based
filtering. Not applicable for --mosaic (50000)
--long-inv-length Inversion SVs longer than this value are not subjected to central coverage
drop-based filtering (10000)
--long-del-coverage Long deletions with central coverage higher than this value will be
filtered. Not applicable for --mosaic (0.66)
--long-dup-length Duplication SVs longer than this value are subjected to central coverage
increase-based filtering. Not applicable for --mosaic (50000)
--qc-bnd-filter-strand
Filter breakends that do not have support for both strands
--bnd-min-split-length
Min length of read splits to be considered for breakends (1000)
--long-dup-coverage Long duplications with central coverage lower than this value will be
filtered. Not applicable for --mosaic (1.33)
--max-splits-kb Additional number of splits per kilobase read sequence allowed before reads
are ignored (0.1)
--max-splits-base N Base number of splits allowed before reads are ignored (3)
--min-alignment-length
Reads with alignments shorter than this length in bp will be ignored
--phase-conflict-threshold
Max fraction of conflicting reads permitted for SV phase information to be
labelled as PASS. Only for --phase (0.1)
--detect-large-ins Infer insertions that are longer than most reads and therefore are spanned
by few alignments only.
SV Clustering parameters:
--cluster-binsize Initial screening bin size in bp (100)
--cluster-r Multiplier for SV start position standard deviation criterion in cluster
merging (2.5)
--cluster-repeat-h Multiplier for mean SV length criterion for tandem repeat cluster merging
(1.5)
--cluster-repeat-h-max
Max. merging distance based on SV length criterion for tandem repeat cluster
merging (1000)
--cluster-merge-pos Max. merging distance for insertions and deletions on the same read and
cluster in non-repeat regions (150)
--cluster-merge-len Max. size difference for merging SVs as fraction of SV length (0.33)
--cluster-merge-bnd Max. merging distance for breakend SV candidates (1000)
SV Genotyping parameters:
--genotype-ploidy Sample ploidy (2)
--genotype-error Estimated false positive rate for leads (0.05)
--sample-id Custom ID for this sample (SAMPLE)
--genotype-vcf IN.vcf
Forced calling input.vcf
Multi-Sample Calling / Combine parameters:
--combine-high-confidence
Min fraction of passed QC samples an SV needs (0.0)
--combine-low-confidence
Min fraction of present samples an SV needs (0.2)
--combine-low-confidence-abs
Min number of present samples an SV needs (2)
--combine-null-min-coverage
Min coverage for a genotype to be reported as 0/0 instead of ./. (5)
--combine-match Multiplier for maximum deviation of multiple SV's start/end position for
them to be combined across samples. Given by
max_dev=M*sqrt(min(SV_length_a,SV_length_b)), where M is this parameter
(250)
--combine-match-max Upper limit for the max deviation computed for --combine-match, in bp (1000)
--combine-separate-intra
Disable combination of SVs within the same sample
--combine-output-filtered
Include low-confidence / mosaic SVs in multi-calling
--combine-pair-relabel
Override low-quality genotypes when combining paired samples
--combine-pair-relabel-threshold
Genotype quality minimum before relabeling (20)
--combine-close-handles
Close .SNF file handles after each use to avoid opened files ulimit when
merging many samples.
--combine-pctseq Min alignment distance as percent of SV length to be merged. 0=off (0.7)
Output formatting parameters:
--output-rnames Output names supporting reads in INFO/RNAME
--no-consensus Disable consensus sequence generation for insertion SV calls
--no-sort Do not sort output VCF
--no-progress Disable progress display
--quiet Disable any non-error logging
--max-del-seq-len Max deletion sequence length in output before writing as symbolic \
(50000)
--symbolic Output all SVs as symbolic
--allow-overwrite Allow overwriting existing output files
Mosaic/somatic calling mode parameters:
--mosaic Turn on mosaic calling
--mosaic-af-max Max allele frequency for which SVs are considered mosaic (0.2)
--mosaic-af-min Min allele frequency for mosaic SVs to be output (0.05)
--mosaic-qc-invdup-min-length
Min SV length for mosaic inversion and duplication SVs (500)
--mosaic-qc-coverage-max-change-frac
Max relative coverage change across breakpoints (0.1)
--mosaic-qc-strand Apply filtering based on strand support of calls
--mosaic-include-germline
Report germline SVs as well in mosaic mode
Developer parameters:
--combine-consensus Output the consensus genotype of all samples
--qc-coverage-max-change-frac F
Max relative coverage change across SV breakpoints
`sniffles --example` output
sniffles example commands:
Call SVs for a single sample
-> sniffles --input sorted_indexed_alignments.bam --vcf output.vcf
... OR, with CRAM input and bgzipped+tabix indexed VCF output:
-> sniffles --input sample.cram --vcf output.vcf.gz
... OR, producing only a SNF file with SV candidates:
-> sniffles --input sample1.bam --snf sample1.snf
... OR, simultaneously produce a single-sample VCF and SNF file:
-> sniffles --input sample1.bam --vcf sample1.vcf.gz --snf sample1.snf
... OR, with tandem repeat annotations, reference (for DEL sequences) and mosaic mode for detecting rare SVs:
-> sniffles --input sample1.bam --vcf sample1.vcf.gz --tandem-repeats tandem_repeats.bed --reference genome.fa --mosaic
Multi-sample calling
Step 1. Create .snf for each sample:
-> sniffles --input sample1.bam --snf sample1.snf
Step 2. Combined calling:
-> sniffles --input sample1.snf sample2.snf ... sampleN.snf --vcf multisample.vcf
... OR, using a .tsv file containing a list of .snf files and sample ids (one sample per line):
Step 2. Combined calling:
-> sniffles --input snf_files_list.tsv --vcf multisample.vcf
Determine genotypes for a set of known SVs (force calling)
-> sniffles --input sample.bam --genotype-vcf input_known_svs.vcf --vcf output_genotypes.vcf
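For reviewers unfamiliar with how a separate `--example` flag can coexist with a required `--input`, the following is a minimal sketch using a custom argparse action. It is an illustration under assumptions, not necessarily how this PR implements it: `ExampleAction`, `EXAMPLE_TEXT`, and `build_parser` are hypothetical names, and the example text is shortened.

```python
import argparse

# Shortened example text; the real output is shown above.
EXAMPLE_TEXT = """sniffles example commands:
Call SVs for a single sample
  -> sniffles --input sorted_indexed_alignments.bam --vcf output.vcf
"""


class ExampleAction(argparse.Action):
    """Hypothetical action that prints usage examples and exits, like --help."""

    def __init__(self, option_strings, dest=argparse.SUPPRESS,
                 default=argparse.SUPPRESS, help=None):
        # nargs=0 makes this a bare flag that consumes no value.
        super().__init__(option_strings=option_strings, dest=dest,
                         default=default, nargs=0, help=help)

    def __call__(self, parser, namespace, values, option_string=None):
        print(EXAMPLE_TEXT, end="")
        # Exit with status 0 before argparse checks for required arguments.
        parser.exit()


def build_parser():
    parser = argparse.ArgumentParser(prog="sniffles")
    parser.add_argument("--example", action=ExampleAction,
                        help="Show example usage and exit")
    parser.add_argument("-i", "--input", nargs="+", required=True,
                        metavar="IN")
    return parser
```

Because the action exits during parsing, `sniffles --example` works in this sketch without `--input`, while a bare `sniffles` call still fails with the required-argument error.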