Open iskandr opened 5 years ago
One possible wrinkle: Manta requires Python 2.6 or 2.7. I'm running it inside a Python 3 conda env, but I think it's picking up the base installed Python:
The configManta.py script starts with:
#!/usr/bin/env python2
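One way to see which interpreter that shebang would pick up is to repeat the same PATH lookup that `env` performs. The following is only a sketch; the helper name `resolve_python2` is made up for illustration and is not part of Manta.

```python
import shutil
import subprocess


def resolve_python2(name="python2"):
    """Return (path, version banner) for the first `name` found on PATH, or None.

    Hypothetical helper: it mirrors what `/usr/bin/env python2` would resolve to
    in the current environment.
    """
    path = shutil.which(name)  # same PATH-based lookup that `env` uses
    if path is None:
        return None
    # Python 2 prints its version to stderr and Python 3 to stdout, so merge both.
    proc = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return path, proc.stdout.strip()
```

Calling `resolve_python2()` inside the activated conda env shows whether `python2` comes from the env or from the base system; a result of `None` means no `python2` is on PATH at all.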
Files generated by Manta:
*diploidSV.vcf.gz*
SVs and indels scored and genotyped under a diploid model for the set of samples in a joint diploid sample analysis or for the normal sample in a tumor/normal subtraction analysis. In the case of a tumor/normal subtraction, the scores in this file do not reflect any information from the tumor sample.
*somaticSV.vcf.gz*
SVs and indels scored under a somatic variant model. This file will only be produced if a tumor sample alignment file is supplied during configuration.
*candidateSV.vcf.gz*
Unscored SV and indel candidates. Only a minimal amount of supporting evidence is required for an SV to be entered as a candidate in this file. An SV or indel must be a candidate to be considered for scoring; therefore, an SV cannot appear in the other VCF outputs if it is not present in this file. Note that by default this file includes indels of size 8 and larger. The smallest indels in this set are intended to be passed on to a small variant caller without scoring by Manta itself (by default, Manta scoring starts at size 50).
*candidateSmallIndels.vcf.gz*
Subset of the candidateSV.vcf.gz file containing only simple insertion and deletion variants less than the minimum scored variant size (50 by default). Passing this file to a small variant caller will provide continuous coverage over all indel sizes when the small variant caller and Manta outputs are evaluated together. Alternate small indel candidate sets can be parsed out of the candidateSV.vcf.gz file if this candidate set is not appropriate.
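As a rough illustration of parsing an alternate small indel set out of candidateSV.vcf.gz, here is a minimal sketch. It assumes the candidate records carry `SVTYPE` and `SVLEN` keys in the INFO column and simply skips records without `SVLEN`; the function names and the `max_size` parameter are made up for this example.

```python
import gzip


def small_indel_records(lines, max_size=50):
    """Yield header lines plus INS/DEL records with |SVLEN| below max_size.

    Assumption: INFO holds SVTYPE and SVLEN; records lacking SVLEN are dropped.
    """
    for line in lines:
        if line.startswith("#"):
            yield line  # keep the VCF header intact
            continue
        fields = line.rstrip("\n").split("\t")
        info = dict(kv.split("=", 1) for kv in fields[7].split(";") if "=" in kv)
        if info.get("SVTYPE") not in ("INS", "DEL") or "SVLEN" not in info:
            continue
        # SVLEN is negative for deletions; use the first value if several are listed.
        if abs(int(info["SVLEN"].split(",")[0])) < max_size:
            yield line


def read_vcf_gz(path):
    """Yield text lines from a gzip-compressed VCF."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            yield line
```

For example, `small_indel_records(read_vcf_gz("candidateSV.vcf.gz"), max_size=30)` would keep only simple indels shorter than 30 bases; the threshold of 30 is arbitrary.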
The passing somatic structural variants are in somaticSV.vcf.gz. The smaller indels get filtered out into candidateSmallIndels.vcf.gz, which should be used as an input to Strelka2.
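To pull out just the passing calls, one option is to keep records whose FILTER column is `PASS`. This is a minimal sketch using only the standard VCF column layout; the function names are made up for illustration.

```python
import gzip


def passing_records(lines):
    """Yield VCF data lines whose FILTER column (7th field) is exactly PASS."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.split("\t")
        if len(fields) > 6 and fields[6] == "PASS":
            yield line


def count_passing(path):
    """Count PASS records in a gzip-compressed VCF such as somaticSV.vcf.gz."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in passing_records(handle))
```

A call like `count_passing("somaticSV.vcf.gz")` gives a quick sanity check on how many somatic SVs survived filtering.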
Installation:
Usage:
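As a sketch of what a tumor/normal configuration call can look like, the helper below assembles a `configManta.py` command line. The function name and every file path are placeholders invented for this example; the flags themselves (`--normalBam`, `--tumorBam`, `--referenceFasta`, `--runDir`, `--exome`, `--callRegions`) are configManta.py options.

```python
def config_manta_command(normal_bam, tumor_bam, reference_fasta, run_dir,
                         exome=False, call_regions=None,
                         script="configManta.py"):
    """Build the argument list for a tumor/normal Manta configuration step.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    cmd = [
        script,
        "--normalBam", normal_bam,
        "--tumorBam", tumor_bam,
        "--referenceFasta", reference_fasta,
        "--runDir", run_dir,
    ]
    if exome:
        cmd.append("--exome")  # omit for WGS data
    if call_regions is not None:
        cmd += ["--callRegions", call_regions]  # BED file of regions to call
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, for example with `config_manta_command("normal.bam", "tumor.bam", "ref.fa", "manta_run", exome=True)`.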
Followed by:
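The configuration step writes a `runWorkflow.py` script into the run directory, and executing that script performs the analysis. The sketch below builds that second command and lists where the VCFs described above are expected to land; it assumes the usual `results/variants` layout under the run directory, and the helper names and default job count are made up for illustration.

```python
import os


def run_workflow_command(run_dir, jobs=8):
    """Build the command that executes a configured Manta run with `jobs` parallel jobs."""
    return [os.path.join(run_dir, "runWorkflow.py"), "-j", str(jobs)]


def expected_outputs(run_dir, with_tumor=True):
    """Return expected VCF paths, assuming outputs land in <run_dir>/results/variants."""
    names = ["diploidSV.vcf.gz", "candidateSV.vcf.gz", "candidateSmallIndels.vcf.gz"]
    if with_tumor:
        # somaticSV.vcf.gz is only produced when a tumor sample was configured
        names.append("somaticSV.vcf.gz")
    variants_dir = os.path.join(run_dir, "results", "variants")
    return [os.path.join(variants_dir, name) for name in names]
```

After `subprocess.run(run_workflow_command("manta_run"), check=True)` finishes, checking `os.path.exists` on each entry of `expected_outputs("manta_run")` is a cheap way to confirm the run produced what the later steps need.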
Notes:
- --exome flag: for WGS data (which is less likely to have very deep coverage regions), omit this flag.
- --callRegions flag: this expects a BED file, for example this GRCh38-specific BED file: https://github.com/Illumina/manta/blob/master/docs/userGuide/README.md#extended-use-cases. Other genomes will need their own BED file, and this option should be omitted for custom genomes.