bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License
Can't add extra parameters to mutect2 #3270

Closed waemm closed 4 years ago

waemm commented 4 years ago

Version info

Hi guys,

I am trying to pass the "--max-mnp-distance 0" option to mutect2 in order to build a panel of normals (PON) database. No matter how I specify it, the parameter never makes it into the final YAML file. Is there something I am doing wrong here?

Thanks!

# Tumor run template
details:
  - algorithm:
      aligner: false
      mark_duplicates: false
      recalibrate: false
      remove_lcr: false
      variantcaller: mutect2
      ensemble:
        numpass: 2
      variant_regions: /shared/pipeline-user/run_data/baits/sureselectv7/S31285117_Padded.bed
      vcfanno: [somatic]
      tools_on:
        - damage_filter
    analysis: variant2
    genome_build: hg38
    description: TD1717_test
  - resources:
      mutect2:
        options: [" --max-mnp-distance","0"]
fc_date: '2020-02-10'
fc_name: 'tumor_only'
upload:
  dir: ../final

or

# Tumor run template
details:
  - algorithm:
      aligner: false
      mark_duplicates: false
      recalibrate: false
      remove_lcr: false
      variantcaller: mutect2
      ensemble:
        numpass: 2
      variant_regions: /shared/pipeline-user/run_data/baits/sureselectv7/S31285117_Padded.bed
      vcfanno: [somatic]
      tools_on:
        - damage_filter
    analysis: variant2
    genome_build: hg38
    description: TD1717_test
  - resources:
      mutect2:
        options:
          - --max-mnp-distance
          - 0
fc_date: '2020-02-10'
fc_name: 'tumor_only'
upload:
  dir: ../final

waemm commented 4 years ago

@naumenko-sa @roryk Guys, can anyone help me with this? I am sure it must be something simple but I have not managed to get this working. Thanks!

naumenko-sa commented 4 years ago

Hi Warren @waemm!

I think I'm following in your footsteps with the PureCN user story.

In bcbio, mutect2 requires a sample with phenotype: tumor plus either a matched normal or a panel of normals. For tumor-only projects we use vardict.

So, currently, it is not possible to create a PON for small variants in bcbio with mutect2. I do it outside of bcbio with:

# Step 1: call one normal sample with Mutect2 (run once per sample)
# $1 = sample_N.bam
# $2 = panel.interval_list

# -tumor is deprecated

bname=$(basename "$1" .bam)

gatk Mutect2 \
-R /data/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa \
-I "$1" \
-O "$bname.vcf.gz" \
--max-mnp-distance 0 \
--intervals "$2" \
--interval-padding 50 \
--germline-resource af-only-gnomad.hg38.vcf.gz \
--genotype-germline-sites

tabix -f "$bname.vcf.gz"

# Step 2: combine the per-sample VCFs into a PON
# (the loop expects files named *.for_pon.vcf.gz)
vcf_files=""
for f in *.for_pon.vcf.gz
do 
    vcf_files="$vcf_files -V $f"
done

gatk3 -Xmx12g \
-T CombineVariants \
--minimumN 3 \
-R /data/genomes/Hsapiens/hg38/seq/hg38.fa \
-o snv_pon.vcf \
$vcf_files

bgzip snv_pon.vcf
tabix snv_pon.vcf.gz
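
The Mutect2 step above writes `$bname.vcf.gz`, while the combine loop globs for `*.for_pon.vcf.gz`, so the outputs need to be named or renamed accordingly somewhere in between. A minimal sketch of one way to keep the two steps in sync, assuming the `.for_pon.vcf.gz` suffix is the intended naming; `pon_vcf_name` is a hypothetical helper, not part of bcbio or GATK:

```shell
# Hypothetical helper: derive the per-sample output name that the
# combine loop's *.for_pon.vcf.gz glob will pick up.
# $1 = path to a BAM file, e.g. /path/to/sample_N.bam
pon_vcf_name() {
    # Strip the directory and the .bam suffix, then append the PON suffix
    printf '%s.for_pon.vcf.gz\n' "$(basename "$1" .bam)"
}
```

In the Mutect2 call this would replace the output argument, for example `-O "$(pon_vcf_name "$1")"`, followed by `tabix -f "$(pon_vcf_name "$1")"`.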

I'm trying to bring it to bcbio as part of the PureCN effort.

One complication: a PON generated by CreateSomaticPanelOfNormals does not work with PureCN, so we might need two kinds of PON in bcbio.

S.

waemm commented 4 years ago

@naumenko-sa Thanks Sergey, yes I was more or less following this approach. Would be great to see it in bcbio!