Closed mvelinder closed 3 years ago
Hi Matt, ik spreek het een beetje. you'll need to do joint-calling for most tools (including slivar) to work as expected. you can do this with glnexus as per the docs for DeepTrio: https://github.com/google/deepvariant/blob/r1.2/docs/deeptrio-wgs-case-study.md#merge-vcfs-using-glnexus
Thanks Brent!
I got this working using a Singularity-based approach on our Utah HPC. I normalized the merged VCF with bcftools norm -m - -w 10000 -f $fasta $mergedvcf -O z -o $normvcf
and the variant record in question now looks like this
chrX 18604572 chrX_18604572_C_T C T 9 . AF=0.166667;AQ=9 GT:DP:AD:GQ:PL:RNC 0/0:23:23,0:48:0,93,929:.. 0/0:29:29,0:50:0,87,869:.. 0/1:56:31,25:10:9,0,34:..
However, slivar is still not prioritizing it as a de novo (using this single variant record test VCF). Here's what I'm running and my output:
+ vcf=trio_merged.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz
+ ped=A1167-210621-VAR-Botto-ASIA.ped
+ fasta=/scratch/ucgd/lustre/work/u0691312/reference/ucgd_reference/GRCh38/human_g1k_v38_decoy_phix.fasta
+ gnomad=/scratch/ucgd/lustre-work/marth/u0691312/reference/gnomad_3.1/gnomad.genomes.v3.1.sites.slivar.zip
+ js=/uufs/chpc.utah.edu/common/HIPAA/u0691312/bin/slivar/js/slivar-functions.js
+ slivar_static expr --vcf trio_merged.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz --ped A1167-210621-VAR-Botto-ASIA.ped --pass-only --js /uufs/chpc.utah.edu/common/HIPAA/u0691312/bin/slivar/js/slivar-functions.js -g /scratch/ucgd/lustre-work/marth/u0691312/reference/gnomad_3.1/gnomad.genomes.v3.1.sites.slivar.zip --info 'INFO.gnomad_popmax_af < 0.01 && variant.FILTER == "PASS" && variant.ALT[0] != "*"' --family-expr 'denovo:fam.every(segregating_denovo) && INFO.gnomad_popmax_af < 0.001' --family-expr 'x_denovo:(variant.CHROM == "X" || variant.CHROM == "chrX") && fam.every(segregating_denovo_x) && INFO.gnomad_popmax_af < 0.001' --family-expr 'recessive:fam.every(segregating_recessive)' --family-expr 'dominant:fam.every(segregating_dominant)' -o trio_merged.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz.slivar.vcf.gz
> slivar version: 0.2.2 186b862063ce50ee1d282bc610196630c4ecac61
[slivar] 3 samples matched in VCF and PED to be evaluated
[slivar] message for /scratch/ucgd/lustre-work/marth/u0691312/reference/gnomad_3.1/gnomad.genomes.v3.1.sites.slivar.zip:
> created on:2020-12-12
[slivar] Finished. evaluated 1 total variants and wrote 0 variants that passed your slivar expressions.
sample denovo x_denovo recessive dominant
UDN492951 0 0 0 0
UDN652881 0 0 0 0
UDN776853 0 0 0 0
Any ideas?
what's in your pedigree file?
also, just to clarify, that's a different variant that has been joint-called. The first one is just a merged GVCF. So make sure you're applying the slivar expressions to the correct VCFs
Here's the DeepTrio test VCF and the ped file contents. Without the VCF header.
$ bcftools view -H trio_merged.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz
chrX 18604572 chrX_18604572_C_T C T 9 . AF=0.166667;AQ=9 GT:DP:AD:GQ:PL:RNC 0/0:23:23,0:48:0,93,929:.. 0/0:29:29,0:50:0,87,869:.. 0/1:56:31,25:10:9,0,34:..
$ cat A1167-210621-VAR-Botto-ASIA.ped
#Kindred_ID Sample_ID Paternal_ID Maternal_ID Sex Affection_Status Project
UDN_X1 UDN492951 0 0 1 1 A1167-210621-VAR-Botto-ASIA
UDN_X1 UDN652881 0 0 2 1 A1167-210621-VAR-Botto-ASIA
UDN_X1 UDN776853 UDN492951 UDN652881 2 2 A1167-210621-VAR-Botto-ASIA
Here's a GATK test VCF and the same ped file as above, where I do get the variant prioritized
$ bcftools view -H A1167-210621-VAR-Botto-ASIA_Final_1624376012.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz
chrX 18604572 . C T 564.83 PASS AC=1;AF=0.00172;AN=580;AS_BaseQRankSum=3.104;AS_FS=1.024;AS_InbreedingCoeff=-0.0023;AS_MQ=60;AS_MQRankSum=0;AS_QD=10.27;AS_ReadPosRankSum=-0.963;AS_SOR=0.532;BaseQRankSum=3.104;ClippingRankSum=0;DP=7382;ExcessHet=3.0103;FS=1.024;InbreedingCoeff=-0.0023;MLEAC=1;MLEAF=0.00172;MQ=60;MQ0=0;MQRankSum=0;QD=10.27;ReadPosRankSum=-0.963;SOR=0.532;VQSLOD=3.4898;culprit=DP;CSQ=T|stop_gained|HIGH|CDKL5|ENSG00000008086|Transcript|ENST00000379989|protein_coding|13/22||||1933|1648|550|R/*|Cga/Tga|rs267608643&CM081204&COSM1625757||1||SNV|HGNC|HGNC:11411|YES||1||CCDS14186.1|ENSP00000369325|O76039||UPI0000136103|1|||PANTHER:PTHR24056&PANTHER:PTHR24056:SF111&MobiDB_lite:mobidb-lite&MobiDB_lite:mobidb-lite|||||||||||||||||||||pathogenic|0&0&1|1&1&1|18063413&21318334||||||0.128|,T|stop_gained|HIGH|CDKL5|ENSG00000008086|Transcript|ENST00000379996|protein_coding|12/21||||1923|1648|550|R/*|Cga/Tga|rs267608643&CM081204&COSM1625757||1||SNV|HGNC|HGNC:11411|||1||CCDS14186.1|ENSP00000369332|O76039||UPI0000136103|1|||PANTHER:PTHR24056&PANTHER:PTHR24056:SF111&MobiDB_lite:mobidb-lite&MobiDB_lite:mobidb-lite|||||||||||||||||||||pathogenic|0&0&1|1&1&1|18063413&21318334||||||0.128|,T|stop_gained|HIGH|CDKL5|ENSG00000008086|Transcript|ENST00000463994|protein_coding|12/16||||1926|1648|550|R/*|Cga/Tga|rs267608643&CM081204&COSM1625757||1|cds_end_NF|SNV|HGNC|HGNC:11411|||5|||ENSP00000485184||A0A096LNR9|UPI0004F2369C|1|||PANTHER:PTHR24056:SF111&PANTHER:PTHR24056&MobiDB_lite:mobidb-lite&MobiDB_lite:mobidb-lite|||||||||||||||||||||pathogenic|0&0&1|1&1&1|18063413&21318334||||||0.128|,T|stop_gained|HIGH|CDKL5|ENSG00000008086|Transcript|ENST00000623535|protein_coding|11/17||||1648|1648|550|R/*|Cga/Tga|rs267608643&CM081204&COSM1625757||1||SNV|HGNC|HGNC:11411|||1|P1|CCDS83458.1|ENSP00000485244|O76039||UPI0002390615|1|||PANTHER:PTHR24056&PANTHER:PTHR24056:SF111&MobiDB_lite:mobidb-lite&MobiDB_lite:mobidb-lite|||||||||||||||||||||pathogenic|0&0&1|1&1&1|18063413&21318334||||||0.128|,T|stop_gained|HIGH|CDKL5|ENSG00000008086|Transcript|ENST00000635828|protein_coding|12/18||||2014|1648|550|R/*|Cga/Tga|rs267608643&CM081204&COSM1625757||1||SNV|HGNC|HGNC:11411|||5|||ENSP00000490170||A0A1B0GUM4|UPI0007E52D32|1|||PANTHER:PTHR24056:SF111&PANTHER:PTHR24056&MobiDB_lite:mobidb-lite&MobiDB_lite:mobidb-lite|||||||||||||||||||||pathogenic|0&0&1|1&1&1|18063413&21318334||||||0.128|,T|downstream_gene_variant|MODIFIER|CDKL5|ENSG00000008086|Transcript|ENST00000637881|protein_coding||||||||||rs267608643&CM081204&COSM1625757|235|1|cds_end_NF|SNV|HGNC|HGNC:11411|||5|||ENSP00000489879||A0A1B0GTX4|UPI0007E52C7A|1||||||||||||||||||||||||pathogenic|0&0&1|1&1&1|18063413&21318334||||||0.128| GT:AD:DP:GQ:PL:SAC 0/0:27,0:27:64:0,64,691:. 0/0:38,0:38:99:0,104,1158:. 0/1:30,25:55:99:611,0,658:13,17,12,13
And the output from slivar:
+ vcf=A1167-210621-VAR-Botto-ASIA_Final_1624376012.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz
+ ped=A1167-210621-VAR-Botto-ASIA.ped
+ fasta=/scratch/ucgd/lustre/work/u0691312/reference/ucgd_reference/GRCh38/human_g1k_v38_decoy_phix.fasta
+ gnomad=/scratch/ucgd/lustre-work/marth/u0691312/reference/gnomad_3.1/gnomad.genomes.v3.1.sites.slivar.zip
+ js=/uufs/chpc.utah.edu/common/HIPAA/u0691312/bin/slivar/js/slivar-functions.js
+ slivar_static expr --vcf A1167-210621-VAR-Botto-ASIA_Final_1624376012.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz --ped A1167-210621-VAR-Botto-ASIA.ped --pass-only --js /uufs/chpc.utah.edu/common/HIPAA/u0691312/bin/slivar/js/slivar-functions.js -g /scratch/ucgd/lustre-work/marth/u0691312/reference/gnomad_3.1/gnomad.genomes.v3.1.sites.slivar.zip --info 'INFO.gnomad_popmax_af < 0.01 && variant.FILTER == "PASS" && variant.ALT[0] != "*"' --family-expr 'denovo:fam.every(segregating_denovo) && INFO.gnomad_popmax_af < 0.001' --family-expr 'x_denovo:(variant.CHROM == "X" || variant.CHROM == "chrX") && fam.every(segregating_denovo_x) && INFO.gnomad_popmax_af < 0.001' --family-expr 'recessive:fam.every(segregating_recessive)' --family-expr 'dominant:fam.every(segregating_dominant)' -o A1167-210621-VAR-Botto-ASIA_Final_1624376012.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz.slivar.vcf.gz
> slivar version: 0.2.2 186b862063ce50ee1d282bc610196630c4ecac61
[slivar] 3 samples matched in VCF and PED to be evaluated
[slivar] message for /scratch/ucgd/lustre-work/marth/u0691312/reference/gnomad_3.1/gnomad.genomes.v3.1.sites.slivar.zip:
> created on:2020-12-12
[slivar] Finished. evaluated 1 total variants and wrote 1 variants that passed your slivar expressions.
sample denovo x_denovo recessive dominant
UDN492951 0 0 0 0
UDN652881 0 0 0 0
UDN776853 1 1 0 0
So it seems DeepTrio VCF specific, or maybe there's something in the slivar-functions.js that is causing it to be filtered out?
In the first vcf line you pasted, the kid has a GQ of 10. slivar default cutoff is 20.
Yep, that was it. I manually changed the variant GQ to 25 and it worked.
chrX 18604572 chrX_18604572_C_T C T 9 . AF=0.166667;AQ=9 GT:DP:AD:GQ:PL:RNC 0/0:23:23,0:48:0,93,929:.. 0/0:29:29,0:50:0,87,869:.. 0/1:56:31,25:25:9,0,34:..
+ vcf=trio_merged.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz
+ ped=A1167-210621-VAR-Botto-ASIA.ped
+ fasta=/scratch/ucgd/lustre/work/u0691312/reference/ucgd_reference/GRCh38/human_g1k_v38_decoy_phix.fasta
+ gnomad=/scratch/ucgd/lustre-work/marth/u0691312/reference/gnomad_3.1/gnomad.genomes.v3.1.sites.slivar.zip
+ js=/uufs/chpc.utah.edu/common/HIPAA/u0691312/bin/slivar/js/slivar-functions.js
+ slivar_static expr --vcf trio_merged.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz --ped A1167-210621-VAR-Botto-ASIA.ped --pass-only --js /uufs/chpc.utah.edu/common/HIPAA/u0691312/bin/slivar/js/slivar-functions.js -g /scratch/ucgd/lustre-work/marth/u0691312/reference/gnomad_3.1/gnomad.genomes.v3.1.sites.slivar.zip --info 'INFO.gnomad_popmax_af < 0.01 && variant.FILTER == "PASS" && variant.ALT[0] != "*"' --family-expr 'denovo:fam.every(segregating_denovo) && INFO.gnomad_popmax_af < 0.001' --family-expr 'x_denovo:(variant.CHROM == "X" || variant.CHROM == "chrX") && fam.every(segregating_denovo_x) && INFO.gnomad_popmax_af < 0.001' --family-expr 'recessive:fam.every(segregating_recessive)' --family-expr 'dominant:fam.every(segregating_dominant)' -o trio_merged.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz.slivar.vcf.gz
> slivar version: 0.2.2 186b862063ce50ee1d282bc610196630c4ecac61
[W::hts_idx_load3] The index file is older than the data file: trio_merged.vcf.gz.norm.vcf.gz.denovo.test.vcf.gz.tbi
[slivar] 3 samples matched in VCF and PED to be evaluated
[slivar] message for /scratch/ucgd/lustre-work/marth/u0691312/reference/gnomad_3.1/gnomad.genomes.v3.1.sites.slivar.zip:
> created on:2020-12-12
[slivar] Finished. evaluated 1 total variants and wrote 1 variants that passed your slivar expressions.
sample denovo x_denovo recessive dominant
UDN492951 0 0 0 0
UDN652881 0 0 0 0
UDN776853 1 1 0 0
I'll consider dropping that GQ threshold in my slivar-functions.js - thanks dude!
Spreekt u Engels? Hope you're well friend!
I'm trying to use slivar for DeepTrio calls. They are called individually as GVCFs. I then merged them using
bcftools merge -0
and ran slivar using the standard approach for rare disease with theslivar-functions.js
and my expressions--family-expr 'denovo:fam.every(segregating_denovo) && INFO.gnomad_popmax_af < 0.001'
and--family-expr 'x_denovo:(variant.CHROM == "X" || variant.CHROM == "chrX") && fam.every(segregating_denovo_x) && INFO.gnomad_popmax_af < 0.001'
The variant in question is a de novo (also called by GATK - which is where we originally found it). Here's the VCF record for the merged trio VCF
Even with the 0/0 genotypes in both parents this doesn't meet my slivar criteria. Is this because of the missing/NULL fields for GT:GQ:DP:AD that the .js funtctions are looking for?
Here's the full output, you can ignore the compound het failures since it's not an annotated VCF.
Any help would be great! Or I'd be happy to work together to make a .js and/or command for best practices for DeepTrio calls.