Closed Wasya-the-Wolf closed 3 months ago
Hi @Wasya-the-Wolf ,
When you say "could not be called", does that mean the variant is absent from the VCF or that it is a REFCALL? The thresholds are set in the first step of DeepVariant, make_examples, and any candidates passing the frequency thresholds are then run through the CNN for genotyping.
Thank you very much for the useful information. May I ask which parameters I need to set in make_examples to adjust the VAF threshold?
@Wasya-the-Wolf , the default parameters should be able to call variants with high sensitivity. Can you please explain this part of your question:
When you say "could not be called", does that mean the variant is absent from the VCF or it's a REFCALL?
Yes, it is absent from the VCF, not REFCALL. I have checked the raw vcf files, and it turned out that these variants did not appear in my output.
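For anyone checking the same thing, here is a minimal sketch for telling "absent" apart from "RefCall" at a given site. The helper name site_status is hypothetical and not part of DeepVariant. It assumes a standard tab-separated VCF (plain or gzipped) with FILTER in column 7, and a gzip that passes uncompressed input through unchanged when given -f.

```shell
# Hypothetical helper: report how a site appears in a VCF.
# Usage: site_status <vcf or vcf.gz> <chrom> <pos>
site_status() {
  # gzip -cdf decompresses .gz input and copies plain text through as-is
  gzip -cdf "$1" | awk -F'\t' -v c="$2" -v p="$3" '
    /^#/ { next }                              # skip header lines
    $1 == c && $2 == p { print $7; found = 1 } # column 7 is FILTER
    END { if (!found) print "absent" }
  '
}
```

A site that prints RefCall was considered as a candidate and genotyped as reference, while "absent" means no record exists at that position in the file.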
@Wasya-the-Wolf ,
I'd suggest using:
--make_examples_extra_args "vsc_min_fraction_indels=0.10,vsc_min_fraction_snps=0.10"
Set these to your desired fraction. The default is already low for WES, which means those variants should appear in the output, but you can try a smaller value and see whether it rescues some of these variants.
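A small sketch of how the option value could be assembled before adding it to a run_deepvariant command. The helper name build_vsc_args and the 0.05 values are illustrative assumptions, not recommendations. The block only prints the option (a dry run) and does not invoke DeepVariant.

```shell
# Hypothetical helper: build the value for --make_examples_extra_args.
# Arguments: <min indel fraction> <min SNP fraction>
build_vsc_args() {
  printf 'vsc_min_fraction_indels=%s,vsc_min_fraction_snps=%s' "$1" "$2"
}

# Illustrative thresholds only; choose values that suit your data
EXTRA_ARGS="$(build_vsc_args 0.05 0.05)"

# Dry run: print the option to append to the run_deepvariant command
echo "--make_examples_extra_args \"${EXTRA_ARGS}\""
```

Keeping the two fractions as separate arguments makes it easy to relax the SNP and indel thresholds independently.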
Thank you very much for your help! Before adding parameters, I have a small question: what does the vsc_min_fraction parameter do?
time docker run -it \
google/deepvariant:1.6.1 \
/opt/deepvariant/bin/make_examples --helpfull | grep 'vsc_min_fraction_snps' -A 5
Shows:
--vsc_min_fraction_snps: SNP alleles occurring at least this fraction of all
counts in our AlleleCount will be advanced as candidates.
(default: '0.12')
You can see the full set of parameters by removing the grep. Please also consider reading "How DeepVariant works" and the DeepVariant manuscript for more details: https://www.nature.com/articles/nbt.4235
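To make the help text above concrete, here is a simplified illustration of the fraction test it describes: an allele passes if its share of all counts is at least the threshold. This is not DeepVariant's actual code. The function name passes_vsc_fraction is hypothetical, and the real candidate step may apply further criteria.

```shell
# Simplified illustration, not DeepVariant's implementation.
# Usage: passes_vsc_fraction <alt_count> <total_count> <min_fraction>
# Exit status 0 if alt_count / total_count >= min_fraction, else 1.
passes_vsc_fraction() {
  awk -v a="$1" -v t="$2" -v f="$3" 'BEGIN { exit !(t > 0 && a / t >= f) }'
}

# 30 supporting reads out of 100 (VAF 0.30) against the 0.12 SNP default
if passes_vsc_fraction 30 100 0.12; then
  echo "candidate"
else
  echo "not a candidate"
fi
```

Under this simplified test, a VAF of 0.3 clears the 0.12 default, which matches the earlier comment that such variants would be expected to pass this particular threshold.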
OK, thank you!
I also recommend reading this FAQ entry: https://github.com/google/deepvariant/blob/r1.6.1/docs/FAQ.md#why-does-deepvariant-not-call-a-specific-variant-in-my-data
I will close this issue for now. Please reopen if you have further questions.
Hello, thanks for this fast and useful germline calling tool. When I used DeepVariant 1.6.0 for single-sample WES germline calling, I found that some real germline mutations with VAF (variant allele frequency) values less than 0.3 to 0.4 could not be called. IGV views of these variants are below. May I ask whether DeepVariant considers VAF during runtime or applies threshold filtering on VAF? We look forward to your reply.
Our code (all variables have been defined):
singularity run \
  -B "${INPUT_DIR}":"/input","${OUTPUT_DIR}":"/output" \
  deepvariant_1.6.0.sif \
  /opt/deepvariant/bin/run_deepvariant \
  --model_type=WES \
  --ref=/input/testinput/human_g1k_v37_modified.fasta \
  --reads=/input/${i}.sorted.markdup.BQSR.bam \
  --regions /input/testinput/use_agilent_region_padding_100.bed \
  --output_vcf=/output/${i}.vcf.gz \
  --output_gvcf=/output/${i}.g.vcf.gz \
  --intermediate_results_dir /output/intermediate_results_dir/${i} \
  --num_shards=10
IGV figures: