Open matomol opened 1 month ago
Sorry, a typo: not basecalling but variant calling, of course.
Hi @matomol, apologies for the delay in responding. Please can you provide a bit more information so I can assist you better - by 'snpid', do you mean the dbSNP identifier? We perform annotation with SnpEff as follows: first to add basic annotations, and then to annotate using ClinVar. The ClinVar VCF we use is out of date so we are in the process of updating that, but there won't be any dbSNP/rsIDs in the output VCFs as we are not using this dataset to annotate.
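Because the output VCFs carry no rsIDs, one possible workaround is to fill the ID column afterwards from a separate lookup. The following is only a minimal sketch of that idea, not part of the workflow: the function name `attach_ids`, the lookup dictionary, the contig `chrT`, and the identifier `rsTOY1` are all made-up placeholders, and a real lookup would have to be built from a dbSNP release that matches the reference genome.

```python
def attach_ids(vcf_lines, id_lookup):
    """Yield VCF lines, filling a missing ID column from id_lookup.

    id_lookup maps (chrom, pos, ref, alt) -> identifier string.
    Header lines and records that already have an ID are passed through.
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            yield line
            continue
        fields = line.split("\t")
        chrom, pos, vid, ref, alt = fields[:5]
        if vid == ".":
            ids = []
            # A multi-allelic record lists its ALT alleles comma-separated.
            for allele in alt.split(","):
                key = (chrom, int(pos), ref, allele)
                if key in id_lookup:
                    ids.append(id_lookup[key])
            if ids:
                # The VCF ID column is a semicolon-separated list.
                fields[2] = ";".join(ids)
        yield "\t".join(fields)


# Toy data only: placeholder contig, positions, and identifier.
toy_lookup = {("chrT", 100, "G", "A"): "rsTOY1"}
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrT\t100\t.\tG\tA\t50\tPASS\t.",
    "chrT\t200\t.\tC\tT\t50\tPASS\t.",
]
annotated = list(attach_ids(toy_vcf, toy_lookup))
for record in annotated:
    print(record)
```

In practice an existing tool is probably the better choice; for example, `bcftools annotate` can copy the ID column from an indexed dbSNP VCF via `-a` and `-c ID`, provided the contig names and reference build agree.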
Please find below the summary that I did for just one gene, LDLR. Similar statistics will be prepared for all the genes on the Illumina panel once we have succeeded in lifting it over.
"The ClinVar VCF we use is out of date so we are in the process of updating that"
Well, maybe that will solve most of the problem.
Only the following two SNPs in the tested region are correctly annotated and have a SNPID attached:
| snpid | alleles | reference | alternatives |
|-------|---------|-----------|--------------|
| 5930  | (A, G)  | A         | (G,)         |
| 5927  | (A, G)  | A         | (G,)         |
The following SNPs were correctly found by both Illumina and Nanopore, but only Illumina attached the correct SNPID.
rs11669576 ANN = ('A|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_000527.5|protein_coding|8/18|c.1171G>A|p.Ala391Thr|1257/5173|1171/2583|391/860||', 'A|missense_variant|MODERATE|LDLR|LDLR|transcript|XM_011528010.2|protein_coding|8/17|c.1171G>A|p.Ala391Thr|1288/5126|1171/2505|391/834||', 'A|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_001195798.2|protein_coding|8/18|c.1171G>A|p.Ala391Thr|1257/5167|1171/2577|391/858||', 'A|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_001195799.2|protein_coding|7/17|c.1048G>A|p.Ala350Thr|1134/5050|1048/2460|350/819||', 'A|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_001195800.2|protein_coding|6/16|c.667G>A|p.Ala223Thr|753/4669|667/2079|223/692||', 'A|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_001195803.2|protein_coding|7/16|c.790G>A|p.Ala264Thr|876/4639|790/2049|264/682||', 'A|upstream_gene_variant|MODIFIER|MIR6886|MIR6886|transcript|NR_106946.1|pseudogene||n.-1850G>A|||||1850|', 'A|upstream_gene_variant|MODIFIER|MIR6886|MIR6886|transcript|unassigned_transcript_3212|miRNA||n.-1855G>A|||||1855|', 'A|upstream_gene_variant|MODIFIER|MIR6886|MIR6886|transcript|unassigned_transcript_3213|miRNA||n.-1887G>A|||||1887|', 'A|non_coding_transcript_exon_variant|MODIFIER|LDLR|LDLR|transcript|XR_001753685.2|pseudogene|8/18|n.1288G>A||||||', 'A|non_coding_transcript_exon_variant|MODIFIER|LDLR|LDLR|transcript|XR_001753686.2|pseudogene|8/17|n.1288G>A||||||')
rs72658861 ANN = ('C|splice_region_variant&intron_variant|LOW|LDLR|LDLR|transcript|NM_000527.5|protein_coding|7/17|c.1061-8T>C||||||', 'C|splice_region_variant&intron_variant|LOW|LDLR|LDLR|transcript|XM_011528010.2|protein_coding|7/16|c.1061-8T>C||||||', 'C|splice_region_variant&intron_variant|LOW|LDLR|LDLR|transcript|XR_001753685.2|pseudogene|7/17|n.1178-8T>C||||||', 'C|splice_region_variant&intron_variant|LOW|LDLR|LDLR|transcript|XR_001753686.2|pseudogene|7/16|n.1178-8T>C||||||', 'C|splice_region_variant&intron_variant|LOW|LDLR|LDLR|transcript|NM_001195798.2|protein_coding|7/17|c.1061-8T>C||||||', 'C|splice_region_variant&intron_variant|LOW|LDLR|LDLR|transcript|NM_001195799.2|protein_coding|6/16|c.938-8T>C||||||', 'C|splice_region_variant&intron_variant|LOW|LDLR|LDLR|transcript|NM_001195800.2|protein_coding|5/15|c.557-8T>C||||||', 'C|splice_region_variant&intron_variant|LOW|LDLR|LDLR|transcript|NM_001195803.2|protein_coding|6/15|c.680-8T>C||||||', 'C|upstream_gene_variant|MODIFIER|MIR6886|MIR6886|transcript|NR_106946.1|pseudogene||n.-1968T>C|||||1968|', 'C|upstream_gene_variant|MODIFIER|MIR6886|MIR6886|transcript|unassigned_transcript_3212|miRNA||n.-1973T>C|||||1973|', 'C|upstream_gene_variant|MODIFIER|MIR6886|MIR6886|transcript|unassigned_transcript_3213|miRNA||n.-2005T>C|||||2005|')
rs45508991 ANN = ('T|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_000527.5|protein_coding|15/18|c.2177C>T|p.Thr726Ile|2263/5173|2177/2583|726/860||', 'T|missense_variant|MODERATE|LDLR|LDLR|transcript|XM_011528010.2|protein_coding|15/17|c.2177C>T|p.Thr726Ile|2294/5126|2177/2505|726/834||', 'T|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_001195798.2|protein_coding|15/18|c.2177C>T|p.Thr726Ile|2263/5167|2177/2577|726/858||', 'T|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_001195799.2|protein_coding|14/17|c.2054C>T|p.Thr685Ile|2140/5050|2054/2460|685/819||', 'T|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_001195800.2|protein_coding|13/16|c.1673C>T|p.Thr558Ile|1759/4669|1673/2079|558/692||', 'T|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_001195803.2|protein_coding|13/16|c.1643C>T|p.Thr548Ile|1729/4639|1643/2049|548/682||', 'T|non_coding_transcript_exon_variant|MODIFIER|LDLR|LDLR|transcript|XR_001753685.2|pseudogene|15/18|n.2511C>T||||||', 'T|non_coding_transcript_exon_variant|MODIFIER|LDLR|LDLR|transcript|XR_001753686.2|pseudogene|14/17|n.2154C>T||||||')
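Each entry in the ANN tuples above is a pipe-delimited SnpEff annotation with 16 fields, which makes the per-transcript effects easier to compare once split into named values. A small sketch of such a parser follows; the function name `parse_ann` and the lower-case key names are my own simplifications of the field names in the SnpEff ANN specification, and the example string is the first NM_000527.5 entry for rs11669576 copied from above.

```python
# Field order follows the SnpEff ANN specification; key names are simplified.
ANN_FIELDS = [
    "allele", "annotation", "impact", "gene_name", "gene_id",
    "feature_type", "feature_id", "transcript_biotype", "rank",
    "hgvs_c", "hgvs_p", "cdna_pos_len", "cds_pos_len", "aa_pos_len",
    "distance", "messages",
]


def parse_ann(entry):
    """Split one pipe-delimited ANN entry into a dict keyed by field name."""
    values = entry.split("|")
    if len(values) != len(ANN_FIELDS):
        raise ValueError(
            f"expected {len(ANN_FIELDS)} fields, got {len(values)}"
        )
    return dict(zip(ANN_FIELDS, values))


example = (
    "A|missense_variant|MODERATE|LDLR|LDLR|transcript|NM_000527.5|"
    "protein_coding|8/18|c.1171G>A|p.Ala391Thr|1257/5173|1171/2583|391/860||"
)
record = parse_ann(example)
print(record["feature_id"], record["hgvs_c"], record["hgvs_p"])
```

Empty fields (for example the protein change of an intronic variant) come back as empty strings, so callers can test them directly.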
Lastly, this SNP was not detected by Nanopore.
rs2738442 (NM_000527.5:c.1060+7C>A): not detected by Nanopore sequencing
Operating System
Ubuntu 22.04
Other Linux
No response
Workflow Version
latest
Workflow Execution
Command line (Local)
Other workflow execution
No response
EPI2ME Version
No response
CLI command run
Workflow Execution - CLI Execution Profile
standard (default)
What happened?
I performed an analysis of the snp.vcf file and realized that, although the variants are correctly annotated, the SNPID is missing. This is particularly true for SNPIDs with higher numbers, so I assume that an outdated SNP reference database is used.
Relevant log output
Application activity log entry
Were you able to successfully run the latest version of the workflow with the demo data?
yes
Other demo data information