Closed matthdsm closed 5 years ago
Dear @matthdsm,
I understand that the CSQ field is not the easiest to parse/read; however, I believe the CSQ field was designed like this in the VCF output for at least 2 reasons: 1) To keep the VEP annotations separated from the input VCF annotations (and also to avoid overwriting some common annotations). 2) A variant can overlap more than one transcript, so it would be very messy if the keys in the INFO field were duplicated.
Furthermore we also have other output formats:
However we will try to improve the VEP VCF output to make it easier to parse/read.
Best regards, Laurent
Hi Laurent,
Thanks for the reply. I'm aware of the different output options, but I'm using existing VCF files as source, so those are not an option for me. Do you have any suggestions for Python libraries to parse the CSQ per transcript? Any help would be greatly appreciated.
Thanks again. M
As for your remarks, I suppose the new INFO keys could be prefixed with VEP_ or CSQ_ to avoid confusion and overwriting other INFO fields. For example:
CSQ_FEATURE="transcript1 | transcript2" CSQ_CANONICAL="true|false"
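The proposal above could be sketched roughly as follows. This is only an illustration of the idea, not something VEP provides: the function name `expand_csq`, the `CSQ_` prefix default and the `|` separator between transcripts are assumptions made for the example. It keeps the field names from the CSQ header and VEP's own values (for example `YES` or empty for CANONICAL) rather than the upper-case keys and true/false values used in the example line.

```python
def expand_csq(records, prefix="CSQ_", sep="|"):
    """Pivot per-transcript annotation dicts into one INFO-style entry per field.

    `records` is a list of dicts, one per transcript, all sharing the same keys.
    Empty values are kept so that position i in every entry still refers to
    transcript i.
    """
    if not records:
        return []
    entries = []
    for name in records[0]:
        values = [rec.get(name, "") for rec in records]
        entries.append(f"{prefix}{name}={sep.join(values)}")
    return entries


# Two transcripts taken from the example record in this thread.
records = [
    {"Feature": "ENST00000251157", "CANONICAL": "YES"},
    {"Feature": "ENST00000340370", "CANONICAL": ""},
]
info_suffix = ";".join(expand_csq(records))
print(info_suffix)
# CSQ_Feature=ENST00000251157|ENST00000340370;CSQ_CANONICAL=YES|
```

Keeping the empty values matters here: dropping them would shift the remaining values and break the link between a value and its transcript.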
Cheers M
Also, could you help me understand the data in the CSQ tag? My header seems all right, but I've got about 20x more values in the CSQ field than there are in the header.
example:
##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|Feature|BIOTYPE|EXON|INTRON|HGVSc|HGVSp|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|FLAGS|VARIANT_CLASS|SYMBOL_SOURCE|HGNC_ID|CANONICAL|TSL|APPRIS|CCDS|ENSP|SWISSPROT|TREMBL|UNIPARC|REFSEQ_MATCH|SOURCE|GIVEN_REF|USED_REF|GENE_PHENO|SIFT|PolyPhen|DOMAINS|HGVS_OFFSET|AF|AFR_AF|AMR_AF|EAS_AF|EUR_AF|SAS_AF|AA_AF|EA_AF|gnomAD_AF|gnomAD_AFR_AF|gnomAD_AMR_AF|gnomAD_ASJ_AF|gnomAD_EAS_AF|gnomAD_FIN_AF|gnomAD_NFE_AF|gnomAD_OTH_AF|gnomAD_SAS_AF|MAX_AF|MAX_AF_POPS|CLIN_SIG|SOMATIC|PHENO|PUBMED|MOTIF_NAME|MOTIF_POS|HIGH_INF_POS|MOTIF_SCORE_CHANGE|MaxEntScan_alt|MaxEntScan_diff|MaxEntScan_ref|SpliceRegion">
##VEP="v90" time="2018-10-11 18:15:54" cache="/home/galaxy/bcbio/genomes/Hsapiens/hg38/vep/homo_sapiens_merged/90_GRCh38" ensembl-io=90.9a148ea ensembl=90.4a44397 ensembl-funcgen=90.e775c00 ensembl-variation=90.e9e7027 1000genomes="phase3" COSMIC="81" ClinVar="201706" ESP="V2-SSA137" HGMD-PUBLIC="20164" assembly="GRCh38.p10" dbSNP="150" gencode="GENCODE 27" genebuild="2014-07" gnomAD="170228" polyphen="2.2.2" refseq="2016-08-03 11:43:09 - GCF_000001405.34_GRCh38.p8_genomic.gff" regbuild="16" sift="sift5.2.2"
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1 Sample2 Sample3
chr1 62494430 rs10889335 A G 2056.9 PASS AC=3;AF=0.5;AN=6;BaseQRankSum=-0.919;ClippingRankSum=0;DB;DP=159;ExcessHet=6.9897;FS=2.119;MLEAC=3;MLEAF=0.5;MQ=60;MQ0=0;MQRankSum=0;QD=13.02;ReadPosRankSum=0.087;SOR=0.511;CSQ=G|synonymous_variant|
LOW|DOCK7|ENSG00000116641|Transcript|ENST00000251157|protein_coding|40/50||ENST00000251157.10:c.5035T>C|ENSP00000251157.6:p.Leu1679%3D|5035/7084|5035/6396|1679/2131|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190|YES|5|A2|CCDS81338.1|ENSP00000251157|Q96N67||UPI0000EADF24|
|Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317:SF78&hmmpanther:PTHR23317&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|s
ynonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000340370|protein_coding|39/49||ENST00000340370.10:c.4969T>C|ENSP00000340742.5:p.Leu1657%3D|4969/6731|4969/6330|1657/2109|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||2|P3|CCDS30734.1|ENSP00000340742|Q96N67||UPI000044FEA9||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000454575|protein_coding|40/49||ENST00000454575.6:c.5035T>C|ENSP00000413583.2:p.Leu1679%3D|5046/6985|5035/6390|1679/2129|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||1|A2|CCDS60156.1|ENSP00000413583|Q96N67||UPI0000E45660||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317:SF78&hmmpanther:PTHR23317&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|non_coding_transcript_exon_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000479983|retained_intron|1/2||ENST00000479983.1:n.869T>C||869/1566|||||rs10889335||-1||SNV|HGNC|HGNC:19190||2||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000634264|protein_coding|39/49||ENST00000634264.1:c.4942T>C|ENSP00000489284.1:p.Leu1648%3D|4942/6303|4942/6303|1648/2100|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2|CCDS81336.1|ENSP00000489284|Q96N67||UPI0000EADF23||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0
.4592|gnomAD_SAS|||1|21347282||||||||,G|upstream_gene_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000635088|nonsense_mediated_decay||||||||||rs10889335|1670|-1|cds_start_NF|SNV|HGNC|HGNC:19190||5|||ENSP00000489412||A0A0U1RR97|UPI000719A171||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000635123|protein_coding|39/48||ENST00000635123.1:c.4942T>C|ENSP00000489499.1:p.Leu1648%3D|4942/6297|4942/6297|1648/2098|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2|CCDS81335.1|ENSP00000489499|Q96N67||UPI000022AE77||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000635253|protein_coding|40/50||ENST00000635253.1:c.5062T>C|ENSP00000489124.1:p.Leu1688%3D|5062/6423|5062/6423|1688/2140|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2||ENSP00000489124|Q96N67||UPI0000EADF22||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|non_coding_transcript_exon_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000635983|retained_intron|7/16||ENST00000635983.1:n.1472T>C||1472/6506|||||rs10889335||-1||SNV|HGNC|HGNC:19190||5||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|downstream_gene_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000637144|processed_
transcript||||||||||rs10889335|1417|-1||SNV|HGNC|HGNC:19190||5||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|3_prime_UTR_variant&NMD_transcript_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000637208|nonsense_mediated_decay|39/43||ENST00000637208.1:c.*3155T>C||5044/8283|||||rs10889335||-1||SNV|HGNC|HGNC:19190||5|||ENSP00000490079||A0A1B0GUE9|UPI0007E52CCB||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000637255|protein_coding|20/29||ENST00000637255.1:c.2335T>C|ENSP00000490888.1:p.Leu779%3D|2335/4071|2335/3690|779/1229|L|Ttg/Ctg|rs10889335||-1|cds_start_NF|SNV|HGNC|HGNC:19190||5|||ENSP00000490888||A0A1B0GWE0|UPI0007E52B47||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001271999.1|protein_coding|40/49||NM_001271999.1:c.5035T>C|NP_001258928.1:p.Leu1679%3D|5139/7182|5035/6390|1679/2129|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|YES||||NP_001258928.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001272000.1|protein_coding|39/49||NM_001272000.1:c.4942T>C|NP_001258929.1:p.Leu1648%3D|5046/7095|4942/6303|1648/2100|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_001258929.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337
|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001272001.1|protein_coding|39/48||NM_001272001.1:c.4942T>C|NP_001258930.1:p.Leu1648%3D|5046/7089|4942/6297|1648/2098|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_001258930.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_033407.3|protein_coding|39/49||NM_033407.3:c.4969T>C|NP_212132.2:p.Leu1657%3D|5073/7122|4969/6330|1657/2109|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_212132.2||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_005271292.2|protein_coding|40/50||XM_005271292.2:c.5035T>C|XP_005271349.1:p.Leu1679%3D|5129/6893|5035/6396|1679/2131|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_005271349.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542326.2|protein_coding|40/50||XM_011542326.2:c.5062T>C|XP_011540628.1:p.Leu1688%3D|5156/6920|5062/6423|1688/2140|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540628.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542327.2|protein_coding|40/49||XM_011542327.2:c.5062T>C|XP_011540629.1:p.Leu1688%3D|5156/6914|5062/6417|1688/2138|L|Ttg/Ctg|rs108893
35||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540629.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542328.2|protein_coding|40/49||XM_011542328.2:c.5062T>C|XP_011540630.1:p.Leu1688%3D|5156/6905|5062/6408|1688/2135|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540630.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542330.2|protein_coding|40/44||XM_011542330.2:c.5062T>C|XP_011540632.1:p.Leu1688%3D|5156/5696|5062/5535|1688/1844|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540632.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_017002639.1|protein_coding|39/48||XM_017002639.1:c.4969T>C|XP_016858128.1:p.Leu1657%3D|5063/6821|4969/6324|1657/2107|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_016858128.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_017002640.1|protein_coding|40/44||XM_017002640.1:c.5062T>C|XP_016858129.1:p.Leu1688%3D|5156/6147|5062/5592|1688/1863|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_016858129.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||;an_EOG=1900;ac_EOG=A:1279&G:621
;gc_hom_ref_EOG=436;gc_het_alt_EOG=407;gc_hom_alt_EOG=107 GT:AD:DP:GQ:PL 0/1:30,32:62:99:828,0,851 0/1:40,29:69:99:793,0,1068 0/1:10,17:27:99:466,0,360
Thanks again. M
Dear @matthdsm,
About your result, this is because you have 23 distinct predictions for rs10889335
(Ensembl transcripts + RefSeq transcripts as you are using the merged dataset).
You can have a glimpse of it on this page on the Ensembl website (missing the RefSeq data).
The CSQ field contains annotations on the 23 predictions (separated by a comma):
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000340370|protein_coding|39/49||ENST00000340370.10:c.4969T>C|ENSP00000340742.5:p.Leu1657%3D|4969/6731|4969/6330|1657/2109|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||2|P3|CCDS30734.1|ENSP00000340742|Q96N67||UPI000044FEA9||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000454575|protein_coding|40/49||ENST00000454575.6:c.5035T>C|ENSP00000413583.2:p.Leu1679%3D|5046/6985|5035/6390|1679/2129|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||1|A2|CCDS60156.1|ENSP00000413583|Q96N67||UPI0000E45660||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317:SF78&hmmpanther:PTHR23317&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|non_coding_transcript_exon_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000479983|retained_intron|1/2||ENST00000479983.1:n.869T>C||869/1566|||||rs10889335||-1||SNV|HGNC|HGNC:19190||2||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000634264|protein_coding|39/49||ENST00000634264.1:c.4942T>C|ENSP00000489284.1:p.Leu1648%3D|4942/6303|4942/6303|1648/2100|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2|CCDS81336.1|ENSP00000489284|Q96N67||UPI0000EADF23||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|upstream_gene_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000635088|nonsense_mediated_decay||||||||||rs10889335|1670|-1|cds_start_NF|SNV|HGNC|HGNC:19190||5|||ENSP00000489412||A0A0U1RR97|UPI000719A171||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000635123|protein_coding|39/48||ENST00000635123.1:c.4942T>C|ENSP00000489499.1:p.Leu1648%3D|4942/6297|4942/6297|1648/2098|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2|CCDS81335.1|ENSP00000489499|Q96N67||UPI000022AE77||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000635253|protein_coding|40/50||ENST00000635253.1:c.5062T>C|ENSP00000489124.1:p.Leu1688%3D|5062/6423|5062/6423|1688/2140|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2||ENSP00000489124|Q96N67||UPI0000EADF22||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|non_coding_transcript_exon_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000635983|retained_intron|7/16||ENST00000635983.1:n.1472T>C||1472/6506|||||rs10889335||-1||SNV|HGNC|HGNC:19190||5||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|downstream_gene_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000637144|processed_transcript||||||||||rs10889335|1417|-1||SNV|HGNC|HGNC:19190||5||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|3_prime_UTR_variant&NMD_transcript_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000637208|nonsense_mediated_decay|39/43||ENST00000637208.1:c.*3155T>C||5044/8283|||||rs10889335||-1||SNV|HGNC|HGNC:19190||5|||ENSP00000490079||A0A1B0GUE9|UPI0007E52CCB||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000637255|protein_coding|20/29||ENST00000637255.1:c.2335T>C|ENSP00000490888.1:p.Leu779%3D|2335/4071|2335/3690|779/1229|L|Ttg/Ctg|rs10889335||-1|cds_start_NF|SNV|HGNC|HGNC:19190||5|||ENSP00000490888||A0A1B0GWE0|UPI0007E52B47||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001271999.1|protein_coding|40/49||NM_001271999.1:c.5035T>C|NP_001258928.1:p.Leu1679%3D|5139/7182|5035/6390|1679/2129|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|YES||||NP_001258928.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001272000.1|protein_coding|39/49||NM_001272000.1:c.4942T>C|NP_001258929.1:p.Leu1648%3D|5046/7095|4942/6303|1648/2100|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_001258929.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001272001.1|protein_coding|39/48||NM_001272001.1:c.4942T>C|NP_001258930.1:p.Leu1648%3D|5046/7089|4942/6297|1648/2098|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_001258930.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_033407.3|protein_coding|39/49||NM_033407.3:c.4969T>C|NP_212132.2:p.Leu1657%3D|5073/7122|4969/6330|1657/2109|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_212132.2||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_005271292.2|protein_coding|40/50||XM_005271292.2:c.5035T>C|XP_005271349.1:p.Leu1679%3D|5129/6893|5035/6396|1679/2131|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_005271349.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542326.2|protein_coding|40/50||XM_011542326.2:c.5062T>C|XP_011540628.1:p.Leu1688%3D|5156/6920|5062/6423|1688/2140|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540628.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542327.2|protein_coding|40/49||XM_011542327.2:c.5062T>C|XP_011540629.1:p.Leu1688%3D|5156/6914|5062/6417|1688/2138|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540629.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542328.2|protein_coding|40/49||XM_011542328.2:c.5062T>C|XP_011540630.1:p.Leu1688%3D|5156/6905|5062/6408|1688/2135|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540630.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542330.2|protein_coding|40/44||XM_011542330.2:c.5062T>C|XP_011540632.1:p.Leu1688%3D|5156/5696|5062/5535|1688/1844|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540632.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_017002639.1|protein_coding|39/48||XM_017002639.1:c.4969T>C|XP_016858128.1:p.Leu1657%3D|5063/6821|4969/6324|1657/2107|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_016858128.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_017002640.1|protein_coding|40/44||XM_017002640.1:c.5062T>C|XP_016858129.1:p.Leu1688%3D|5156/6147|5062/5592|1688/1863|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_016858129.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||;an_EOG=1900;ac_EOG=A:1279&G:621;gc_hom_ref_EOG=436;gc_het_alt_EOG=407;gc_hom_alt_EOG=107
I hope this makes sense.
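For parsing this per transcript in Python, a minimal sketch using only the standard library could look like the following. The function names (`csq_field_names`, `parse_csq`) and the shortened header and INFO strings are made up for the illustration. It assumes that INFO entries are separated by `;`, that predictions inside CSQ are separated by `,`, and that fields inside one prediction are separated by `|`, as in the record above.

```python
import re


def csq_field_names(header_line):
    """Extract the pipe-separated field names from the ##INFO=<ID=CSQ,...> header line."""
    match = re.search(r'Format: ([^"]+)"', header_line)
    if match is None:
        raise ValueError("no 'Format: ...' section found in the CSQ header line")
    return match.group(1).split("|")


def parse_csq(info, field_names):
    """Return one dict per prediction found in the CSQ entry of an INFO string."""
    for entry in info.split(";"):
        if entry.startswith("CSQ="):
            csq_value = entry[len("CSQ="):]
            break
    else:
        return []  # this record has no CSQ annotation

    records = []
    for block in csq_value.split(","):  # one block per prediction
        values = block.split("|")  # one value per header field
        if len(values) != len(field_names):
            raise ValueError(
                f"expected {len(field_names)} values, found {len(values)}"
            )
        records.append(dict(zip(field_names, values)))
    return records


# Shortened, hypothetical header and INFO strings (4 fields instead of the full list).
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
          'from Ensembl VEP. Format: Allele|Consequence|SYMBOL|Feature">')
info = ("AC=3;DP=159;CSQ=G|synonymous_variant|DOCK7|ENST00000251157,"
        "G|upstream_gene_variant|DOCK7|ENST00000635088;an_EOG=1900")

fields = csq_field_names(header)
records = parse_csq(info, fields)
for rec in records:
    print(rec["Feature"], rec["Consequence"])
```

The length check is there because a mismatch between the header and a block usually means the field was split on the wrong separator, which is the kind of miscount described earlier in this thread.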
Concerning your earlier comments:
Best regards, Laurent
Oooh, now I get it! The CSQ tag doesn't seem to repeat the allele per transcript! That's what was messing up my parsing.
Allrighty, got it now. Thanks for the help! M
PS: I'd appreciate it if you could take up my proposal with the dev team. I think this would make life a whole lot easier for a lot of people.
Hi,
Would it be possible to add a flag which expands the CSQ field in separate INFO fields as per the VCF standard? The CSQ tag can become quite confusing and hard to parse as the number of annotations rises. This would make for easier parsing downstream.
I suppose this is a feature that could be very useful for the community, as a quick Google search shows a lot of third-party packages aiming to do just that.
Thanks a lot.
Cheers M