Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

Feature request: CSQ expansion #306

Closed matthdsm closed 5 years ago

matthdsm commented 5 years ago

Hi,

Would it be possible to add a flag that expands the CSQ field into separate INFO fields, as per the VCF standard? The CSQ tag can become quite confusing and hard to parse as the number of annotations rises. Separate fields would make downstream parsing easier.

I suppose this feature could be very useful for the community, as a quick Google search shows a lot of third-party packages aiming to do just that.

Thanks a lot.

Cheers M

ens-lgil commented 5 years ago

Dear @matthdsm,

I understand that the CSQ field is not the easiest to parse or read. However, I believe the CSQ field was designed like this in the VCF output for at least two reasons: 1) To keep the VEP annotations separate from the input VCF annotations (and also to avoid overwriting some common annotations). 2) A variant can overlap more than one transcript, so it would be very messy if the keys in the INFO field were duplicated.
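To illustrate the second reason, here is a minimal Python sketch of what could happen if per-transcript annotations were written as plain, repeated INFO keys. The flat layout, the key names `Feature` and `Consequence` as top-level INFO keys, and the helper `parse_info` are made up for illustration; this is not something VEP writes.

```python
def parse_info(info):
    """Naively parse a VCF INFO string into a dict of key -> value."""
    parsed = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        parsed[key] = value  # a repeated key silently overwrites the earlier value
    return parsed


# Hypothetical flat layout: one variant, two overlapping transcripts.
flat = (
    "DP=159;"
    "Feature=ENST00000251157;Consequence=synonymous_variant;"
    "Feature=ENST00000635088;Consequence=upstream_gene_variant"
)

parsed = parse_info(flat)
print(parsed)
```

With a dict-based parser like this one, only the annotations of the last transcript survive, which is the kind of mess that grouping everything under a single CSQ key avoids.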

Furthermore we also have other output formats:

However we will try to improve the VEP VCF output to make it easier to parse/read.

Best regards, Laurent

matthdsm commented 5 years ago

Hi Laurent,

Thanks for the reply. I'm aware of the different output options, but I'm using existing VCF files as source, so those are not an option for me. Do you have any suggestions for Python libraries to parse the CSQ per transcript? Any help would be greatly appreciated.

Thanks again. M

matthdsm commented 5 years ago

As for your remarks, I suppose

  1. VEP-annotated INFO fields could be prefixed with VEP_ or CSQ_ to avoid confusion and overwriting other INFO fields.
  2. Annotations for different transcripts could be delimited inside one field in a similar fashion as the CSQ field is now.

For example

INFO=

INFO=

CSQ_FEATURE="transcript1 | transcript2" CSQ_CANONICAL="true|false"
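A rough Python sketch of this proposal, assuming the field names have already been read from the CSQ header. The function `expand_csq`, the `CSQ_` prefix, the shortened field list, and the use of `.` for empty values are all illustrative choices, not an existing VEP option.

```python
def expand_csq(fields, csq_value, prefix="CSQ_"):
    """Expand a CSQ value into prefixed fields, one '|'-delimited value per transcript."""
    # One list of values per comma-separated transcript entry.
    entries = [entry.split("|") for entry in csq_value.split(",")]
    expanded = {}
    for index, name in enumerate(fields):
        # Empty values become "." so each position still maps to the same transcript.
        expanded[prefix + name] = "|".join(entry[index] or "." for entry in entries)
    return expanded


# Shortened field list and CSQ value, for illustration only.
fields = ["Allele", "Consequence", "Feature", "CANONICAL"]
csq_value = (
    "G|synonymous_variant|ENST00000251157|YES,"
    "G|upstream_gene_variant|ENST00000635088|"
)

expanded = expand_csq(fields, csq_value)
info = ";".join(f"{key}={value}" for key, value in expanded.items())
print(info)
```

The n-th value of every expanded field belongs to the n-th transcript, so the per-transcript grouping is kept by position rather than by one long string.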

Cheers M

matthdsm commented 5 years ago

Also, could you help me understand the data in the CSQ tag? My header seems all right, but I've got about 20x more values in the CSQ field than there are in the header.

example:

##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|Feature|BIOTYPE|EXON|INTRON|HGVSc|HGVSp|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|FLAGS|VARIANT_CLASS|SYMBOL_SOURCE|HGNC_ID|CANONICAL|TSL|APPRIS|CCDS|ENSP|SWISSPROT|TREMBL|UNIPARC|REFSEQ_MATCH|SOURCE|GIVEN_REF|USED_REF|GENE_PHENO|SIFT|PolyPhen|DOMAINS|HGVS_OFFSET|AF|AFR_AF|AMR_AF|EAS_AF|EUR_AF|SAS_AF|AA_AF|EA_AF|gnomAD_AF|gnomAD_AFR_AF|gnomAD_AMR_AF|gnomAD_ASJ_AF|gnomAD_EAS_AF|gnomAD_FIN_AF|gnomAD_NFE_AF|gnomAD_OTH_AF|gnomAD_SAS_AF|MAX_AF|MAX_AF_POPS|CLIN_SIG|SOMATIC|PHENO|PUBMED|MOTIF_NAME|MOTIF_POS|HIGH_INF_POS|MOTIF_SCORE_CHANGE|MaxEntScan_alt|MaxEntScan_diff|MaxEntScan_ref|SpliceRegion">
##VEP="v90" time="2018-10-11 18:15:54" cache="/home/galaxy/bcbio/genomes/Hsapiens/hg38/vep/homo_sapiens_merged/90_GRCh38" ensembl-io=90.9a148ea ensembl=90.4a44397 ensembl-funcgen=90.e775c00 ensembl-variation=90.e9e7027 1000genomes="phase3" COSMIC="81" ClinVar="201706" ESP="V2-SSA137" HGMD-PUBLIC="20164" assembly="GRCh38.p10" dbSNP="150" gencode="GENCODE 27" genebuild="2014-07" gnomAD="170228" polyphen="2.2.2" refseq="2016-08-03 11:43:09 - GCF_000001405.34_GRCh38.p8_genomic.gff" regbuild="16" sift="sift5.2.2"
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  Sample1        Sample2        Sample3
chr1    62494430        rs10889335      A       G       2056.9  PASS    AC=3;AF=0.5;AN=6;BaseQRankSum=-0.919;ClippingRankSum=0;DB;DP=159;ExcessHet=6.9897;FS=2.119;MLEAC=3;MLEAF=0.5;MQ=60;MQ0=0;MQRankSum=0;QD=13.02;ReadPosRankSum=0.087;SOR=0.511;CSQ=G|synonymous_variant|
LOW|DOCK7|ENSG00000116641|Transcript|ENST00000251157|protein_coding|40/50||ENST00000251157.10:c.5035T>C|ENSP00000251157.6:p.Leu1679%3D|5035/7084|5035/6396|1679/2131|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190|YES|5|A2|CCDS81338.1|ENSP00000251157|Q96N67||UPI0000EADF24|
|Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317:SF78&hmmpanther:PTHR23317&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|s
ynonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000340370|protein_coding|39/49||ENST00000340370.10:c.4969T>C|ENSP00000340742.5:p.Leu1657%3D|4969/6731|4969/6330|1657/2109|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||2|P3|CCDS30734.1|ENSP00000340742|Q96N67||UPI000044FEA9||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000454575|protein_coding|40/49||ENST00000454575.6:c.5035T>C|ENSP00000413583.2:p.Leu1679%3D|5046/6985|5035/6390|1679/2129|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||1|A2|CCDS60156.1|ENSP00000413583|Q96N67||UPI0000E45660||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317:SF78&hmmpanther:PTHR23317&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|non_coding_transcript_exon_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000479983|retained_intron|1/2||ENST00000479983.1:n.869T>C||869/1566|||||rs10889335||-1||SNV|HGNC|HGNC:19190||2||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000634264|protein_coding|39/49||ENST00000634264.1:c.4942T>C|ENSP00000489284.1:p.Leu1648%3D|4942/6303|4942/6303|1648/2100|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2|CCDS81336.1|ENSP00000489284|Q96N67||UPI0000EADF23||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0
.4592|gnomAD_SAS|||1|21347282||||||||,G|upstream_gene_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000635088|nonsense_mediated_decay||||||||||rs10889335|1670|-1|cds_start_NF|SNV|HGNC|HGNC:19190||5|||ENSP00000489412||A0A0U1RR97|UPI000719A171||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000635123|protein_coding|39/48||ENST00000635123.1:c.4942T>C|ENSP00000489499.1:p.Leu1648%3D|4942/6297|4942/6297|1648/2098|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2|CCDS81335.1|ENSP00000489499|Q96N67||UPI000022AE77||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000635253|protein_coding|40/50||ENST00000635253.1:c.5062T>C|ENSP00000489124.1:p.Leu1688%3D|5062/6423|5062/6423|1688/2140|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2||ENSP00000489124|Q96N67||UPI0000EADF22||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|non_coding_transcript_exon_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000635983|retained_intron|7/16||ENST00000635983.1:n.1472T>C||1472/6506|||||rs10889335||-1||SNV|HGNC|HGNC:19190||5||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|downstream_gene_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000637144|processed_
transcript||||||||||rs10889335|1417|-1||SNV|HGNC|HGNC:19190||5||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|3_prime_UTR_variant&NMD_transcript_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000637208|nonsense_mediated_decay|39/43||ENST00000637208.1:c.*3155T>C||5044/8283|||||rs10889335||-1||SNV|HGNC|HGNC:19190||5|||ENSP00000490079||A0A1B0GUE9|UPI0007E52CCB||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000637255|protein_coding|20/29||ENST00000637255.1:c.2335T>C|ENSP00000490888.1:p.Leu779%3D|2335/4071|2335/3690|779/1229|L|Ttg/Ctg|rs10889335||-1|cds_start_NF|SNV|HGNC|HGNC:19190||5|||ENSP00000490888||A0A1B0GWE0|UPI0007E52B47||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001271999.1|protein_coding|40/49||NM_001271999.1:c.5035T>C|NP_001258928.1:p.Leu1679%3D|5139/7182|5035/6390|1679/2129|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|YES||||NP_001258928.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001272000.1|protein_coding|39/49||NM_001272000.1:c.4942T>C|NP_001258929.1:p.Leu1648%3D|5046/7095|4942/6303|1648/2100|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_001258929.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337
|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001272001.1|protein_coding|39/48||NM_001272001.1:c.4942T>C|NP_001258930.1:p.Leu1648%3D|5046/7089|4942/6297|1648/2098|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_001258930.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_033407.3|protein_coding|39/49||NM_033407.3:c.4969T>C|NP_212132.2:p.Leu1657%3D|5073/7122|4969/6330|1657/2109|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_212132.2||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_005271292.2|protein_coding|40/50||XM_005271292.2:c.5035T>C|XP_005271349.1:p.Leu1679%3D|5129/6893|5035/6396|1679/2131|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_005271349.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542326.2|protein_coding|40/50||XM_011542326.2:c.5062T>C|XP_011540628.1:p.Leu1688%3D|5156/6920|5062/6423|1688/2140|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540628.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542327.2|protein_coding|40/49||XM_011542327.2:c.5062T>C|XP_011540629.1:p.Leu1688%3D|5156/6914|5062/6417|1688/2138|L|Ttg/Ctg|rs108893
35||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540629.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542328.2|protein_coding|40/49||XM_011542328.2:c.5062T>C|XP_011540630.1:p.Leu1688%3D|5156/6905|5062/6408|1688/2135|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540630.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542330.2|protein_coding|40/44||XM_011542330.2:c.5062T>C|XP_011540632.1:p.Leu1688%3D|5156/5696|5062/5535|1688/1844|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540632.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_017002639.1|protein_coding|39/48||XM_017002639.1:c.4969T>C|XP_016858128.1:p.Leu1657%3D|5063/6821|4969/6324|1657/2107|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_016858128.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_017002640.1|protein_coding|40/44||XM_017002640.1:c.5062T>C|XP_016858129.1:p.Leu1688%3D|5156/6147|5062/5592|1688/1863|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_016858129.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||;an_EOG=1900;ac_EOG=A:1279&G:621
;gc_hom_ref_EOG=436;gc_het_alt_EOG=407;gc_hom_alt_EOG=107     GT:AD:DP:GQ:PL  0/1:30,32:62:99:828,0,851       0/1:40,29:69:99:793,0,1068      0/1:10,17:27:99:466,0,360

Thanks again. M

ens-lgil commented 5 years ago

Dear @matthdsm,

Regarding your result: the CSQ field is that long because you have 23 distinct predictions for rs10889335 (Ensembl transcripts plus RefSeq transcripts, as you are using the merged dataset). You can get a glimpse of it on this page on the Ensembl website (which lacks the RefSeq data).

The CSQ field contains one set of annotations for each of the 23 predictions, separated by commas:

G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000340370|protein_coding|39/49||ENST00000340370.10:c.4969T>C|ENSP00000340742.5:p.Leu1657%3D|4969/6731|4969/6330|1657/2109|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||2|P3|CCDS30734.1|ENSP00000340742|Q96N67||UPI000044FEA9||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000454575|protein_coding|40/49||ENST00000454575.6:c.5035T>C|ENSP00000413583.2:p.Leu1679%3D|5046/6985|5035/6390|1679/2129|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||1|A2|CCDS60156.1|ENSP00000413583|Q96N67||UPI0000E45660||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317:SF78&hmmpanther:PTHR23317&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|non_coding_transcript_exon_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000479983|retained_intron|1/2||ENST00000479983.1:n.869T>C||869/1566|||||rs10889335||-1||SNV|HGNC|HGNC:19190||2||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000634264|protein_coding|39/49||ENST00000634264.1:c.4942T>C|ENSP00000489284.1:p.Leu1648%3D|4942/6303|4942/6303|1648/2100|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2|CCDS81336.1|ENSP00000489284|Q96N67||UPI0000EADF23||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|upstream_gene_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000635088|nonsense_mediated_decay||||||||||rs10889335|1670|-1|cds_start_NF|SNV|HGNC|HGNC:19190||5|||ENSP00000489412||A0A0U1RR97|UPI000719A171||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000635123|protein_coding|39/48||ENST00000635123.1:c.4942T>C|ENSP00000489499.1:p.Leu1648%3D|4942/6297|4942/6297|1648/2098|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2|CCDS81335.1|ENSP00000489499|Q96N67||UPI000022AE77||Ensembl|A|A|1|||PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78&Pfam_domain:PF06920||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000635253|protein_coding|40/50||ENST00000635253.1:c.5062T>C|ENSP00000489124.1:p.Leu1688%3D|5062/6423|5062/6423|1688/2140|L|Ttg/Ctg|rs10889335||-1||SNV|HGNC|HGNC:19190||5|A2||ENSP00000489124|Q96N67||UPI0000EADF22||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|non_coding_transcript_exon_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000635983|retained_intron|7/16||ENST00000635983.1:n.1472T>C||1472/6506|||||rs10889335||-1||SNV|HGNC|HGNC:19190||5||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|downstream_gene_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000637144|processed_transcript||||||||||rs10889335|1417|-1||SNV|HGNC|HGNC:19190||5||||||||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|3_prime_UTR_variant&NMD_transcript_variant|MODIFIER|DOCK7|ENSG00000116641|Transcript|ENST00000637208|nonsense_mediated_decay|39/43||ENST00000637208.1:c.*3155T>C||5044/8283|||||rs10889335||-1||SNV|HGNC|HGNC:19190||5|||ENSP00000490079||A0A1B0GUE9|UPI0007E52CCB||Ensembl|A|A|1|||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|ENSG00000116641|Transcript|ENST00000637255|protein_coding|20/29||ENST00000637255.1:c.2335T>C|ENSP00000490888.1:p.Leu779%3D|2335/4071|2335/3690|779/1229|L|Ttg/Ctg|rs10889335||-1|cds_start_NF|SNV|HGNC|HGNC:19190||5|||ENSP00000490888||A0A1B0GWE0|UPI0007E52B47||Ensembl|A|A|1|||Pfam_domain:PF06920&PROSITE_profiles:PS51651&hmmpanther:PTHR23317&hmmpanther:PTHR23317:SF78||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001271999.1|protein_coding|40/49||NM_001271999.1:c.5035T>C|NP_001258928.1:p.Leu1679%3D|5139/7182|5035/6390|1679/2129|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|YES||||NP_001258928.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001272000.1|protein_coding|39/49||NM_001272000.1:c.4942T>C|NP_001258929.1:p.Leu1648%3D|5046/7095|4942/6303|1648/2100|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_001258929.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_001272001.1|protein_coding|39/48||NM_001272001.1:c.4942T>C|NP_001258930.1:p.Leu1648%3D|5046/7089|4942/6297|1648/2098|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_001258930.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|NM_033407.3|protein_coding|39/49||NM_033407.3:c.4969T>C|NP_212132.2:p.Leu1657%3D|5073/7122|4969/6330|1657/2109|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||NP_212132.2||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_005271292.2|protein_coding|40/50||XM_005271292.2:c.5035T>C|XP_005271349.1:p.Leu1679%3D|5129/6893|5035/6396|1679/2131|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_005271349.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542326.2|protein_coding|40/50||XM_011542326.2:c.5062T>C|XP_011540628.1:p.Leu1688%3D|5156/6920|5062/6423|1688/2140|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540628.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542327.2|protein_coding|40/49||XM_011542327.2:c.5062T>C|XP_011540629.1:p.Leu1688%3D|5156/6914|5062/6417|1688/2138|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540629.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542328.2|protein_coding|40/49||XM_011542328.2:c.5062T>C|XP_011540630.1:p.Leu1688%3D|5156/6905|5062/6408|1688/2135|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540630.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_011542330.2|protein_coding|40/44||XM_011542330.2:c.5062T>C|XP_011540632.1:p.Leu1688%3D|5156/5696|5062/5535|1688/1844|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_011540632.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_017002639.1|protein_coding|39/48||XM_017002639.1:c.4969T>C|XP_016858128.1:p.Leu1657%3D|5063/6821|4969/6324|1657/2107|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_016858128.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||,
G|synonymous_variant|LOW|DOCK7|85440|Transcript|XM_017002640.1|protein_coding|40/44||XM_017002640.1:c.5062T>C|XP_016858129.1:p.Leu1688%3D|5156/6147|5062/5592|1688/1863|L|Ttg/Ctg|rs10889335||-1||SNV|EntrezGene|HGNC:19190|||||XP_016858129.1||||rseq_mrna_match|RefSeq|A|A||||||0.3496|0.4228|0.3573|0.1815|0.3171|0.4519|0.3917|0.3337|0.3442|0.4006|0.393|0.2334|0.1952|0.2831|0.3366|0.3142|0.4592|0.4592|gnomAD_SAS|||1|21347282||||||||;an_EOG=1900;ac_EOG=A:1279&G:621;gc_hom_ref_EOG=436;gc_het_alt_EOG=407;gc_hom_alt_EOG=107
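For parsing this in Python, the layout above can be handled with plain string splitting: read the field names from the `Format:` part of the CSQ header, split the CSQ value on commas to get one entry per prediction, then split each entry on `|`. The sketch below uses only the standard library; the helper names `csq_fields` and `parse_csq` and the shortened header and INFO string are made up for illustration.

```python
import re


def csq_fields(header_line):
    """Extract the field names from the 'Format: a|b|c' part of the CSQ header."""
    match = re.search(r'Format: ([^"]+)"', header_line)
    if match is None:
        raise ValueError("no 'Format: ...' description found in the CSQ header")
    return match.group(1).split("|")


def parse_csq(info, fields):
    """Return one dict per comma-separated CSQ entry found in a VCF INFO string."""
    for item in info.split(";"):
        if item.startswith("CSQ="):
            csq_value = item[len("CSQ="):]
            break
    else:
        return []  # no CSQ key in this record
    records = []
    for entry in csq_value.split(","):
        values = entry.split("|")
        if len(values) != len(fields):
            raise ValueError(f"expected {len(fields)} values, got {len(values)}")
        records.append(dict(zip(fields, values)))
    return records


# Shortened header and INFO string, for illustration only.
header = (
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Feature|CANONICAL">'
)
info = (
    "AC=3;DP=159;"
    "CSQ=G|synonymous_variant|LOW|DOCK7|ENST00000251157|YES,"
    "G|upstream_gene_variant|MODIFIER|DOCK7|ENST00000635088|;"
    "an_EOG=1900"
)

fields = csq_fields(header)
records = parse_csq(info, fields)
for record in records:
    print(record["Feature"], record["Consequence"], record["CANONICAL"] or ".")
```

The total number of `|`-separated values is the number of header fields multiplied by the number of predictions, which is why the CSQ value looks so much longer than the header.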

I hope this makes sense.

Concerning your earlier comments:

Best regards, Laurent

matthdsm commented 5 years ago

Oooh, now I get it! The CSQ tag doesn't seem to repeat the allele per transcript! That's what was messing up my parsing.

Allrighty, got it now. Thanks for the help! M

PS: I'd appreciate it if you could take up my proposal with the dev team. I think this would make life a whole lot easier for a lot of people.