TheJacksonLaboratory / LIRICAL

LIkelihood Ratio Interpretation of Clinical AbnormaLities
https://thejacksonlaboratory.github.io/LIRICAL/stable

Vcf2GenotypeMap gives error "Could not identify gene" #528

Closed oleraj closed 4 years ago

oleraj commented 4 years ago

Hi,

When I run LIRICAL I get a "Could not identify gene" error, repeated for every variant. See snippet below:

AX747861[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709105, ref='T', alt='C', geneSymbol='KCNJ1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6297412A>G', hgvsCdna='c.1034A>G', hgvsProtein='p.(Asp345Gly)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6297412A>G', hgvsCdna='c.1091A>G', hgvsProtein='p.(Asp364Gly)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6297412A>G', hgvsCdna='c.1034A>G', hgvsProtein='p.(Asp345Gly)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6297412A>G', hgvsCdna='c.1085A>G', hgvsProtein='p.(Asp362Gly)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6297412A>G', hgvsCdna='c.1034A>G', hgvsProtein='p.(Asp345Gly)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6297412A>G', hgvsCdna='c.1034A>G', hgvsProtein='p.(Asp345Gly)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709126, ref='A', alt='G', geneSymbol='KCNJ1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6297391T>C', hgvsCdna='c.1013T>C', hgvsProtein='p.(Met338Thr)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6297391T>C', hgvsCdna='c.1070T>C', hgvsProtein='p.(Met357Thr)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6297391T>C', hgvsCdna='c.1013T>C', hgvsProtein='p.(Met338Thr)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6297391T>C', hgvsCdna='c.1064T>C', hgvsProtein='p.(Met355Thr)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6297391T>C', hgvsCdna='c.1013T>C', hgvsProtein='p.(Met338Thr)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6297391T>C', hgvsCdna='c.1013T>C', hgvsProtein='p.(Met338Thr)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709137, ref='A', alt='AG', geneSymbol='KCNJ1', geneId='', variantEffect=FRAMESHIFT_VARIANT, annotations=[TranscriptAnnotation{variantEffect=FRAMESHIFT_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6297379_6297380insC', hgvsCdna='c.1001dup', hgvsProtein='p.(His335Serfs*8)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6297379_6297380insC', hgvsCdna='c.1058dup', hgvsProtein='p.(His354Serfs*8)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6297379_6297380insC', hgvsCdna='c.1001dup', hgvsProtein='p.(His335Serfs*8)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6297379_6297380insC', hgvsCdna='c.1052dup', hgvsProtein='p.(His352Serfs*8)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6297379_6297380insC', hgvsCdna='c.1001dup', hgvsProtein='p.(His335Serfs*8)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6297379_6297380insC', hgvsCdna='c.1001dup', hgvsProtein='p.(His335Serfs*8)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709302, ref='A', alt='G', geneSymbol='KCNJ1', geneId='', variantEffect=SYNONYMOUS_VARIANT, annotations=[TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6297215T>C', hgvsCdna='c.837T>C', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6297215T>C', hgvsCdna='c.894T>C', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6297215T>C', hgvsCdna='c.837T>C', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6297215T>C', hgvsCdna='c.888T>C', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6297215T>C', hgvsCdna='c.837T>C', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6297215T>C', hgvsCdna='c.837T>C', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709341, ref='G', alt='T', geneSymbol='KCNJ1', geneId='', variantEffect=SYNONYMOUS_VARIANT, annotations=[TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6297176C>A', hgvsCdna='c.798C>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6297176C>A', hgvsCdna='c.855C>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6297176C>A', hgvsCdna='c.798C>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6297176C>A', hgvsCdna='c.849C>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6297176C>A', hgvsCdna='c.798C>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6297176C>A', hgvsCdna='c.798C>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709434, ref='G', alt='A', geneSymbol='KCNJ1', geneId='', variantEffect=SYNONYMOUS_VARIANT, annotations=[TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6297083C>T', hgvsCdna='c.705C>T', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6297083C>T', hgvsCdna='c.762C>T', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6297083C>T', hgvsCdna='c.705C>T', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6297083C>T', hgvsCdna='c.756C>T', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6297083C>T', hgvsCdna='c.705C>T', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6297083C>T', hgvsCdna='c.705C>T', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709493, ref='CTG', alt='C', geneSymbol='KCNJ1', geneId='', variantEffect=FRAMESHIFT_TRUNCATION, annotations=[TranscriptAnnotation{variantEffect=FRAMESHIFT_TRUNCATION, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6297022_6297023delCA', hgvsCdna='c.644_645del', hgvsProtein='p.(Thr215Serfs*4)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_TRUNCATION, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6297022_6297023delCA', hgvsCdna='c.701_702del', hgvsProtein='p.(Thr234Serfs*4)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_TRUNCATION, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6297022_6297023delCA', hgvsCdna='c.644_645del', hgvsProtein='p.(Thr215Serfs*4)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_TRUNCATION, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6297022_6297023delCA', hgvsCdna='c.695_696del', hgvsProtein='p.(Thr232Serfs*4)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_TRUNCATION, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6297022_6297023delCA', hgvsCdna='c.644_645del', hgvsProtein='p.(Thr215Serfs*4)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=FRAMESHIFT_TRUNCATION, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6297022_6297023delCA', hgvsCdna='c.644_645del', hgvsProtein='p.(Thr215Serfs*4)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709538, ref='G', alt='A', geneSymbol='KCNJ1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6296979C>T', hgvsCdna='c.601C>T', hgvsProtein='p.(Leu201Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6296979C>T', hgvsCdna='c.658C>T', hgvsProtein='p.(Leu220Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6296979C>T', hgvsCdna='c.601C>T', hgvsProtein='p.(Leu201Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6296979C>T', hgvsCdna='c.652C>T', hgvsProtein='p.(Leu218Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6296979C>T', hgvsCdna='c.601C>T', hgvsProtein='p.(Leu201Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6296979C>T', hgvsCdna='c.601C>T', hgvsProtein='p.(Leu201Phe)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709562, ref='G', alt='A', geneSymbol='KCNJ1', geneId='', variantEffect=STOP_GAINED, annotations=[TranscriptAnnotation{variantEffect=STOP_GAINED, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6296955C>T', hgvsCdna='c.577C>T', hgvsProtein='p.(Arg193*)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=STOP_GAINED, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6296955C>T', hgvsCdna='c.634C>T', hgvsProtein='p.(Arg212*)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=STOP_GAINED, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6296955C>T', hgvsCdna='c.577C>T', hgvsProtein='p.(Arg193*)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=STOP_GAINED, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6296955C>T', hgvsCdna='c.628C>T', hgvsProtein='p.(Arg210*)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=STOP_GAINED, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6296955C>T', hgvsCdna='c.577C>T', hgvsProtein='p.(Arg193*)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=STOP_GAINED, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6296955C>T', hgvsCdna='c.577C>T', hgvsProtein='p.(Arg193*)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709571, ref='G', alt='A', geneSymbol='KCNJ1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6296946C>T', hgvsCdna='c.568C>T', hgvsProtein='p.(Leu190Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6296946C>T', hgvsCdna='c.625C>T', hgvsProtein='p.(Leu209Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6296946C>T', hgvsCdna='c.568C>T', hgvsProtein='p.(Leu190Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6296946C>T', hgvsCdna='c.619C>T', hgvsProtein='p.(Leu207Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6296946C>T', hgvsCdna='c.568C>T', hgvsProtein='p.(Leu190Phe)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6296946C>T', hgvsCdna='c.568C>T', hgvsProtein='p.(Leu190Phe)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709618, ref='G', alt='A', geneSymbol='KCNJ1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6296899C>T', hgvsCdna='c.521C>T', hgvsProtein='p.(Thr174Met)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6296899C>T', hgvsCdna='c.578C>T', hgvsProtein='p.(Thr193Met)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6296899C>T', hgvsCdna='c.521C>T', hgvsProtein='p.(Thr174Met)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6296899C>T', hgvsCdna='c.572C>T', hgvsProtein='p.(Thr191Met)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6296899C>T', hgvsCdna='c.521C>T', hgvsProtein='p.(Thr174Met)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6296899C>T', hgvsCdna='c.521C>T', hgvsProtein='p.(Thr174Met)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709644, ref='C', alt='G', geneSymbol='KCNJ1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6296873G>C', hgvsCdna='c.495G>C', hgvsProtein='p.(Arg165Ser)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6296873G>C', hgvsCdna='c.552G>C', hgvsProtein='p.(Arg184Ser)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6296873G>C', hgvsCdna='c.495G>C', hgvsProtein='p.(Arg165Ser)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6296873G>C', hgvsCdna='c.546G>C', hgvsProtein='p.(Arg182Ser)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6296873G>C', hgvsCdna='c.495G>C', hgvsProtein='p.(Arg165Ser)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6296873G>C', hgvsCdna='c.495G>C', hgvsProtein='p.(Arg165Ser)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709887, ref='C', alt='T', geneSymbol='KCNJ1', geneId='', variantEffect=SYNONYMOUS_VARIANT, annotations=[TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6296630G>A', hgvsCdna='c.252G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6296630G>A', hgvsCdna='c.309G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6296630G>A', hgvsCdna='c.252G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6296630G>A', hgvsCdna='c.303G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6296630G>A', hgvsCdna='c.252G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6296630G>A', hgvsCdna='c.252G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128709940, ref='T', alt='C', geneSymbol='KCNJ1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6296577A>G', hgvsCdna='c.199A>G', hgvsProtein='p.(Thr67Ala)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6296577A>G', hgvsCdna='c.256A>G', hgvsProtein='p.(Thr86Ala)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6296577A>G', hgvsCdna='c.199A>G', hgvsProtein='p.(Thr67Ala)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6296577A>G', hgvsCdna='c.250A>G', hgvsProtein='p.(Thr84Ala)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6296577A>G', hgvsCdna='c.199A>G', hgvsProtein='p.(Thr67Ala)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6296577A>G', hgvsCdna='c.199A>G', hgvsProtein='p.(Thr67Ala)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128710094, ref='C', alt='T', geneSymbol='KCNJ1', geneId='', variantEffect=SYNONYMOUS_VARIANT, annotations=[TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6296423G>A', hgvsCdna='c.45G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6296423G>A', hgvsCdna='c.102G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6296423G>A', hgvsCdna='c.45G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6296423G>A', hgvsCdna='c.96G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6296423G>A', hgvsCdna='c.45G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6296423G>A', hgvsCdna='c.45G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "KCNJ1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128710122, ref='C', alt='T', geneSymbol='KCNJ1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc021qsb.1', hgvsGenomic='g.6296395G>A', hgvsCdna='c.17G>A', hgvsProtein='p.(Arg6Gln)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeo.2', hgvsGenomic='g.6296395G>A', hgvsCdna='c.74G>A', hgvsProtein='p.(Arg25Gln)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qep.2', hgvsGenomic='g.6296395G>A', hgvsCdna='c.17G>A', hgvsProtein='p.(Arg6Gln)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qeq.2', hgvsGenomic='g.6296395G>A', hgvsCdna='c.68G>A', hgvsProtein='p.(Arg23Gln)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qer.2', hgvsGenomic='g.6296395G>A', hgvsCdna='c.17G>A', hgvsProtein='p.(Arg6Gln)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='KCNJ1', accession='uc001qes.2', hgvsGenomic='g.6296395G>A', hgvsCdna='c.17G>A', hgvsProtein='p.(Arg6Gln)', distanceFromNearestGene=-2147483648}]}

KCNJ1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "TP53AIP1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128807470, ref='G', alt='T', geneSymbol='TP53AIP1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='TP53AIP1', accession='uc021qse.1', hgvsGenomic='g.6199047C>A', hgvsCdna='c.244C>A', hgvsProtein='p.(Gln82Lys)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='TP53AIP1', accession='uc001qex.3', hgvsGenomic='g.6199047C>A', hgvsCdna='c.244C>A', hgvsProtein='p.(Gln82Lys)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=CODING_TRANSCRIPT_INTRON_VARIANT, geneSymbol='TP53AIP1', accession='uc001qey.3', hgvsGenomic='g.6199047C>A', hgvsCdna='c.141+103C>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=CODING_TRANSCRIPT_INTRON_VARIANT, geneSymbol='TP53AIP1', accession='uc021qsc.1', hgvsGenomic='g.6199047C>A', hgvsCdna='c.141+103C>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=NON_CODING_TRANSCRIPT_EXON_VARIANT, geneSymbol='TP53AIP1', accession='uc009zcm.2', hgvsGenomic='g.6199047C>A', hgvsCdna='n.866C>A', hgvsProtein='', distanceFromNearestGene=-2147483648}]}

TP53AIP1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "TP53AIP1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128807494, ref='C', alt='T', geneSymbol='TP53AIP1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='TP53AIP1', accession='uc021qse.1', hgvsGenomic='g.6199023G>A', hgvsCdna='c.220G>A', hgvsProtein='p.(Asp74Asn)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='TP53AIP1', accession='uc001qex.3', hgvsGenomic='g.6199023G>A', hgvsCdna='c.220G>A', hgvsProtein='p.(Asp74Asn)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=CODING_TRANSCRIPT_INTRON_VARIANT, geneSymbol='TP53AIP1', accession='uc001qey.3', hgvsGenomic='g.6199023G>A', hgvsCdna='c.141+79G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=CODING_TRANSCRIPT_INTRON_VARIANT, geneSymbol='TP53AIP1', accession='uc021qsc.1', hgvsGenomic='g.6199023G>A', hgvsCdna='c.141+79G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=NON_CODING_TRANSCRIPT_EXON_VARIANT, geneSymbol='TP53AIP1', accession='uc009zcm.2', hgvsGenomic='g.6199023G>A', hgvsCdna='n.842G>A', hgvsProtein='', distanceFromNearestGene=-2147483648}]}

TP53AIP1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "TP53AIP1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128807550, ref='C', alt='G', geneSymbol='TP53AIP1', geneId='', variantEffect=MISSENSE_VARIANT, annotations=[TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='TP53AIP1', accession='uc021qse.1', hgvsGenomic='g.6198967G>C', hgvsCdna='c.164G>C', hgvsProtein='p.(Arg55Pro)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=MISSENSE_VARIANT, geneSymbol='TP53AIP1', accession='uc001qex.3', hgvsGenomic='g.6198967G>C', hgvsCdna='c.164G>C', hgvsProtein='p.(Arg55Pro)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=CODING_TRANSCRIPT_INTRON_VARIANT, geneSymbol='TP53AIP1', accession='uc001qey.3', hgvsGenomic='g.6198967G>C', hgvsCdna='c.141+23G>C', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=CODING_TRANSCRIPT_INTRON_VARIANT, geneSymbol='TP53AIP1', accession='uc021qsc.1', hgvsGenomic='g.6198967G>C', hgvsCdna='c.141+23G>C', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=NON_CODING_TRANSCRIPT_EXON_VARIANT, geneSymbol='TP53AIP1', accession='uc009zcm.2', hgvsGenomic='g.6198967G>C', hgvsCdna='n.786G>C', hgvsProtein='', distanceFromNearestGene=-2147483648}]}

TP53AIP1[ERROR] Vcf2GenotypeMap - Vcf2GenotypeMap.javaorg.monarchinitiative.lirical.analysis.Vcf2GenotypeMap.vcf2genotypeMap(Vcf2GenotypeMap.java:175) - Could not identify gene "" with symbol "TP53AIP1" for variant VariantAnnotation{genomeAssembly=hg19, chromosome=11, chromosomeName='11', position=128807606, ref='C', alt='T', geneSymbol='TP53AIP1', geneId='', variantEffect=SYNONYMOUS_VARIANT, annotations=[TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='TP53AIP1', accession='uc021qse.1', hgvsGenomic='g.6198911G>A', hgvsCdna='c.108G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='TP53AIP1', accession='uc001qex.3', hgvsGenomic='g.6198911G>A', hgvsCdna='c.108G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='TP53AIP1', accession='uc001qey.3', hgvsGenomic='g.6198911G>A', hgvsCdna='c.108G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=SYNONYMOUS_VARIANT, geneSymbol='TP53AIP1', accession='uc021qsc.1', hgvsGenomic='g.6198911G>A', hgvsCdna='c.108G>A', hgvsProtein='p.(=)', distanceFromNearestGene=-2147483648}, TranscriptAnnotation{variantEffect=NON_CODING_TRANSCRIPT_EXON_VARIANT, geneSymbol='TP53AIP1', accession='uc009zcm.2', hgvsGenomic='g.6198911G>A', hgvsCdna='n.730G>A', hgvsProtein='', distanceFromNearestGene=-2147483648}]}

I'm using the latest jar posted (v1.0.3), genomeAssembly hg19, and pointing to exomiser path exomiser/exomiser-cli-12.1.0/data/1902_hg19

It appears LIRICAL is able to annotate the variants and it's grabbing the geneSymbol but perhaps not the gene ID. Any idea what is going on? Do I need to point "exomiser" in the YML file to a different version of the database?

Thanks,

Andrew

oleraj commented 4 years ago

An update: LIRICAL does produce output, but these error messages are excessive and make it seem like it's not working properly.

pnrobinson commented 4 years ago

Thanks for the report. Can you provide the command line used to run LIRICAL? Are you using the very latest release (1.0.4)? We have fixed a number of minor issues in the last month or two.

pnrobinson commented 4 years ago

It does seem LIRICAL is skipping KCNJ1, which is not good. There seems to be some data ingest problem. Could you send me the full command line parameters, and I will get to the bottom of this.

oleraj commented 4 years ago

Hi @pnrobinson, thanks for the reply. I don't see version 1.0.4 in the releases, so I downloaded v1.0.3, which is the latest version I could see. If you have v1.0.4 ready I can try that as well.

The command line I used was simply

java -Xmx16G -jar jar/LIRICAL.jar yaml -y proband.yml

The yaml file contents:

analysis:
  genomeAssembly: hg19
  vcf: proband.vcf.gz
  exomiser: /data/exomiser/exomiser-cli-12.1.0/data/1902_hg19
hpoIds: ['HP:0001875', 'HP:0001888', 'HP:0001973', 'HP:0002014', 'HP:0002024', 'HP:0002650', 'HP:0003765', 'HP:0004389', 'HP:0006532', 'HP:0007131', 'HP:0011108', 'HP:0100825']
prefix: proband_lirical

There's a total of ~2200 gene symbols that had this error for this particular VCF I used so this seems to be a pretty large problem.
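
A count like this can be reproduced from a saved log. The sketch below is a hypothetical helper, not part of LIRICAL: it assumes the log lines contain the message text exactly as shown in the excerpt above, and the class and method names are made up for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not LIRICAL code): summarises the repeated error lines.
class MissingGeneIdLogSummary {
    // Matches the message text seen in the log excerpt: an empty gene ID
    // followed by the quoted gene symbol.
    private static final Pattern MISSING_ID =
            Pattern.compile("Could not identify gene \"\" with symbol \"([^\"]+)\"");

    /** Returns the distinct gene symbols reported without a gene ID, sorted. */
    static Set<String> symbolsWithoutGeneId(List<String> logLines) {
        Set<String> symbols = new TreeSet<>();
        for (String line : logLines) {
            Matcher m = MISSING_ID.matcher(line);
            if (m.find()) {
                symbols.add(m.group(1)); // one entry per symbol, however many variants
            }
        }
        return symbols;
    }
}
```

Feeding it the lines of a captured log (for example via `java.nio.file.Files.readAllLines`, with whatever file the output was redirected to) gives one entry per affected symbol rather than one per variant.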

oleraj commented 4 years ago

FYI, I also tried this with the 1909_hg19 version for the exomiser database and that gave the same result of ~2200 gene symbols not recognized.

pnrobinson commented 4 years ago

Thanks. I will try to reproduce this bug and fix it!

pnrobinson commented 4 years ago

I have narrowed down the source of these messages. LIRICAL uses transcript files from the Jannovar tool (https://github.com/charite/jannovar) to identify the NCBI Gene IDs for transcripts in which mutations have been found. Jannovar itself parses files from UCSC to obtain the links between its transcript models (e.g., uc002yqx.2), gene symbols, and Entrez IDs. Jannovar does not need the Entrez IDs, so it just reports the UCSC ID (e.g., uc002yqx.2) and the gene symbol, and returns an empty String for the gene ID if one was not available. I think this is probably partially related to the fact that hg19 is now pretty old and we should all be updating to hg38. I also think that the majority of affected transcripts are pseudogenes and the like, although some of them are well established. Here is a selection:

AX748283 = Sequence 1808 from Patent EP1308459
GNRHR2 = gonadotropin releasing hormone receptor 2 (pseudogene)
NAT8B = N-acetyltransferase 8B (putative, gene/pseudogene)
AK094599 = discontinued in gene (LOC285058, Gene ID: 285058, hypothetical gene supported by AK094599 [Homo sapiens (human)]), replaced by ZNF806 zinc finger protein 806
BC038530 = MYOSLID-AS1 MYOSLID antisense RNA, Gene ID: 101927865
AF055024 = ASB1 ankyrin repeat and SOCS box containing 1, Gene ID: 51665
ZNF852 = ZNF852 zinc finger protein 852, Gene ID: 285346
C3orf27 = LINC01565 long intergenic non-protein coding RNA 1565, Gene ID: 23434
RBMXL1 = NCBIGene:494115; this gene represents a retrogene of RNA binding motif protein, X-linked (RBMX), which is located on chromosome X
BC073807 = Homo sapiens cDNA clone IMAGE:2905386, partial cds
CPN2 = carboxypeptidase N subunit 2 [Homo sapiens (human)], Gene ID: 1370

I am going to investigate further, and also check whether there is an issue if hg38 is used. @julesjacobsen I am seeing that about 60% of these symbols are not in even the most recent gene-info file. The other 40% are, so I am wondering whether updating the Exomiser version of the ser file might mitigate this issue.

oleraj commented 4 years ago

Thanks. FWIW, one of the genes skipped is RAG1, which is the candidate gene for this proband. And RAG1 is ranked in the top 5 genes for this proband when I analyze it with Exomiser, which I thought also uses Jannovar, correct?

pnrobinson commented 4 years ago

LIRICAL and Exomiser are largely separate code bases, and Exomiser handles Jannovar in a different way, which is probably better in this case. I am now thinking that for hg19 it will be necessary to ingest additional Gene ID mappings, for instance from the Gene Info file. Thanks for your feedback, which is invaluable for figuring out issues like this. I will probably implement this by Thursday.
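
As a rough illustration of the fallback idea described here (not the fix that was eventually released), the sketch below builds a symbol-to-ID index from lines of an NCBI gene_info file and consults it only when the annotation carries an empty gene ID. It assumes the standard tab-separated gene_info layout (GeneID in the second column, Symbol in the third, pipe-separated Synonyms in the fifth); the class name, method names, and the choice to index synonyms are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not LIRICAL code: fallback lookup of gene IDs by symbol.
class GeneInfoSymbolIndex {

    /** Builds symbol -> "NCBIGene:<id>" from tab-separated gene_info lines. */
    static Map<String, String> symbolToGeneId(List<String> geneInfoLines) {
        Map<String, String> index = new HashMap<>();
        // First pass: current official symbols.
        for (String line : geneInfoLines) {
            String[] f = fields(line);
            if (f != null) {
                index.put(f[2], "NCBIGene:" + f[1]);
            }
        }
        // Second pass: synonyms (older symbols), never overriding an official symbol.
        for (String line : geneInfoLines) {
            String[] f = fields(line);
            if (f == null || f[4].equals("-")) {
                continue; // "-" marks an empty column in gene_info
            }
            for (String synonym : f[4].split("\\|")) {
                index.putIfAbsent(synonym, "NCBIGene:" + f[1]);
            }
        }
        return index;
    }

    /** Keeps the annotated gene ID if present, otherwise tries the symbol index. */
    static String resolve(String annotatedGeneId, String symbol, Map<String, String> index) {
        if (annotatedGeneId != null && !annotatedGeneId.isEmpty()) {
            return annotatedGeneId;
        }
        return index.getOrDefault(symbol, ""); // still empty if the symbol is unknown
    }

    // Returns the columns of a data line, or null for header, blank, or short lines.
    private static String[] fields(String line) {
        if (line.isEmpty() || line.startsWith("#")) {
            return null;
        }
        String[] f = line.split("\t");
        return f.length >= 5 ? f : null;
    }
}
```

Indexing synonyms is a judgment call: it would let an older symbol such as C3orf27 resolve to its current gene, but synonyms are not guaranteed to be unique, so a real implementation might prefer to leave ambiguous ones unmapped.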

pnrobinson commented 4 years ago

@oleraj I have figured out what went wrong -- we fixed one bug in our upstream library phenol but introduced this bug. I am fixing it now and will check using a phenopacket whether we can get RAG1. There are still a number of genes that are not assigned Gene IDs. These genes are shown with the following "symbols" (actually they are accession numbers, but this is what the data shows us). I do not think we need to worry about this for LIRICAL, since LIRICAL only shows known diseases (Exomiser, in contrast, can find novel diseases, and this would be more of a worry).

BC008049
DQ586256
CR627135
AX746851
AK096159
AK056817
AF086203
DQ597025
AK124820
AL137445
BC150535
AL833150
NPIPL2
abParts
AX746840
AX746605
AX747935
BC127868
TVAS5
AX746991
CR627394
AC2
AK098438
AX746871
BC024306
DQ596274
AK055623
BC010030
BC073807
AF268386
CR627148
DQ588424
AK097590
F379
AB231705
JA760615
AK124970
LRRC37
AK123416
LOC283685
AX747981
BC046476
AX747985
BC036309
AX747988
hCPE-R
BC071797
PJCG1
JX088243
BX537921
DQ599242
BC127858
TCR-alpha
OK/SW-cl.16
C20orf195
DQ587039
BC038727
BC112340
DQ591415
JF934746
AK095151
C14orf169
AX746522
AX746765
AX747619
BC036435
BC139719
BC000869
BC038755
DQ592954
AF035281
AL832007
DQ577234
BC071770
CR936688
BC112004
AX747408
BC027619
AK123314
BC070322
BX648489
AK097794
LRRC21
FAM115C
AX721264
DKFZp667P0924
AX747516
AX747879
TCRA
BC071667
TUBB4Q
AK125860
BC037783
AL157440
BC064148
AF079515
AK125759
BC044655
AK021551
AX747662
AB000466
DQ581767
TCRBV7S1A1N2T
AL357213
DQ577251
FLJ00326
AK057873
DQ594967
TRNA_Pseudo
BC071802
LOC145757
HP06981
AK126616
AL390170
AX746564
AK123582
AY343902
BC034612
FLJ00313
CR936796
BC131758
AK027091
AK057766
AK057887
BC068609
D56494
AK302511
FAM109B
AK302879
AK000953
AF086184
AK090553
AK091889
AK097193
BC013821
C7orf62
AX747550
AK090681
AX747795
BC080605
KRT121P
AK124578
AK027190
BC131616
AK302694
LOC643699
BC047600
Y_RNA
BC043001
LOC339290
DQ573935
AX748314
AX747586
DHFRL1
BC082979
AK125558
AK311167
WASH1
AK125676
AK127974
U6atac
CR627049
LOC339047
AF161365
BC125159
AK126539
AK300387
BC104435
AK022793
AX748420
BC067347
AX747213
JA668105
AX747578
JA668106
AK123141
CR936633
BC032117
KIAA1653
POM121L7
BC024732
AK129926
BC035867
BCMS
AK091996
SNURF-SNRPN
BC101079
AL137655
AK021570
AK097142
BC023651
BC033456
DQ578285
BX537548
MRDS1
AK296148
DQ573949
BC153822
AX747590
BC043257
AX747594
AK124131
AX747598
AF131837
DQ786190
DL492607
BC027906
DD157417
AL050000
LOC23117
BC043601
FLJ00285
AF055024
AX747261
DJ031140
LOC541467
T-Cell Receptor V-alpha region
AK301968
BX647938
AK125237
AK123177
BC044939
BC039356
C12orf63
pp8961
AK057473
BC107108
BC040684
AK097814
AK097934
BC029787
BC215
AL832615
STRF6
AK309441
LPPR3
DQ586822
DQ786258
AK310665
BC063675
AK056396
BC043620
AL117485
AX748371
FKSG30
BC050399
AX747048
AB007962
L32131
BC031638
MTND5
AK126225
DKFZp779K0112
BC128043
ZRSR1
BC122864
AK298596
AK309533
AK309896
AX747158
DQ582448
AK093551
CR933665
TRNA_Cys
L37717
DQ582201
BC039000
BC039122
DQ582208
U4
AK095618
U4atac
BAC05914.1
BC021736
X97876
DQ575955
CCBP2
ZCCHC16
BC041342
AX747067
AK125397
BC029578
DQ583348
L77588
BC016035
BC044608
TCRBV6S3A1N1T
AX748380
BX538221
BC029473
MTERFD3
DQ574852
AK001351
AX748264
LOC402470
BC053679
AK093898
AK093412
DQ586985
BC039382
BX538226
DQ571461
DQ594001
AK096803
AK055017
AK057316
BC017255
AK311558
AK056351
AK022382
BC043541
AK055260
AK090593
BC031318
BC033739
BC034827
BX648926
RGAG4
BX161431
FAM46C
BC064339
AK056485
AX747192
C12orf28
AK056486
HM358976
HM358977
AX748283
FTSJD1
AK130400
AK127120
BC033989
DQ585755
BC030116
DQ597730
AK310441
FAM46D
C6orf195
BC040735
AK024141
AK023178
DKFZp434I138
UNQ2560
AK128346
BC052952
BC047373
AK126042
BC024248
DQ586415
AK096314
BX648502
BC019904
LOC338797
AK001394
AK308867
AX748067
AK092087
BC070391
AK302092
HOTAIR_5
BC038530
AK024243
AK094352
AK024248
AK094599
AL832447
AK097624
AK093264
TMEM246
AK128361
BC096759
BC040863
AK307870
AX748080
AD 1
AK095699
FLJ38723
BC040628
BC040869
AB231784
DQ586526
BC020917
BC038792
MTE
C3orf27
C1orf99
DQ580909
GTDC2
AK094692
BC024173
L26953
FLJ46481
BX640700
AK094577
LOC161527
AK131325
AK308561
AK310751
DQ656008
BC015342
cytochrome b
AK123947
AK025140
BC041855
JA760600
BC087858
BC031827
bK250D10.C22.8
AK308309
FW339974
BC160930
SYT14L
AK295707
BC032911
BC027448
AK097853
BC018860
DQ595055
DQ571187
DQ595299
DQ570096
C1orf81
AK309988
BC036055
BC036297
DQ587763
DQ599989
HIST1H2BK
SMA4
AK092251
AK097701
DQ574483
AX746830
BC031952
DKFZp451A185
C9orf169
FAM132B
AK092135
AK296947
DKFZp434K1323
BC039452
BX648763
DQ600234
BC035094
BC041879
FLJ00050
AK056081
AK131337
hCG_1980662
DQ658414
TRNA_Glu
BC106081
LOC494150
TRNA_Gly

oleraj commented 4 years ago

Great, glad to hear it! I'll test it when the new version is ready.

pnrobinson commented 4 years ago

Thanks, I will make a new release tomorrow. I just checked these symbols. There are 454 of them, and not a single one is among the 4315 gene symbols that are associated with diseases in the HPO database. While this result is based on one VCF file, I think it is true that there are gene IDs for all of the disease-related genes. Therefore, I will not output this as an error, but instead output these symbols as a list if users set the logging level to debug or trace.
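
The check and the logging change described above could look roughly like the following. This is only a sketch under stated assumptions: the class and method names are invented, and `java.util.logging` is used so the example is self-contained, with `Level.FINE` standing in for a debug level; it is not meant to reflect LIRICAL's actual logging setup.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch, not LIRICAL code: report unmapped symbols once, quietly.
class UnmappedSymbolReport {
    private static final Logger LOGGER =
            Logger.getLogger(UnmappedSymbolReport.class.getName());

    /** Symbols that lack a gene ID but are disease-associated; ideally empty. */
    static Set<String> diseaseGenesWithoutId(Set<String> symbolsWithoutId,
                                             Set<String> diseaseGeneSymbols) {
        Set<String> overlap = new TreeSet<>(symbolsWithoutId);
        overlap.retainAll(diseaseGeneSymbols); // set intersection
        return overlap;
    }

    /** Logs one summary instead of one error line per variant. */
    static void report(Set<String> symbolsWithoutId, Set<String> diseaseGeneSymbols) {
        Set<String> overlap = diseaseGenesWithoutId(symbolsWithoutId, diseaseGeneSymbols);
        if (!overlap.isEmpty()) {
            // Only this case can affect the ranking of known diseases.
            LOGGER.warning("Disease genes without gene ID: " + overlap);
        }
        if (LOGGER.isLoggable(Level.FINE)) {
            // Full list only when verbose logging is enabled.
            LOGGER.fine(symbolsWithoutId.size() + " symbols without gene ID: "
                    + new TreeSet<>(symbolsWithoutId));
        }
    }
}
```

The design choice here is to escalate only when a disease-associated symbol is affected, which is the case that would have hidden RAG1, and to keep the long list of accession-like symbols out of the default output.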

pnrobinson commented 4 years ago

I ran LIRICAL with a phenopacket representing this case (https://www.ncbi.nlm.nih.gov/pubmed/25849362). The correct diagnosis was listed in rank 1, and the posttest probability was nearly 100%:

Matthews-2015-RAG1-P1.json OMENN SYNDROME OMIM:603554 NCBIGene:5896 1 1 0.999994

I am going to clean up a few other things today (logging) and will make a release by tomorrow. Thanks again, @oleraj for pointing this out.

pnrobinson commented 4 years ago

I have just tested LIRICAL with either UCSC or RefSeq as a source of transcripts. RefSeq got 100% of the ids, and UCSC missed about 400 (the ones shown above). Therefore, I think that the code in LIRICAL is working, but we need to add some sort of warning message to show that transcripts were missed, and we need to add something about this to the readthedocs. I will do this and then I will release version 1.1, hopefully today or very soon.

pnrobinson commented 4 years ago

@oleraj I have just pushed a new release that addresses this issue

https://github.com/TheJacksonLaboratory/LIRICAL/releases/tag/v1.1.0

If symbols are found without IDs (as will be the case with UCSC), then a table is shown in the HTML output. This is not worrisome. It is also possible to run LIRICAL with RefSeq transcripts, all of which have a gene ID (because RefSeq is more conservative!).

I will close this for now, but would appreciate feedback -- hopefully this solves the issue, and I have confirmed that we can get RAG1 mutations.

oleraj commented 4 years ago

I tested the new release and I can confirm now that RAG1 is ranked as the top gene with 100% posterior probability for OMENN SYNDROME. Thanks for the fix!

I did notice a minor issue when testing that I will post in another issue.