konradjk / exac_browser

Browser for ExAC consortium data
http://exac.broadinstitute.org
MIT License

C2ORF15 vs C2orf15 #329

Open blajoie opened 6 years ago

blajoie commented 6 years ago

We (@vgainullin and I) have found two gene entries whose names differ only by letter case:

http://exac.broadinstitute.org/gene/ENSG00000273045
http://exac.broadinstitute.org/gene/ENSG00000241962

Both of these genes appear in the fordist_cleaned_exac_r03_march16_z_pli_rec_null_data.txt.gz dump.

C2ORF15 looks to be invalid. Perhaps this is a bug? Does it affect gnomAD as well?

{ u'bp': 378, u'cds_end': 99767297, u'cds_start': 99763930, u'chr': u'2', u'exp_lof': 4.02128738, u'exp_mis': 31.9587584, u'exp_syn': 12.886011, u'gene': u'C2ORF15', u'lof_z': 0.50449294, u'mis_z': 0.34251584, u'mu_lof': 2.8e-07, u'mu_mis': 2.75e-06, u'mu_syn': 1.06e-06, u'n_exons': 2, u'n_lof': 3, u'n_mis': 28, u'n_syn': 14, u'pLI': 0.01311959, u'pNull': 0.3262246, u'pRec': 0.66065581, u'syn_z': -0.19238513, u'transcript': u'ENST00000302513.2'}

{ u'bp': 486, u'cds_end': 99812168, u'cds_start': 99802666, u'chr': u'2', u'exp_lof': 6.23037702, u'exp_mis': 48.13188456, u'exp_syn': 18.57931905, u'gene': u'C2orf15', u'lof_z': 2.0757045, u'mis_z': -0.41371287, u'mu_lof': 3.9e-07, u'mu_mis': 4.24e-06, u'mu_syn': 1.5e-06, u'n_exons': 5, u'n_lof': 1, u'n_mis': 54, u'n_syn': 15, u'pLI': 0.54325245, u'pNull': 0.01430108, u'pRec': 0.44244647, u'syn_z': 0.5147959, u'transcript': u'ENST00000512183.2'}
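
For anyone who wants to check whether other gene names in the dump collide in the same way, here is a minimal sketch in Python. It groups records by upper-cased gene symbol and reports groups with more than one distinct spelling. The `gene` and `transcript` keys come from the records above; the function name `find_case_collisions` and the third sample record are hypothetical and only there for illustration.

```python
from collections import defaultdict


def find_case_collisions(records):
    """Return gene symbols that appear with more than one letter-case spelling.

    `records` is any iterable of dicts with 'gene' and 'transcript' keys,
    as in the two records quoted above.
    """
    by_folded = defaultdict(set)
    for rec in records:
        # Fold the symbol to upper case so C2ORF15 and C2orf15 share a key.
        by_folded[rec["gene"].upper()].add((rec["gene"], rec["transcript"]))
    return {
        folded: sorted(entries)
        for folded, entries in by_folded.items()
        # Keep only groups where the original spellings actually differ.
        if len({gene for gene, _ in entries}) > 1
    }


# The first two records are trimmed copies of the ones above;
# the third is a made-up control that should not be reported.
records = [
    {"gene": "C2ORF15", "transcript": "ENST00000302513.2", "chr": "2"},
    {"gene": "C2orf15", "transcript": "ENST00000512183.2", "chr": "2"},
    {"gene": "EXAMPLE1", "transcript": "ENST_HYPOTHETICAL.1", "chr": "1"},
]

collisions = find_case_collisions(records)
for folded, entries in collisions.items():
    print(folded, entries)
```

To run this over the whole dump, the rows could be fed in from `csv.DictReader` over a `gzip.open(..., "rt")` handle, assuming the file is tab-delimited with a header row containing `gene` and `transcript` columns (I have not verified the layout).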

tayaza commented 5 years ago

The problem seems to come from the GENCODE annotation, starting with version 19. It would be great to confirm whether these are two different ORFs (as the Ensembl IDs suggest) or isoforms of the same gene.