Closed by chunyuma 3 years ago
These all appear to be bacterial genes that came in via the Protein Ontology (`pr.owl`).
These appear to be accession numbers in the Ensembl Bacteria database.
Ah, I see, thanks @saramsey!
This question has been resolved, so I'll go ahead and close this issue. Thanks @saramsey!
I'm not sure that there is a clean way to represent the KS URIs for these genes with a CURIE. Ensembl Bacteria (most unfortunately) encodes the species name in the base URI. For the accession number `XOO0607`, for example, the URI is https://bacteria.ensembl.org/Xanthomonas_oryzae_pv_oryzae_kacc_10331_gca_000007385/Gene/Summary?g=XOO0607;r=Chromosome:634770-636098;t=AAW73861;db=core
Second bummer: identifiers.org has a registry for Ensembl Bacteria, but its CURIE IDs don't seem to be based on the accession IDs that we are getting from `pr.owl`.
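To make the difficulty concrete, here is a minimal Python sketch (not KG2 code). It assumes the usual CURIE convention, in which a prefix expands to one fixed base IRI and the local ID is appended. `PREFIX_MAP`, `expand_curie`, and `ensembl_bacteria_gene_url` are hypothetical names, and the shortened Ensembl Bacteria URL (without the `r`, `t`, and `db` parameters) is an assumption that has not been checked against the site.

```python
# Hypothetical prefix map for illustration only; not the KG2 configuration.
PREFIX_MAP = {"ENSEMBL": "https://identifiers.org/ensembl:"}


def expand_curie(curie, prefix_map):
    """Expand a CURIE by appending its local ID to the prefix's fixed base IRI."""
    prefix, local_id = curie.split(":", 1)
    return prefix_map[prefix] + local_id


def ensembl_bacteria_gene_url(species_path, accession):
    """Build an Ensembl Bacteria gene URL; note the extra species argument."""
    return f"https://bacteria.ensembl.org/{species_path}/Gene/Summary?g={accession}"


if __name__ == "__main__":
    print(expand_curie("ENSEMBL:XOO0607", PREFIX_MAP))
    print(ensembl_bacteria_gene_url(
        "Xanthomonas_oryzae_pv_oryzae_kacc_10331_gca_000007385", "XOO0607"))
```

Because the second function needs a species path in addition to the accession, no single prefix-to-base mapping can produce that URL from the accession alone.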
This explains why the `iri` is invalid for these genes.
Actually, I think there might be a way to fix this based on what we've learned from #1366.
For 73 out of 75 of these, the EnsemblGenomes base IRI (`www.ensemblgenomes.org/id/`) works with these IDs:
Row | Original ID | Original IRI | New IRI | Resolves? |
---|---|---|---|---|
1 | "ENSEMBL:SAR0196" | "https://identifiers.org/ensembl:SAR0196" | www.ensemblgenomes.org/id/SAR0196 | No |
2 | "ENSEMBL:SERP2474" | "https://identifiers.org/ensembl:SERP2474" | www.ensemblgenomes.org/id/SERP2474 | Yes |
3 | "ENSEMBL:BAA35363" | "https://identifiers.org/ensembl:BAA35363" | www.ensemblgenomes.org/id/BAA35363 | Yes |
4 | "ENSEMBL:BAA35975" | "https://identifiers.org/ensembl:BAA35975" | www.ensemblgenomes.org/id/BAA35975 | Yes |
5 | "ENSEMBL:BAA35977" | "https://identifiers.org/ensembl:BAA35977" | www.ensemblgenomes.org/id/BAA35977 | Yes |
6 | "ENSEMBL:SAV0195" | "https://identifiers.org/ensembl:SAV0195" | www.ensemblgenomes.org/id/SAV0195 | Yes |
7 | "ENSEMBL:XOO0607" | "https://identifiers.org/ensembl:XOO0607" | www.ensemblgenomes.org/id/XOO0607 | Yes |
8 | "ENSEMBL:BAB41410" | "https://identifiers.org/ensembl:BAB41410" | www.ensemblgenomes.org/id/BAB41410 | Yes |
9 | "ENSEMBL:BAA77927" | "https://identifiers.org/ensembl:BAA77927" | www.ensemblgenomes.org/id/BAA77927 | Yes |
10 | "ENSEMBL:BAB94034" | "https://identifiers.org/ensembl:BAB94034" | www.ensemblgenomes.org/id/BAB94034 | Yes |
11 | "ENSEMBL:BAA22515" | "https://identifiers.org/ensembl:BAA22515" | www.ensemblgenomes.org/id/BAA22515 | Yes |
12 | "ENSEMBL:BAA36098" | "https://identifiers.org/ensembl:BAA36098" | www.ensemblgenomes.org/id/BAA36098 | Yes |
13 | "ENSEMBL:AT1G05760" | "https://identifiers.org/ensembl:AT1G05760" | www.ensemblgenomes.org/id/AT1G05760 | Yes |
14 | "ENSEMBL:SACOL0180" | "https://identifiers.org/ensembl:SACOL0180" | www.ensemblgenomes.org/id/SACOL0180 | Yes |
15 | "ENSEMBL:BAB96624" | "https://identifiers.org/ensembl:BAB96624" | www.ensemblgenomes.org/id/BAB96624 | Yes |
16 | "ENSEMBL:BAA16223" | "https://identifiers.org/ensembl:BAA16223" | www.ensemblgenomes.org/id/BAA16223 | Yes |
17 | "ENSEMBL:BAE76052" | "https://identifiers.org/ensembl:BAE76052" | www.ensemblgenomes.org/id/BAE76052 | Yes |
18 | "ENSEMBL:BAE76028" | "https://identifiers.org/ensembl:BAE76028" | www.ensemblgenomes.org/id/BAE76028 | Yes |
19 | "ENSEMBL:BAE76450" | "https://identifiers.org/ensembl:BAE76450" | www.ensemblgenomes.org/id/BAE76450 | Yes |
20 | "ENSEMBL:BAA15962" | "https://identifiers.org/ensembl:BAA15962" | www.ensemblgenomes.org/id/BAA15962 | Yes |
21 | "ENSEMBL:BAA16126" | "https://identifiers.org/ensembl:BAA16126" | www.ensemblgenomes.org/id/BAA16126 | Yes |
22 | "ENSEMBL:BAA16125" | "https://identifiers.org/ensembl:BAA16125" | www.ensemblgenomes.org/id/BAA16125 | Yes |
23 | "ENSEMBL:BAA16005" | "https://identifiers.org/ensembl:BAA16005" | www.ensemblgenomes.org/id/BAA16005 | Yes |
24 | "ENSEMBL:BAA16002" | "https://identifiers.org/ensembl:BAA16002" | www.ensemblgenomes.org/id/BAA16002 | Yes |
25 | "ENSEMBL:BAA16506" | "https://identifiers.org/ensembl:BAA16506" | www.ensemblgenomes.org/id/BAA16506 | Yes |
26 | "ENSEMBL:BAE76930" | "https://identifiers.org/ensembl:BAE76930" | www.ensemblgenomes.org/id/BAE76930 | Yes |
27 | "ENSEMBL:BAE76924" | "https://identifiers.org/ensembl:BAE76924" | www.ensemblgenomes.org/id/BAE76924 | Yes |
28 | "ENSEMBL:BAE76923" | "https://identifiers.org/ensembl:BAE76923" | www.ensemblgenomes.org/id/BAE76923 | Yes |
29 | "ENSEMBL:BAE76567" | "https://identifiers.org/ensembl:BAE76567" | www.ensemblgenomes.org/id/BAE76567 | Yes |
30 | "ENSEMBL:BAE76566" | "https://identifiers.org/ensembl:BAE76566" | www.ensemblgenomes.org/id/BAE76566 | Yes |
31 | "ENSEMBL:BAE76598" | "https://identifiers.org/ensembl:BAE76598" | www.ensemblgenomes.org/id/BAE76598 | Yes |
32 | "ENSEMBL:BAA16036" | "https://identifiers.org/ensembl:BAA16036" | www.ensemblgenomes.org/id/BAA16036 | Yes |
33 | "ENSEMBL:BAE76280" | "https://identifiers.org/ensembl:BAE76280" | www.ensemblgenomes.org/id/BAE76280 | Yes |
34 | "ENSEMBL:BAE76418" | "https://identifiers.org/ensembl:BAE76418" | www.ensemblgenomes.org/id/BAE76418 | Yes |
35 | "ENSEMBL:BAE76415" | "https://identifiers.org/ensembl:BAE76415" | www.ensemblgenomes.org/id/BAE76415 | Yes |
36 | "ENSEMBL:BAE76473" | "https://identifiers.org/ensembl:BAE76473" | www.ensemblgenomes.org/id/BAE76473 | Yes |
37 | "ENSEMBL:BAE76475" | "https://identifiers.org/ensembl:BAE76475" | www.ensemblgenomes.org/id/BAE76475 | Yes |
38 | "ENSEMBL:BAE76474" | "https://identifiers.org/ensembl:BAE76474" | www.ensemblgenomes.org/id/BAE76474 | Yes |
39 | "ENSEMBL:BAE76402" | "https://identifiers.org/ensembl:BAE76402" | www.ensemblgenomes.org/id/BAE76402 | Yes |
40 | "ENSEMBL:BAE76703" | "https://identifiers.org/ensembl:BAE76703" | www.ensemblgenomes.org/id/BAE76703 | Yes |
41 | "ENSEMBL:BAE76776" | "https://identifiers.org/ensembl:BAE76776" | www.ensemblgenomes.org/id/BAE76776 | Yes |
42 | "ENSEMBL:BAE76696" | "https://identifiers.org/ensembl:BAE76696" | www.ensemblgenomes.org/id/BAE76696 | Yes |
43 | "ENSEMBL:BAA14961" | "https://identifiers.org/ensembl:BAA14961" | www.ensemblgenomes.org/id/BAA14961 | Yes |
44 | "ENSEMBL:BAE76066" | "https://identifiers.org/ensembl:BAE76066" | www.ensemblgenomes.org/id/BAE76066 | Yes |
45 | "ENSEMBL:BAE76060" | "https://identifiers.org/ensembl:BAE76060" | www.ensemblgenomes.org/id/BAE76060 | Yes |
46 | "ENSEMBL:BAE76370" | "https://identifiers.org/ensembl:BAE76370" | www.ensemblgenomes.org/id/BAE76370 | Yes |
47 | "ENSEMBL:BAE76371" | "https://identifiers.org/ensembl:BAE76371" | www.ensemblgenomes.org/id/BAE76371 | Yes |
48 | "ENSEMBL:BAE76355" | "https://identifiers.org/ensembl:BAE76355" | www.ensemblgenomes.org/id/BAE76355 | Yes |
49 | "ENSEMBL:BAA15248" | "https://identifiers.org/ensembl:BAA15248" | www.ensemblgenomes.org/id/BAA15248 | Yes |
50 | "ENSEMBL:BAE76338" | "https://identifiers.org/ensembl:BAE76338" | www.ensemblgenomes.org/id/BAE76338 | Yes |
51 | "ENSEMBL:BAA15178" | "https://identifiers.org/ensembl:BAA15178" | www.ensemblgenomes.org/id/BAA15178 | Yes |
52 | "ENSEMBL:BAA15121" | "https://identifiers.org/ensembl:BAA15121" | www.ensemblgenomes.org/id/BAA15121 | Yes |
53 | "ENSEMBL:BAE78334" | "https://identifiers.org/ensembl:BAE78334" | www.ensemblgenomes.org/id/BAE78334 | Yes |
54 | "ENSEMBL:BAA15252" | "https://identifiers.org/ensembl:BAA15252" | www.ensemblgenomes.org/id/BAA15252 | Yes |
55 | "ENSEMBL:BAA15090" | "https://identifiers.org/ensembl:BAA15090" | www.ensemblgenomes.org/id/BAA15090 | Yes |
56 | "ENSEMBL:BAA15823" | "https://identifiers.org/ensembl:BAA15823" | www.ensemblgenomes.org/id/BAA15823 | Yes |
57 | "ENSEMBL:BAA15089" | "https://identifiers.org/ensembl:BAA15089" | www.ensemblgenomes.org/id/BAA15089 | Yes |
58 | "ENSEMBL:BAE78086" | "https://identifiers.org/ensembl:BAE78086" | www.ensemblgenomes.org/id/BAE78086 | Yes |
59 | "ENSEMBL:BAB75331" | "https://identifiers.org/ensembl:BAB75331" | www.ensemblgenomes.org/id/BAB75331 | Yes |
60 | "ENSEMBL:BAB75330" | "https://identifiers.org/ensembl:BAB75330" | www.ensemblgenomes.org/id/BAB75330 | Yes |
61 | "ENSEMBL:BAE77042" | "https://identifiers.org/ensembl:BAE77042" | www.ensemblgenomes.org/id/BAE77042 | Yes |
62 | "ENSEMBL:BAE77031" | "https://identifiers.org/ensembl:BAE77031" | www.ensemblgenomes.org/id/BAE77031 | Yes |
63 | "ENSEMBL:BAE77004" | "https://identifiers.org/ensembl:BAE77004" | www.ensemblgenomes.org/id/BAE77004 | Yes |
64 | "ENSEMBL:BAE77163" | "https://identifiers.org/ensembl:BAE77163" | www.ensemblgenomes.org/id/BAE77163 | Yes |
65 | "ENSEMBL:BAE77611" | "https://identifiers.org/ensembl:BAE77611" | www.ensemblgenomes.org/id/BAE77611 | Yes |
66 | "ENSEMBL:BAE77516" | "https://identifiers.org/ensembl:BAE77516" | www.ensemblgenomes.org/id/BAE77516 | Yes |
67 | "ENSEMBL:BAE77585" | "https://identifiers.org/ensembl:BAE77585" | www.ensemblgenomes.org/id/BAE77585 | Yes |
68 | "ENSEMBL:BAE77582" | "https://identifiers.org/ensembl:BAE77582" | www.ensemblgenomes.org/id/BAE77582 | Yes |
69 | "ENSEMBL:BAE77578" | "https://identifiers.org/ensembl:BAE77578" | www.ensemblgenomes.org/id/BAE77578 | Yes |
70 | "ENSEMBL:BAE77577" | "https://identifiers.org/ensembl:BAE77577" | www.ensemblgenomes.org/id/BAE77577 | Yes |
71 | "ENSEMBL:BAE77865" | "https://identifiers.org/ensembl:BAE77865" | www.ensemblgenomes.org/id/BAE77865 | Yes |
72 | "ENSEMBL:BAE77493" | "https://identifiers.org/ensembl:BAE77493" | www.ensemblgenomes.org/id/BAE77493 | Yes |
73 | "ENSEMBL:BAE77309" | "https://identifiers.org/ensembl:BAE77309" | www.ensemblgenomes.org/id/BAE77309 | Yes |
74 | "ENSEMBL:BAC15292" | "https://identifiers.org/ensembl:BAC15292" | www.ensemblgenomes.org/id/BAC15292 | Yes |
75 | "ENSEMBL:LRG_1173" | "https://identifiers.org/ensembl:LRG_1173" | www.ensemblgenomes.org/id/LRG_1173 | No |
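The "New IRI" and "Resolves?" columns could be reproduced with something like the following Python sketch. It is an illustration, not the script used to build the table: `new_iri` and `resolves` are hypothetical helpers, the `http://` scheme on the base IRI is assumed, and "resolves" is taken here to mean that an HTTP GET ends in a 2xx status after redirects, which may not match how the table was checked.

```python
import urllib.request

# Base IRI from the discussion above; the http:// scheme is an assumption.
ENSEMBLGENOMES_BASE = "http://www.ensemblgenomes.org/id/"


def new_iri(original_id):
    """Map an ID such as 'ENSEMBL:SAR0196' to its candidate EnsemblGenomes IRI."""
    local_id = original_id.split(":", 1)[1]
    return ENSEMBLGENOMES_BASE + local_id


def resolves(iri, timeout=10.0):
    """Return True if a GET on the IRI ends in a 2xx status (needs network access)."""
    try:
        with urllib.request.urlopen(iri, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


if __name__ == "__main__":
    for original_id in ["ENSEMBL:SAR0196", "ENSEMBL:XOO0607", "ENSEMBL:LRG_1173"]:
        iri = new_iri(original_id)
        print(original_id, iri, resolves(iri))
```

A server that returns a 200 "not found" page would be counted as resolving under this definition, so a manual spot check is still worthwhile.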
6251a33 should fix this problem, based on my testing. It uses the regex for ENSEMBL to identify nodes that `pr.owl` assigned as `Ensembl:` that are actually `EnsemblGenomes:` nodes. The only node from the table above that wasn't impacted by the change, as far as I can tell, is `ENSEMBL:LRG_1173`, but a) the change wouldn't help it and b) I could not identify where that node was coming from. A possible location for it appeared to be `umls-hgnc.owl`, which contains the line
<LOCUS_SPECIFIC_DB_XR rdf:datatype="http://www.w3.org/2001/XMLSchema#string">LRG_1173&#x7C;http://ftp.ebi.ac.uk/pub/databases/lrgex/LRG_1173.xml</LOCUS_SPECIFIC_DB_XR>
but the Cypher command `match (n {id: 'ENSEMBL:LRG_1173'}) return n.provided_by` returns `identifiers_org_registry:ensembl`.
I tested this code on every OWL file that contained or possibly contained Ensembl IDs (only `pr.owl` was impacted). Those tested are:
@saramsey Does this seem okay?
Nice work, @ericawood! Looks like a good fix.
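For readers who have not looked at 6251a33, the relabeling idea described above can be sketched roughly as follows. This is not the code from that commit: `ENSEMBL_ID_PATTERN` is a simplified, hypothetical pattern for vertebrate-style Ensembl stable IDs, `relabel_curie` is a hypothetical helper, and restricting the change to nodes whose source is `OBO:pr.owl` is an assumption made for this example.

```python
import re

# Hypothetical, simplified pattern for Ensembl stable IDs such as
# ENSG00000139618; the regex used in commit 6251a33 may differ.
ENSEMBL_ID_PATTERN = re.compile(r"^ENS[A-Z]*[EGPRT]\d{11}(\.\d+)?$")


def relabel_curie(curie, provided_by):
    """Switch the prefix to EnsemblGenomes when a pr.owl 'Ensembl' ID
    does not look like an Ensembl stable ID; otherwise leave it alone."""
    prefix, local_id = curie.split(":", 1)
    if (provided_by == "OBO:pr.owl"
            and prefix.upper() == "ENSEMBL"
            and not ENSEMBL_ID_PATTERN.match(local_id)):
        return "EnsemblGenomes:" + local_id
    return curie


if __name__ == "__main__":
    print(relabel_curie("ENSEMBL:XOO0607", "OBO:pr.owl"))
    print(relabel_curie("ENSEMBL:ENSG00000139618", "OBO:pr.owl"))
    print(relabel_curie("ENSEMBL:LRG_1173", "identifiers_org_registry:ensembl"))
```

Under these assumptions `ENSEMBL:LRG_1173` is left untouched because it does not come from `pr.owl`, which is consistent with the observation above that the change did not affect that node.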
This appears to be as fixed as expected in KG2.6.0 (such that `EnsemblGenomes:SAR0196` and `ENSEMBL:LRG_1173` are still a bit funky):
match (n) where split(n.id, ':')[1] in ["SAR0196", "SERP2474", "BAA35363", "BAA35975", "BAA35977", "SAV0195", "XOO0607", "BAB41410", "BAA77927", "BAB94034", "BAA22515", "BAA36098", "AT1G05760", "SACOL0180", "BAB96624", "BAA16223", "BAE76052", "BAE76028", "BAE76450", "BAA15962", "BAA16126", "BAA16125", "BAA16005", "BAA16002", "BAA16506", "BAE76930", "BAE76924", "BAE76923", "BAE76567", "BAE76566", "BAE76598", "BAA16036", "BAE76280", "BAE76418", "BAE76415", "BAE76473", "BAE76475", "BAE76474", "BAE76402", "BAE76703", "BAE76776", "BAE76696", "BAA14961", "BAE76066", "BAE76060", "BAE76370", "BAE76371", "BAE76355", "BAA15248", "BAE76338", "BAA15178", "BAA15121", "BAE78334", "BAA15252", "BAA15090", "BAA15823", "BAA15089", "BAE78086", "BAB75331", "BAB75330", "BAE77042", "BAE77031", "BAE77004", "BAE77163", "BAE77611", "BAE77516", "BAE77585", "BAE77582", "BAE77578", "BAE77577", "BAE77865", "BAE77493", "BAE77309", "BAC15292", "LRG_1173"] return n.id, n.provided_by, n.iri
n.id | n.provided_by | n.iri |
---|---|---|
"EnsemblGenomes:SAR0196" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/SAR0196" |
"EnsemblGenomes:SERP2474" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/SERP2474" |
"EnsemblGenomes:BAA35975" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA35975" |
"EnsemblGenomes:BAA35977" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA35977" |
"EnsemblGenomes:BAA35363" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA35363" |
"EnsemblGenomes:SAV0195" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/SAV0195" |
"EnsemblGenomes:XOO0607" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/XOO0607" |
"EnsemblGenomes:BAB41410" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAB41410" |
"EnsemblGenomes:BAA77927" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA77927" |
"EnsemblGenomes:BAA36098" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA36098" |
"EnsemblGenomes:BAA22515" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA22515" |
"EnsemblGenomes:BAB94034" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAB94034" |
"EnsemblGenomes:SACOL0180" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/SACOL0180" |
"EnsemblGenomes:BAB96624" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAB96624" |
"EnsemblGenomes:BAE76703" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76703" |
"EnsemblGenomes:BAE76776" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76776" |
"EnsemblGenomes:BAA15962" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA15962" |
"EnsemblGenomes:BAE76598" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76598" |
"EnsemblGenomes:BAE76930" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76930" |
"EnsemblGenomes:BAE76924" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76924" |
"EnsemblGenomes:BAE76923" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76923" |
"EnsemblGenomes:BAA16506" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA16506" |
"EnsemblGenomes:BAE76567" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76567" |
"EnsemblGenomes:BAE76450" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76450" |
"EnsemblGenomes:BAE76418" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76418" |
"EnsemblGenomes:BAE76415" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76415" |
"EnsemblGenomes:BAE76473" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76473" |
"EnsemblGenomes:BAE76475" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76475" |
"EnsemblGenomes:BAE76474" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76474" |
"EnsemblGenomes:BAE76338" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76338" |
"EnsemblGenomes:BAE76402" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76402" |
"EnsemblGenomes:BAE76052" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76052" |
"EnsemblGenomes:BAE76028" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76028" |
"EnsemblGenomes:BAE76696" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76696" |
"EnsemblGenomes:BAA16223" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA16223" |
"EnsemblGenomes:BAE76066" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76066" |
"EnsemblGenomes:BAE76060" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76060" |
"EnsemblGenomes:BAA16126" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA16126" |
"EnsemblGenomes:BAA16125" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA16125" |
"EnsemblGenomes:BAE76370" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76370" |
"EnsemblGenomes:BAE76371" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76371" |
"EnsemblGenomes:BAE76355" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76355" |
"EnsemblGenomes:BAA16005" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA16005" |
"EnsemblGenomes:BAA16002" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA16002" |
"EnsemblGenomes:BAA16036" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA16036" |
"EnsemblGenomes:BAE76280" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE76280" |
"EnsemblGenomes:BAA14961" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA14961" |
"EnsemblGenomes:BAA15178" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA15178" |
"EnsemblGenomes:BAA15252" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA15252" |
"EnsemblGenomes:BAA15248" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA15248" |
"EnsemblGenomes:BAA15090" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA15090" |
"EnsemblGenomes:BAA15823" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA15823" |
"EnsemblGenomes:BAE78334" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE78334" |
"EnsemblGenomes:BAA15089" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAA15089" |
"EnsemblGenomes:BAE78086" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE78086" |
"EnsemblGenomes:BAB75331" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAB75331" |
"EnsemblGenomes:BAB75330" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAB75330" |
"EnsemblGenomes:BAE77004" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77004" |
"EnsemblGenomes:BAE77042" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77042" |
"EnsemblGenomes:BAE77031" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77031" |
"EnsemblGenomes:BAE77163" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77163" |
"EnsemblGenomes:BAE77309" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77309" |
"EnsemblGenomes:BAE77865" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77865" |
"EnsemblGenomes:BAE77493" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77493" |
"EnsemblGenomes:BAE77611" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77611" |
"EnsemblGenomes:BAE77585" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77585" |
"EnsemblGenomes:BAE77582" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77582" |
"EnsemblGenomes:BAE77578" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77578" |
"EnsemblGenomes:BAE77577" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAE77577" |
"EnsemblGenomes:BAC15292" | "OBO:pr.owl" | "http://www.ensemblgenomes.org/id/BAC15292" |
"ENSEMBL:LRG_1173" | "identifiers_org_registry:ensembl" | "https://identifiers.org/ensembl:LRG_1173" |
Hi @saramsey, @kvarforl and @ericawood,
I found something strange regarding `ENSEMBL` genes in KG2.5.2c. I queried
match (n)-[]-(m) where n.category='biolink:Gene' and split(n.id,':')[0]='ENSEMBL' return distinct split(m.id,':')[0], count(distinct m.id)
in KG2.5.2c and found that most `ENSEMBL` genes are connected mainly to a limited set of CURIEs:
You can see that these `ENSEMBL` genes don't even have any connections with each other. When I looked into this, these `ENSEMBL` genes mainly connect to those 6 `NCBI Taxon` CURIEs.
In addition, among these `ENSEMBL` genes, there are 75 whose `iri` is invalid, and I can't find these IDs on the Ensembl website either: