chunyuma closed this issue 3 months ago
re: item 1: hm, yeah, I'm not sure why these nodes aren't merged in KG2c.6.3. I verified they're also separate in the synonymizer, so maybe @edeutsch has some insight (e.g., --lookup UMLS:C0006069 and MESH:D001909 for the 'Leukemia Virus, Bovine' example).
re: item 2: I checked and it seems this was already present in KG2c-5-2, so I don't think it's a new issue (that same query returns 2785 records in KG2c-5-2). the KG2c node name is the 'preferred_name' according to the synonymizer - I verified that the synonymizer for some reason seems to say the preferred_name is empty for these nodes. e.g.:
"id": {
"SRI_normalizer_category": "biolink:ChemicalSubstance",
"SRI_normalizer_curie": "CHEMBL.COMPOUND:CHEMBL3302426",
"SRI_normalizer_name": "CHEMBL3302426",
"category": "biolink:ChemicalSubstance",
"identifier": "CHEMBL.COMPOUND:CHEMBL3302426",
"name": ""
},
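A check for this pattern could look roughly like the following. This is only a minimal sketch, not the synonymizer's actual code: it assumes the records are available as a Python dict keyed by curie with the fields shown in the excerpt above. The function name find_empty_preferred_names and the second record (EXAMPLE:0001) are hypothetical and exist only for illustration.

```python
def find_empty_preferred_names(records):
    """Return (identifier, fallback_name) pairs for records whose preferred
    "name" is empty even though another name field is populated."""
    flagged = []
    for key, rec in records.items():
        preferred = (rec.get("name") or "").strip()
        # Assumption: SRI_normalizer_name is the only alternative name field checked here.
        fallback = (rec.get("SRI_normalizer_name") or "").strip()
        if not preferred and fallback:
            flagged.append((rec.get("identifier", key), fallback))
    return flagged


records = {
    # Record copied from the excerpt above; the outer key is assumed to be the curie.
    "CHEMBL.COMPOUND:CHEMBL3302426": {
        "SRI_normalizer_category": "biolink:ChemicalSubstance",
        "SRI_normalizer_curie": "CHEMBL.COMPOUND:CHEMBL3302426",
        "SRI_normalizer_name": "CHEMBL3302426",
        "category": "biolink:ChemicalSubstance",
        "identifier": "CHEMBL.COMPOUND:CHEMBL3302426",
        "name": "",
    },
    # Hypothetical record with a populated preferred name (should not be flagged).
    "EXAMPLE:0001": {
        "SRI_normalizer_name": "example node",
        "identifier": "EXAMPLE:0001",
        "name": "example node",
    },
}

flagged = find_empty_preferred_names(records)
print(flagged)
```

Only the CHEMBL record is flagged here, because its "name" is empty while its SRI_normalizer_name is not.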
I'd say neither of these issues seems to be a show-stopper for rolling out KG2.6.3, since the first one affects such a small number of nodes and the second appears to be a bug that was already present (and also doesn't affect a huge number of nodes).
for item 2, although this was already present in KG2c-5-2, it doesn't make sense that the preferred name is empty when the node has other names that are not empty.
yep, I agree, and it's something to fix in the synonymizer. I was just pointing out that it doesn't seem to be a recent change that caused it.
@chunyuma - I believe both of the issues you reported here are fixed in KG2.8.0.1c:
all_names
also has a name
(i.e., your query now returns no results)
Can we close this issue?
I think this is okay to close.
Also based on the KG2.6.1c (http://kg2canonicalized.rtx.ai:7474/browser/) that @amykglen just built, I found two problems that might be associated with Nodesynonymizer:
1) There are 35 duplicated preferred curies that have the same name and the same description in KG2.6.1. Here are their names:
This is one example:
2) There are 1870 curies in KG2.6.1 that have no name even though their synonyms have names:
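For reference, the first check (preferred curies that share a name and a description) can be sketched outside the graph database as well. This is a rough illustration, not the query actually used: it assumes the nodes have been exported as a list of dicts with "id", "name", and "description" keys, and the function name find_duplicate_nodes is hypothetical. The two curies and the name come from the 'Leukemia Virus, Bovine' example discussed in this thread; the description text and the third node are made-up placeholders.

```python
from collections import defaultdict


def find_duplicate_nodes(nodes):
    """Group node ids by exact (name, description) and keep groups with more than one id."""
    groups = defaultdict(list)
    for node in nodes:
        name = (node.get("name") or "").strip()
        description = (node.get("description") or "").strip()
        if name:  # nodes without a name belong to the second check, not this one
            groups[(name, description)].append(node["id"])
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


nodes = [
    # Curies and name from the thread; the description is a placeholder, not real data.
    {"id": "UMLS:C0006069", "name": "Leukemia Virus, Bovine", "description": "placeholder description"},
    {"id": "MESH:D001909", "name": "Leukemia Virus, Bovine", "description": "placeholder description"},
    # Hypothetical unrelated node that should not be reported.
    {"id": "EXAMPLE:0001", "name": "example node", "description": "another placeholder"},
]

duplicates = find_duplicate_nodes(nodes)
for (name, _description), ids in duplicates.items():
    print(name, ids)
```

Matching here is exact on both fields; whether case or whitespace differences should also count as duplicates is a separate decision.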