Closed wrosko closed 2 years ago
Hey Wade!
I chatted with @sierra-moxon and our best guess was that if anything handled this, it would be the clique merge in kgx, and it would only handle it if you also brought in the node with more detail for it to be merged with. I tried a little experiment based on a clique merge test in kgx, and even in that case it looks like it took the more general category in the merge.
We're going in the direction of removing ID-only nodes from our association ingests and using downstream QC checks to look for extra nodes that we need to consciously bring in, but I can see how that might be overkill for some use cases. You could look at using Biolink Model Toolkit to guess at categories based on ID prefixes. Plenty would be ambiguous, but it might be a start.
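To make the prefix idea concrete, here is a rough sketch in plain Python. It does not call Biolink Model Toolkit: the `PREFIX_TO_CATEGORIES` table and the `guess_category` helper are hypothetical, and the prefix-to-category pairs are hand-picked for illustration rather than read from the model. It only returns a category when a prefix maps to exactly one candidate, which is why a UMLS CUI like the one in the question comes back with no guess.

```python
from typing import Dict, List, Optional

# Hypothetical, hand-written table for illustration only. A real table
# would be derived from the Biolink Model instead of being typed in.
PREFIX_TO_CATEGORIES: Dict[str, List[str]] = {
    "HGNC": ["biolink:Gene"],
    "MONDO": ["biolink:Disease"],
    # UMLS CUIs cover many kinds of entities, so the prefix alone is ambiguous.
    "UMLS": [
        "biolink:Disease",
        "biolink:ChemicalEntity",
        "biolink:PhenotypicFeature",
    ],
}


def guess_category(curie: str) -> Optional[str]:
    """Return a category only when the CURIE prefix is unambiguous."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not local_id:
        # Not a well-formed CURIE (no prefix or no local identifier).
        return None
    candidates = PREFIX_TO_CATEGORIES.get(prefix, [])
    # Unknown prefixes and prefixes with several candidates both yield None.
    return candidates[0] if len(candidates) == 1 else None
```

Anything that comes back as `None` would still need a consciously ingested node or some other source for its category.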
Thanks for the reply, Kevin!
Hi there,
Say we want to incorporate relationships from a new data source and we only know the entity CUIs, not necessarily all other details (attributes etc.). If we create nodes with NamedEntity(id="UMLS:C0121434") without specifying anything else, would kgx/koza be able to recognize and assign to the correct node if a node with the same id exists? And at what point in the pipeline would this occur?