Open caufieldjh opened 4 months ago
I'm not following the part about KG embeddings. I don't think we'd want a dependency on GRAPE here, but we do want to support people providing their own embeddings, e.g. via venomx. However, I don't see how GRAPE/node2vec-style embeddings would work with RAG.
Good suggestion to explore LlamaIndex, but I think this is orthogonal. See #34.
I'm not sure exactly what Marco had in mind for using KG embeddings with RAG, but possibly something like this: read in abstracts that may contain relations of interest, do NER and grounding to get the IDs/CURIEs of interest from the text, then pull these and any related nodes using KG embeddings and send them along as context? Not sure.
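To make that guess concrete, here is a minimal sketch of the "pull related nodes and send them along as context" step. Everything in it is an assumption for illustration: the `EX:` CURIEs and three-dimensional vectors are toy placeholders, `related_nodes` and `build_context` are hypothetical helper names, and plain cosine similarity stands in for whatever neighbor lookup the real embeddings would use. NER and grounding are assumed to have already produced the list of CURIEs.

```python
import math

# Toy embeddings keyed by CURIE (hypothetical placeholders). Real vectors
# would come from a pretrained KG embedding loaded from a file or URL.
EMBEDDINGS = {
    "EX:rna1": [1.0, 0.0, 0.0],
    "EX:rna2": [0.9, 0.1, 0.0],
    "EX:gene1": [0.0, 1.0, 0.0],
    "EX:disease1": [0.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def related_nodes(curie, embeddings, k=2):
    """Return the k nodes most similar to `curie`, as (curie, score) pairs."""
    query = embeddings[curie]
    scored = [
        (other, cosine(query, vec))
        for other, vec in embeddings.items()
        if other != curie
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_context(grounded_curies, embeddings, k=2):
    """Turn grounded CURIEs into context lines to prepend to an LLM prompt."""
    lines = []
    for curie in grounded_curies:
        if curie not in embeddings:
            continue  # grounded term has no node in the KG embedding
        neighbors = ", ".join(n for n, _ in related_nodes(curie, embeddings, k))
        lines.append(f"{curie} is close in embedding space to: {neighbors}")
    return "\n".join(lines)
```

Calling `build_context(["EX:rna1"], EMBEDDINGS, k=1)` gives one context line naming `EX:rna2` as the nearest node. Whether embedding neighbors are actually useful context for extraction is exactly the open question here.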
Also, agreed that a GRAPE dependency might not be what we want here. I've made a (draft) PR, #36, to support pulling embeddings from Hugging Face or any other URL.
In discussion with the RNA-KG group (Marco Mesiti, Elena Casiraghi, Emanuele Cavalleri) and @justaddcoffee: we would like to be able to extract triples (s, p, o) from a provided text, using graph embeddings to guide the process. The goal is to find additional content for RNA-KG. Using OntoGPT has worked well for this so far, but it does not take advantage of the existing relations within the KG.
This would involve:
Integrating some process for comparison of the extracted triples would be ideal (e.g., A vs B appears in 20 documents, 15 of them from different sources, etc).
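As a rough sketch of what that comparison could look like, the snippet below counts, for each extracted triple, how many distinct documents and how many distinct sources it appears in. The input shape `(doc_id, source, (s, p, o))` and the function name `summarize_triples` are assumptions made for illustration, not an existing OntoGPT interface.

```python
from collections import defaultdict


def summarize_triples(extractions):
    """Count distinct documents and sources supporting each triple.

    `extractions` is an iterable of (doc_id, source, (s, p, o)) records;
    this record layout is a hypothetical one chosen for the example.
    """
    docs = defaultdict(set)
    sources = defaultdict(set)
    for doc_id, source, triple in extractions:
        docs[triple].add(doc_id)
        sources[triple].add(source)
    return {
        triple: {"documents": len(docs[triple]), "sources": len(sources[triple])}
        for triple in docs
    }


# Toy records with placeholder identifiers.
records = [
    ("doc1", "journalA", ("EX:rna1", "interacts_with", "EX:gene1")),
    ("doc2", "journalA", ("EX:rna1", "interacts_with", "EX:gene1")),
    ("doc3", "journalB", ("EX:rna1", "interacts_with", "EX:gene1")),
    ("doc3", "journalB", ("EX:rna2", "associated_with", "EX:disease1")),
]
summary = summarize_triples(records)
```

Using sets means a triple extracted twice from the same document is only counted once, which keeps the support counts from being inflated by repeated mentions.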
The RNA-KG group has also suggested trying an alternative vector DB (https://www.llamaindex.ai/) to see if it works better for RAG with KG data.