dexterpratt opened 1 week ago
To test BEL generation from text, use the BEL small corpus as a source of sentences and associated BEL statements.
https://github.com/cthoyt/selventa-knowledge/blob/master/selventa_knowledge/small_corpus.bel
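Pulling sentence/statement pairs out of the corpus could look roughly like the sketch below. It assumes the file follows the usual BEL Script conventions: evidence is set with `SET Evidence = "..."` (or `SET SupportingText` in older documents), long values may continue across lines with a trailing backslash, and any other non-control line is a statement. The function name `pairs_from_bel_script` is made up for illustration, and the sketch ignores `UNSET` handling and trailing comments.

```python
import re
from collections import defaultdict

# Assumed BEL Script conventions (not verified against every line of the corpus).
EVIDENCE_RE = re.compile(r'^SET\s+(?:Evidence|SupportingText)\s*=\s*"(.*)"\s*$')
CONTROL_PREFIXES = ("SET ", "UNSET ", "DEFINE ", "#")


def pairs_from_bel_script(text):
    """Group statements under the evidence text that was active when they appeared."""
    # Join lines that end with a backslash continuation into one line.
    text = re.sub(r"\s*\\\s*\n\s*", " ", text)
    grouped = defaultdict(list)
    evidence = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = EVIDENCE_RE.match(line)
        if match:
            evidence = match.group(1)
        elif line.startswith(CONTROL_PREFIXES):
            # Citations, namespace definitions, comments, etc. are skipped.
            continue
        elif evidence is not None:
            grouped[evidence].append(line)
    return dict(grouped)
```

The result maps each evidence sentence to its list of curated statements, which is the shape needed both for picking prompt examples and for scoring.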
Use selections from the corpus as examples to provide in the LLM prompt.
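A minimal sketch of assembling such a few-shot prompt is below. The instruction wording, the `Evidence:` / `BEL:` layout, and the `build_prompt` name are all placeholders, not something we have settled on; `vocabulary` is an optional slot for a vocabulary definition.

```python
def build_prompt(examples, evidence, vocabulary=""):
    """Build a few-shot prompt.

    examples: list of (evidence_text, [bel_statements]) pairs from the corpus.
    evidence: the new sentence to extract BEL from.
    """
    parts = [
        "Extract BEL statements from the evidence text. "
        "Write one statement per line in standard BEL form."
    ]
    if vocabulary:
        parts.append("BEL vocabulary:\n" + vocabulary)
    for ex_evidence, ex_statements in examples:
        parts.append(
            "Evidence: " + ex_evidence + "\nBEL:\n" + "\n".join(ex_statements)
        )
    # The final block is left open for the model to complete.
    parts.append("Evidence: " + evidence + "\nBEL:")
    return "\n\n".join(parts)
```

Examples used in the prompt should be held out of the test run, otherwise those sentences would be scored against answers the model has already seen.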
Test by processing the entire small corpus. For each evidence text, generate BEL statements in the same format as the curated statements. The BEL expressions can be compared directly, because BEL defines the order of operators so that the expression for a given identity is unambiguous. In later steps, we can look at what would make debugging and scoring easier, such as identifying near-matches and diagnosing what differs, e.g. the subject and interaction match but the object does not.
For this test we do not need to put the BEL into the JSON form we have been using; the expressions are easier to compare visually in their standard form. The current JSON form remains appropriate when this is used for knowledge graph extraction.
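The near-match idea could be sketched as follows, assuming each statement has the simple `subject relation object` shape. The statement is split at whitespace outside parentheses and quotes, then compared part by part. `split_statement` and `diagnose` are hypothetical names; bare terms with no relation come back with `None` for relation and object.

```python
import re


def split_statement(statement):
    """Split 'subject relation object' at whitespace outside parentheses and quotes."""
    parts, current, depth, in_quote = [], [], 0, False
    for ch in statement.strip():
        if ch == '"':
            in_quote = not in_quote
        elif not in_quote and ch == "(":
            depth += 1
        elif not in_quote and ch == ")":
            depth -= 1
        if ch.isspace() and depth == 0 and not in_quote:
            if current:
                parts.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))
    if len(parts) == 3:
        return tuple(parts)
    # A bare term with no relation, or a shape this sketch does not handle.
    return (" ".join(parts), None, None)


def _norm(part):
    # Ignore spacing after commas, e.g. "complex(a, b)" vs "complex(a,b)".
    return None if part is None else re.sub(r",\s+", ",", part)


def diagnose(curated, generated):
    """Report which of subject, relation, and object agree."""
    fields = ("subject", "relation", "object")
    c = [_norm(p) for p in split_statement(curated)]
    g = [_norm(p) for p in split_statement(generated)]
    return {name: a == b for name, a, b in zip(fields, c, g)}
```

For example, `diagnose("p(HGNC:AKT1) increases p(HGNC:MTOR)", "p(HGNC:AKT1) increases p(HGNC:TSC2)")` flags only the object as different. This does not treat long and short relation forms (such as `increases` and `->`) as equal; that would need a mapping on top.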
Try a version of the prompt with the BEL vocabulary defined instead of the INDRA vocabulary.
BEL Documentation
The Cheatsheet might be a good definition to put in the prompt.
We want to use HGNC grounding for the BEL statements.