Test extraction with the BEL representation scheme

To test BEL generation from text, use the BEL small corpus as a source of sentences and associated BEL statements.

https://github.com/cthoyt/selventa-knowledge/blob/master/selventa_knowledge/small_corpus.bel
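A minimal sketch of pairing each evidence text with its curated statements, assuming the corpus follows the usual BEL Script layout in which a `SET Evidence = "..."` line applies to the statements after it until it is reset or unset. The function name `group_statements_by_evidence` is hypothetical, and the sketch assumes each evidence string sits on a single line; multi-line evidence or escaped quotes would need extra handling.

```python
import re

# Matches lines such as: SET Evidence = "some sentence"
EVIDENCE_RE = re.compile(r'^SET\s+Evidence\s*=\s*"(.*)"\s*$')


def group_statements_by_evidence(bel_text):
    """Return {evidence text: [BEL statement, ...]} in file order.

    Hypothetical helper: assumes single-line SET Evidence annotations.
    """
    groups = {}
    current = None  # evidence text currently in effect
    for raw in bel_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = EVIDENCE_RE.match(line)
        if match:
            current = match.group(1)
            groups.setdefault(current, [])
            continue
        tokens = line.split()
        if tokens[0] in ("SET", "UNSET", "DEFINE"):
            # Other control lines carry no statements.
            if tokens[0] == "UNSET" and tokens[1:] == ["Evidence"]:
                current = None
            continue
        if current is not None:
            groups[current].append(line)
    return groups
```

Statements that appear with no evidence in effect are dropped here, since the test is driven by evidence text.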

Use selections from the corpus as examples to provide in the LLM prompt.
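One possible way to assemble such a prompt is sketched below. The function name `build_prompt`, the instruction wording, and the `Sentence:` / `BEL:` layout are all assumptions for illustration, not the project's actual prompt.

```python
def build_prompt(examples, sentence):
    """Build a few-shot prompt from (evidence, [statements]) pairs.

    Hypothetical helper: the wording and layout are placeholders.
    """
    parts = [
        "Extract BEL statements from the sentence. "
        "Write one BEL statement per line."
    ]
    for evidence, statements in examples:
        parts.append("Sentence: " + evidence)
        parts.append("BEL:\n" + "\n".join(statements))
    # The target sentence goes last, with an open BEL section to complete.
    parts.append("Sentence: " + sentence)
    parts.append("BEL:")
    return "\n\n".join(parts)
```

Keeping the examples in the same one-statement-per-line form as the curated corpus means the model output can be compared against the curated statements without conversion.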

Test by processing the entire small corpus. For each evidence text, generate BEL statements in the same format as the curated statements. The BEL expressions can be compared directly: BEL defines the order of operators so that the expression for a given identity is unambiguous. As a next step, we can look at what would make debugging and scoring easier, such as identifying near-matches and diagnosing what differs, e.g. the subject and interaction match but the object does not.
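A rough sketch of the near-match diagnosis: split each statement into subject, relation, and object at the top level, then report which parts differ. The names `split_statement` and `differing_parts` are hypothetical. The sketch also maps a few BEL short-form relations to their long forms and ignores spacing after commas, which goes slightly beyond a strict string comparison; it does not validate BEL syntax.

```python
# Short-form relations and their long forms.
RELATION_ALIASES = {
    "->": "increases",
    "-|": "decreases",
    "=>": "directlyIncreases",
    "=|": "directlyDecreases",
    "--": "association",
}


def split_statement(statement):
    """Split a BEL statement into (subject, relation, object).

    Splits on whitespace outside parentheses and quotes. A lone term
    (or anything that is not three parts) comes back as (text, None, None).
    """
    parts, buf, depth, in_quote = [], [], 0, False
    for ch in statement.strip():
        if ch == '"':
            in_quote = not in_quote
        if ch.isspace() and not in_quote:
            if depth == 0:
                if buf:  # top-level whitespace ends the current part
                    parts.append("".join(buf))
                    buf = []
            elif buf and buf[-1] not in ",( ":
                buf.append(" ")  # keep one space inside nested statements
            continue
        if not in_quote:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        buf.append(ch)
    if buf:
        parts.append("".join(buf))
    if len(parts) == 3:
        subject, relation, obj = parts
        return subject, RELATION_ALIASES.get(relation, relation), obj
    return " ".join(parts), None, None


def differing_parts(generated, curated):
    """Return the names of the parts that differ; an empty list is an exact match."""
    names = ("subject", "relation", "object")
    pairs = zip(names, split_statement(generated), split_statement(curated))
    return [name for name, g, c in pairs if g != c]
```

For example, `differing_parts("p(HGNC:A) increases p(HGNC:B)", "p(HGNC:A) increases p(HGNC:C)")` reports only the object as different, which is the kind of near-match described above.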

For the purposes of this test, we do not need to put the BEL into the JSON form we have been using; the BEL expressions are easier to compare visually in their standard form. When this is used to perform knowledge graph extractions, the current JSON form will be appropriate.
