ndexbio / gsoc_llm

GSOC 2024 LLM Project
MIT License

Start with Entity Tagged Documents and Extract Interactions to Scale Extraction #11

Open cannin opened 6 months ago

cannin commented 6 months ago
  1. Use PubTator-tagged documents and filter them down to sentences that contain at least two entities labeled as Gene or Chemical, since those might form an interaction. You will need to write code that splits tagged paragraphs into tagged sentences.

Site: https://www.ncbi.nlm.nih.gov/research/pubtator3/publication/29702687?text=PMC6044858

XML: https://www.ncbi.nlm.nih.gov/research/pubtator3-api/publications/pmc_export/biocxml?pmcids=PMC6044858

  2. Benchmark several articles to measure how much faster extraction runs with this filtering than without it.
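
A minimal sketch of the sentence filter, to make the first step concrete. It assumes the BioC XML layout (`passage` elements with an `offset`, a `text`, and `annotation` children whose `location` offsets are document-level) and uses a naive regex sentence splitter. The names `candidate_sentences`, `split_sentences`, `TARGET_TYPES`, and the hand-written `SAMPLE_BIOC` string are hypothetical and are not part of this repository or of PubTator.

```python
import re
import xml.etree.ElementTree as ET

# Entity types that may take part in an interaction (from the issue text).
TARGET_TYPES = {"Gene", "Chemical"}

# Hand-written sample in the BioC layout assumed here; not real PubTator output.
SAMPLE_BIOC = """<collection>
  <document>
    <id>demo</id>
    <passage>
      <infon key="type">paragraph</infon>
      <offset>100</offset>
      <text>TP53 binds MDM2 in cells. Cisplatin was added later.</text>
      <annotation id="1">
        <infon key="type">Gene</infon>
        <location offset="100" length="4"/>
        <text>TP53</text>
      </annotation>
      <annotation id="2">
        <infon key="type">Gene</infon>
        <location offset="111" length="4"/>
        <text>MDM2</text>
      </annotation>
      <annotation id="3">
        <infon key="type">Chemical</infon>
        <location offset="126" length="9"/>
        <text>Cisplatin</text>
      </annotation>
    </passage>
  </document>
</collection>"""


def split_sentences(text):
    """Naive splitter: yield (start, end, sentence) with passage-relative offsets.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    Abbreviations such as 'e.g.' will be split incorrectly.
    """
    for m in re.finditer(r"\S.*?(?:[.!?](?=\s|$)|$)", text, flags=re.S):
        yield m.start(), m.end(), m.group()


def candidate_sentences(bioc_xml, min_entities=2):
    """Return sentences holding at least `min_entities` Gene/Chemical mentions."""
    root = ET.fromstring(bioc_xml)
    hits = []
    for passage in root.iter("passage"):
        text = passage.findtext("text") or ""
        base = int(passage.findtext("offset") or 0)

        # Collect target mentions, shifting offsets to be passage-relative.
        entities = []
        for ann in passage.findall("annotation"):
            ann_type = ann.findtext("infon[@key='type']")
            loc = ann.find("location")
            if ann_type in TARGET_TYPES and loc is not None:
                start = int(loc.get("offset")) - base
                end = start + int(loc.get("length"))
                entities.append((start, end, ann_type, ann.findtext("text")))

        # Keep only sentences that fully contain enough mentions.
        for s_start, s_end, sentence in split_sentences(text):
            inside = [e for e in entities if s_start <= e[0] and e[1] <= s_end]
            if len(inside) >= min_entities:
                hits.append({
                    "sentence": sentence,
                    "entities": [(e[3], e[2]) for e in inside],
                })
    return hits


if __name__ == "__main__":
    for hit in candidate_sentences(SAMPLE_BIOC):
        print(hit["sentence"], hit["entities"])
```

The sketch counts mentions rather than distinct identifiers, so a sentence naming the same gene twice would pass; whether that is desirable is an open design choice. For the benchmarking step, one option is to wrap the downstream extraction call in `time.perf_counter()` and compare a run over all sentences with a run over only the sentences this filter returns.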