monarch-initiative / ontogpt

LLM-based ontological extraction tools, including SPIRES
https://monarch-initiative.github.io/ontogpt/
BSD 3-Clause "New" or "Revised" License

Fixing failure to parse markdown, JSON formatting, and numbered lists #394

Closed caufieldjh closed 3 months ago

caufieldjh commented 3 months ago

One remaining edge case: when a line ends in a delimiter but has no further entry after it, like so

organisms: Herpes Simplex Virus I (HSV-1);

they get lost.
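One way to handle the trailing-delimiter case can be sketched as follows. This is only an illustration, not OntoGPT's actual parser: `parse_field_line` is a hypothetical helper, and the choice of `;` as the delimiter and `:` as the field separator is assumed from the example line above.

```python
def parse_field_line(line: str, delimiter: str = ";") -> tuple[str, list[str]]:
    """Split a 'field: value; value' line into a field name and its values.

    Hypothetical helper for illustration; not part of OntoGPT.
    """
    # Split on the first colon only, so colons inside values are preserved.
    field, _, raw_values = line.partition(":")
    # Strip whitespace and drop empty pieces, so a trailing delimiter
    # yields no empty entry and the value before it is kept.
    values = [v.strip() for v in raw_values.split(delimiter) if v.strip()]
    return field.strip(), values
```

With this approach, the example line would keep its single value rather than losing it, since the empty piece after the final `;` is simply filtered out.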

caufieldjh commented 3 months ago

This has a fix for #133 too. Will need to check that it works as expected.

caufieldjh commented 3 months ago

Last remaining edge case: parsing numeric lists followed by other numeric lists.
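As a rough sketch of how back-to-back numbered lists might be told apart, the snippet below starts a new list whenever the item number fails to increase. This is an assumption about one possible approach, not the fix in this PR: `split_numbered_lists` and the `ITEM` pattern are hypothetical names, and items are assumed to look like `1. text` or `1) text`.

```python
import re

# Assumed item format: optional indent, digits, '.' or ')', then the item text.
ITEM = re.compile(r"^\s*(\d+)[.)]\s+(.*)$")


def split_numbered_lists(text: str) -> list[list[str]]:
    """Group numbered items into separate lists (hypothetical helper).

    A new list begins at the first item and whenever an item's number
    is not greater than the previous item's number (e.g. 3 followed by 1).
    Lines that are not numbered items are skipped.
    """
    lists: list[list[str]] = []
    previous = None
    for line in text.splitlines():
        match = ITEM.match(line)
        if match is None:
            continue  # skip lines that are not numbered items
        number = int(match.group(1))
        if previous is None or number <= previous:
            lists.append([])  # numbering restarted: open a new list
        lists[-1].append(match.group(2).strip())
        previous = number
    return lists
```

A restart in numbering is only a heuristic for a list boundary; output where the second list continues the numbering of the first would need a different signal, such as an intervening field name.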