shababo opened 3 days ago
did a bunch of experimentation and learned more about the components of RAG pipelines. the new plan is to parse the PDF in three ways and then transform those nodes/documents into some kind of composed index/store. have some ideas on the latter, but i feel like i have a good handle on the initial ingestion pipelines:
- ingest the PDF into docs/nodes
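a minimal sketch of the composed-ingestion idea, using plain dataclasses as stand-ins for the framework's real Document/Node types; the three parser functions here are hypothetical stubs for the three parsing branches, not actual implementations:

```python
from dataclasses import dataclass, field

# Stand-in for a parsed node; the real pipeline would use the framework's
# Document/Node types (e.g. from llama-index).
@dataclass
class Node:
    text: str
    metadata: dict = field(default_factory=dict)

# Hypothetical parser stubs for the three ingestion branches.
def parse_text_layer(pdf_path):
    return [Node("raw text chunk", {"source": pdf_path, "parser": "text"})]

def parse_figures(pdf_path):
    return [Node("figure + caption", {"source": pdf_path, "parser": "figure"})]

def parse_layout(pdf_path):
    return [Node("layout-aware chunk", {"source": pdf_path, "parser": "layout"})]

def ingest(pdf_path):
    # run all three parsers and tag each node with its originating branch,
    # so a composed index/store can route queries by parser type later
    nodes = []
    for parser in (parse_text_layer, parse_figures, parse_layout):
        nodes.extend(parser(pdf_path))
    return nodes

nodes = ingest("paper.pdf")
print([n.metadata["parser"] for n in nodes])  # ['text', 'figure', 'layout']
```

the key point is just that each node keeps metadata about which parser produced it, so the downstream composed index can treat the branches differently.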
so far we've only looked at a PNG of the figure and caption to extract data. in the main example we've been testing with, it does not properly estimate the wavelength values that were used. however, they are explicitly enumerated in the methods section.
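to illustrate why the methods-section text helps here: when the values are explicitly enumerated, even a simple pattern match over the parsed text can recover them. the `methods_text` below is a made-up stand-in, not from the actual paper:

```python
import re

# hypothetical snippet of a methods section; the real text would come from
# the PDF text-parsing branch of the pipeline
methods_text = (
    "Stimulation was delivered at wavelengths of 473 nm, 561 nm, and 635 nm "
    "using a fiber-coupled laser."
)

# pull explicitly enumerated wavelength values (e.g. "473 nm") out of the text
wavelengths_nm = [int(m) for m in re.findall(r"(\d{3})\s*nm", methods_text)]
print(wavelengths_nm)  # [473, 561, 635]
```

this is obviously much more reliable than asking a vision model to estimate the values from a figure PNG, which is the motivation for adding the text-parsing branch.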
we already implemented a `MultimodalDocumentHandler`, but when I just tested parsing the PDF on llama cloud, it did not do a great job. for now, we'll keep working primarily in notebooks. once this part is up, we should be able to create some classes and an API which should facilitate further testing.