mrm1001 opened 5 months ago
I think in addition to the proposed points (which mostly focus on retrieval), it would also be good to add tasks that focus on answer generation with LLMs. For example,
Also for the retrieval side of things:
@sjrl In my experience, basic arithmetic has been a challenge for all LLMs I've worked with. Even though some LLMs, like code models, might perform slightly better at these tasks, their accuracy remains inconsistent, making them unreliable for production use.
For indexing PDFs, I suggest developing an agent that processes the document chunk by chunk in an interactive manner, extracting specific facts such as a company's net profit. This agent could add references to tables, images, and other relevant elements, which we could then use to enrich the metadata of those elements. By indexing all chunks in a document store and keeping tables and images in separate storage, we could preserve context during retrieval: when a table needs to be accessed, the retrieval step would pull up a passage that references it. Additionally, filtering the rows and columns of tables before passing them to the LLM might help avoid confusing the model (see the sketch below).
I understand this needs testing, but I am confident in its potential. I'll start working on this after my exams.
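To make the idea concrete, here is a minimal sketch in plain Python of the layout described above. It deliberately avoids any Haystack APIs, and all names (`Table`, `Chunk`, `TableStore`, `filter_columns`, `build_context`) are hypothetical: text chunks carry metadata pointing at tables stored separately, and retrieval assembles the passage together with the column-filtered tables it references. The same reference mechanism could later be extended to images and other extracted elements.

```python
# Sketch only: plain-Python stand-ins for the chunk/table layout, no Haystack APIs assumed.
from dataclasses import dataclass, field


@dataclass
class Table:
    table_id: str
    columns: list
    rows: list  # list of dicts keyed by column name


@dataclass
class Chunk:
    chunk_id: str
    content: str
    meta: dict = field(default_factory=dict)  # e.g. {"table_refs": ["tbl_3"]}


class TableStore:
    """Separate storage for tables so they keep their structure."""

    def __init__(self):
        self._tables = {}

    def add(self, table: Table):
        self._tables[table.table_id] = table

    def get(self, table_id: str) -> Table:
        return self._tables[table_id]


def filter_columns(table: Table, keep: list) -> Table:
    """Drop columns the LLM does not need, to cut tokens and avoid confusion."""
    kept = [c for c in table.columns if c in keep]
    rows = [{c: row[c] for c in kept} for row in table.rows]
    return Table(table_id=table.table_id, columns=kept, rows=rows)


def build_context(chunk: Chunk, table_store: TableStore, keep_columns: list) -> str:
    """Assemble the prompt context: the passage plus any tables it references."""
    parts = [chunk.content]
    for table_id in chunk.meta.get("table_refs", []):
        table = filter_columns(table_store.get(table_id), keep_columns)
        header = " | ".join(table.columns)
        body = "\n".join(
            " | ".join(str(row[c]) for c in table.columns) for row in table.rows
        )
        parts.append(f"[table {table.table_id}]\n{header}\n{body}")
    return "\n\n".join(parts)


# Example: a retrieved passage that references a financials table.
tables = TableStore()
tables.add(Table("tbl_3", ["year", "net_profit", "revenue"],
                 [{"year": 2022, "net_profit": 1.2, "revenue": 9.8},
                  {"year": 2023, "net_profit": 1.5, "revenue": 11.1}]))
chunk = Chunk("c17", "Net profit figures are reported in the table below.",
              {"table_refs": ["tbl_3"]})
print(build_context(chunk, tables, keep_columns=["year", "net_profit"]))
```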
@mrm1001 If you want to add state-of-the-art tabular QA to Haystack, we should start reading papers on the topic and discuss how to implement the proposed features in the library. I have found this approach pretty useful in my previous projects. I will start listing some papers to read and try to put together some guidelines for studying the matter, perhaps as a reading group; if someone is willing to help me, I would be delighted to collaborate.
If you have any other proposals, I am open to collaborating on this however you suggest.
These are some techniques to assess when deciding whether they are worth pursuing, but it's up to the assignee to experiment with different techniques and find what works on the selected dataset.