Add a simple Question Answering notebook with Haystack

redhat-et / foundation-models-for-documentation

Improve ROSA customer experience (and customer retention) by leveraging foundation models to do “gpt-chat” style search of Red Hat customer documentation assets.

Other

26 stars 12 forks

Add a simple Question Answering notebook with Haystack #12

Closed codificat closed 1 year ago

codificat commented 1 year ago

In #9 we are exploring various QA systems.

This PR adds a simple experiment with extractive and generative QA using Haystack.
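For readers new to the distinction, here is a minimal sketch of the retrieve-then-read pattern behind extractive QA. It is plain Python and is not Haystack's API or the notebook's code: `retrieve` and `extract_answer` are hypothetical names, and simple term overlap stands in for a real retriever and reader model. A generative setup would keep the retrieval step but replace `extract_answer` with a model that composes new text from the retrieved passages.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, top_k=2):
    """Rank documents by how often they contain the query terms.

    Toy stand-in for a real retriever (e.g. BM25 or dense embeddings).
    Documents that share no term with the query are dropped.
    """
    query_terms = set(tokenize(query))
    scored = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        score = sum(counts[term] for term in query_terms)
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def extract_answer(query, documents):
    """Return the sentence that shares the most distinct terms with the query.

    Toy stand-in for an extractive reader: the answer is always a span
    copied verbatim from the context, never newly generated text.
    """
    query_terms = set(tokenize(query))
    best_sentence, best_score = None, 0
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            score = len(query_terms & set(tokenize(sentence)))
            if score > best_score:
                best_sentence, best_score = sentence, score
    return best_sentence


if __name__ == "__main__":
    # Illustrative documents, not taken from the ROSA dataset.
    docs = [
        "ROSA is a managed OpenShift service on AWS. "
        "Clusters are created with the rosa CLI.",
        "Haystack is a framework for building question answering pipelines.",
    ]
    question = "How are ROSA clusters created?"
    print(extract_answer(question, retrieve(question, docs)))
```

The two-stage split is the point of the sketch: the retriever narrows the corpus cheaply, and the more expensive reader (or generator) only sees the top-ranked passages.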

review-notebook-app[bot] commented 1 year ago

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

codificat commented 1 year ago

NOTE: this PR also includes the sample dataset from #11 (same commit) in order to have data to work on.

codificat commented 1 year ago

Converted this PR to a draft while I work on expanding it with a generative QA approach.

codificat commented 1 year ago

Updated with the current version that adds 3 generative QA types: RAG, LFQA and OpenAI-based.

Context now includes the full ROSA docs (plus the ROSA workshop and the MOBB material in the data/external samples).

Results are not great. I'm still trying to see whether they can be improved a bit, and I also need to elaborate on and document the notebook.

codificat commented 1 year ago

Ok, I believe this is ready for another review.

The RAG version is not working well, for a reason that has so far escaped me. @Shreyanand @suppathak, if you have suggestions, especially on that part, they would be most welcome.

I have added the retrieval of the whole ROSA docs from S3 storage, and these docs together with the in-repo samples (ROSA workshop and MOBB) are used for context.
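As a rough illustration of combining several document sources into one context, the sketch below walks local directories of Markdown files and tags each document with the source it came from. It assumes the S3 docs have already been downloaded to a local directory. The function names and the `source`/`name` metadata fields are hypothetical rather than taken from the notebook. The `content`/`meta` dictionary shape is meant to mirror what Haystack 1.x document stores accept, which is worth verifying against the version in use.

```python
from pathlib import Path


def load_markdown_docs(root, source):
    """Read every non-empty .md file under `root` into a document dict.

    Each dict carries the file text plus metadata recording which source
    the file came from and its path relative to `root`.
    """
    docs = []
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        if not text.strip():
            continue  # skip empty files, they add nothing to the context
        docs.append(
            {
                "content": text,
                "meta": {
                    "source": source,
                    "name": path.relative_to(root).as_posix(),
                },
            }
        )
    return docs


def build_context(sources):
    """Merge documents from several directories into one list.

    `sources` maps a label (stored in the metadata) to a directory path.
    """
    docs = []
    for label, root in sources.items():
        docs.extend(load_markdown_docs(root, label))
    return docs
```

Keeping the source label in the metadata makes it possible to see later which collection an answer was drawn from, for example the full ROSA docs versus the workshop samples.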

There are now more comments/docs, and the structure has also been updated, hopefully making it easier to follow.

codificat commented 1 year ago

Another update:

I have now removed the RAG generator test: it does not work, and deepset plans to remove the RAG generator tutorial.
I found a problem with the Markdown pre-processor, which I flag as "fixmes" in the notebook. I am inclined NOT to try to fix these in this notebook, though: the purpose of this PR is to review Haystack as a framework, and if there are issues with the Markdown pre-processor they should be mentioned, right?