The RAG agent should be able to both synthesize an answer and provide its references (the metadata / context from the vectors used to generate the answer). We should also have a PDF viewer that jumps to the page each cited vector came from and highlights the passage / chunk that vector was built from.
This is a complex feature.
The implementation steps would look as follows:
Use layoutparser instead of a text splitter to generate chunks for the embeddings. Instead of doing a simple fixed-length character chunk, we would use this deep-learning layout-detection library to extract the text elements from each page and also pull out a bounding box and page number for each element we are chunking. Store these in the embedding metadata.
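A minimal sketch of the metadata record this step would produce. The `Element` dataclass here is a stand-in for whatever layoutparser detection returns (its `TextBlock.coordinates` property exposes the bounding box as `(x_1, y_1, x_2, y_2)`); the field names `doc_id`, `page`, and `bbox` are assumptions about our own schema, not anything the library dictates:

```python
from dataclasses import dataclass

@dataclass
class Element:
    """Stand-in for one detected layout element (real pipeline: a
    layoutparser TextBlock plus the page it was detected on)."""
    text: str
    page: int                      # 1-based page number
    bbox: tuple                    # (x1, y1, x2, y2) in page coordinates

def to_embedding_record(doc_id: str, el: Element) -> dict:
    """Build the chunk + metadata stored next to each embedding, so the
    viewer can later jump to the page and draw the highlight box."""
    return {
        "text": el.text,           # the chunk that gets embedded
        "metadata": {
            "doc_id": doc_id,      # which uploaded PDF this came from
            "page": el.page,
            "bbox": list(el.bbox), # bounding box used for highlighting
        },
    }

record = to_embedding_record(
    "report.pdf",
    Element("Revenue grew 12% year over year.", 3, (72.0, 140.5, 520.0, 188.0)),
)
```

The point is that every vector carries enough metadata to reconstruct exactly where on the page its text lives, which is what makes the highlighting step possible at all.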
Use a React PDF viewer library to display the PDF (served from the uploads directory on the server) in a right-side panel next to the chat UI whenever the RAG agent is invoked and one of its references is clicked.
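On the server side, resolving a clicked reference into what the viewer panel needs is a small translation of the stored metadata. A sketch, assuming the metadata schema from the chunking step and an `uploads/` directory; `reference_payload` and the `pdf_url` / `highlight` field names are hypothetical, chosen here for illustration:

```python
import os

UPLOADS_DIR = "uploads"  # assumed location of uploaded PDFs on the server

def reference_payload(ref_metadata: dict) -> dict:
    """Turn the metadata attached to a cited vector into what the
    right-hand PDF panel needs: which file to load, which page to jump
    to, and which box to highlight."""
    return {
        "pdf_url": os.path.join(UPLOADS_DIR, ref_metadata["doc_id"]),
        "page": ref_metadata["page"],
        "highlight": ref_metadata["bbox"],
    }

# Example: metadata as stored at chunking time for one cited vector.
meta = {"doc_id": "report.pdf", "page": 3, "bbox": [72.0, 140.5, 520.0, 188.0]}
payload = reference_payload(meta)
```

The React viewer component would then open `pdf_url`, scroll to `page`, and draw an overlay rectangle at `highlight` scaled to the rendered page size.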