The RAG performance of this application can be improved by using more advanced techniques for retrieval and synthesis.
Currently, the app uses the SentenceSplitter node parser to split text into sentence-based chunks, and it uses the same chunks for both context retrieval and response synthesis when querying the LLM. This is not optimal: smaller chunks help embedding-based retrieval find more relevant context, while larger chunks give the LLM more surrounding text from which to synthesize a better response.
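To make the distinction concrete, here is a minimal, dependency-free sketch of retrieving on a small chunk while passing a larger chunk to synthesis. It is not the app's code and does not use the LlamaIndex API: the function names (`split_sentences`, `retrieve_with_window`) and the `window` parameter are hypothetical, the sentence splitter is a naive regex, and word-overlap cosine similarity stands in for real embedding similarity.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(text):
    # Toy stand-in for an embedding: lowercase token counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two token-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_with_window(text, query, window=1):
    """Match the query against single sentences (small chunks), then
    return the best sentence plus `window` neighbours on each side
    (a larger chunk) as the context for synthesis."""
    sentences = split_sentences(text)
    if not sentences:
        return "", ""
    q = bag_of_words(query)
    best = max(
        range(len(sentences)),
        key=lambda i: cosine(bag_of_words(sentences[i]), q),
    )
    lo = max(0, best - window)
    hi = min(len(sentences), best + window + 1)
    return sentences[best], " ".join(sentences[lo:hi])
```

Here the first return value is what retrieval matched on, and the second is the wider passage that would be placed in the LLM prompt. A real implementation would replace `bag_of_words` and `cosine` with the app's embedding model and vector store.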
Using a smaller chunk size for retrieval and a larger chunk size for synthesis can therefore improve RAG performance. Two such techniques are: