I am trying to build a chatbot based on FAQ documentation. It uses a text file containing a list of question-answer pairs.
However, the default chunking strategy sometimes splits chunks in the middle of an answer, or between a question and its answer.
This seriously undermines the quality of the answers. Is there any way to customise the chunking strategy so that each question AND its answer appear together, in full, in the same chunk?
What comes to mind is some special character or token that marks a chunk boundary, like endoftext or something similar.
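
To illustrate the kind of delimiter-based splitting I have in mind, here is a minimal sketch in plain Python, not tied to any particular library. The delimiter string `<|endoftext|>`, the function name `split_faq`, and the sample FAQ text are all placeholders I made up for the example.

```python
# Hypothetical delimiter; any string that never occurs in the FAQ text would do.
CHUNK_DELIMITER = "<|endoftext|>"


def split_faq(text: str, delimiter: str = CHUNK_DELIMITER) -> list[str]:
    """Split FAQ text into one chunk per question-answer pair."""
    # Split only on the explicit delimiter, never on length,
    # so a question and its answer always stay in the same chunk.
    pieces = text.split(delimiter)
    # Trim surrounding whitespace and skip empty pieces
    # (for example, the one left after a trailing delimiter).
    return [piece.strip() for piece in pieces if piece.strip()]


# Made-up sample data, only for illustration.
faq_text = """Q: How do I reset my password?
A: Open Settings, choose Security, then click Reset password.
<|endoftext|>
Q: Can I change my email address?
A: Yes. Go to Profile and edit the Email field.
<|endoftext|>
"""

chunks = split_faq(faq_text)
for chunk in chunks:
    print(chunk)
    print("---")
```

Each chunk would then hold exactly one question with its full answer. I assume a very long answer could still exceed whatever input limit the embedding step has, so that case would need separate handling.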