Closed justanotherlad closed 1 year ago
Hello! It isn't clear from this part of the question alone which chunking strategy is used in this example. Different chunking strategies can yield similar results, depending on how they are configured. You can find more details here: https://www.pinecone.io/learn/chunking-strategies/.
Moving on to the next part of the question: with the context-overlapping method, each chunk retains a portion of the context from the previous and subsequent chunks. Ideally, this keeps the answer within a single chunk, though that is not guaranteed when an answer is longer than the overlap. You can adjust the length of these overlapping segments, and increase the chunk size if you expect answers with a larger amount of text.
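To make the overlap idea concrete, here is a minimal sketch of a sliding-window chunker. It is only an illustration and not the code chatgpt-retrieval-plugin actually runs: the function name chunk_with_overlap and its parameters chunk_size and overlap are made up for this example, and it splits on whitespace-separated words rather than on model tokens.

```python
def chunk_with_overlap(tokens, chunk_size, overlap):
    """Split a list of tokens into windows of up to `chunk_size` items.

    Consecutive windows share `overlap` items, so text that straddles a
    boundary in one window appears whole in a neighbouring one as long as
    it is short enough. Hypothetical helper, for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the input, so we do not
        # emit a trailing chunk that is fully contained in the previous one.
        if start + chunk_size >= len(tokens):
            break
    return chunks


words = "the quick brown fox jumps over the lazy dog again".split()
chunks = chunk_with_overlap(words, chunk_size=4, overlap=2)
for chunk in chunks:
    print(" ".join(chunk))
```

In this example the phrase "fox jumps" is cut by the boundary of the first window, but the second window contains it in full. A larger overlap gives that protection to longer answers, at the cost of storing and embedding more duplicated text.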
I have /upsert'ed a document (JSON) using chatgpt-retrieval-plugin that looks like the following:

However, when I /query, it returns something like this:

Note: Here, the first part of the "text" field is missing. What sort of chunking mechanism is being used here? Is it the RecursiveCharacterTextSplitter explained here?

Also, if so, what sort of context-overlapping method does it use if one part of the answer to the query lies in the first chunk and the second part lies in the next chunk?