Open Kunlun-Zhu opened 6 months ago
@ignorejjj check this
The retriever will retrieve similar items (including ID and text) from the document corpus. As I understand it, document chunking is employed during corpus construction and does not need to be returned by the retriever.
For the generator, we did not implement black-box models initially because of their limitations (they do not return logits, and calling them incurs API costs). To ensure completeness, we plan to implement mainstream API-based models, such as ChatGPT, within the next few weeks.
If I have misunderstood anything, please feel free to make suggestions!
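To make the split described above concrete, here is a minimal sketch in Python. It is not the project's actual code: the `Doc` class, `build_corpus`, `retrieve`, the fixed-size word chunking, and the word-overlap scoring are all hypothetical stand-ins chosen for illustration. It only shows the idea that chunking happens once when the corpus is built, while the retriever returns just the ID and text of the matching chunks.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    # Hypothetical record type: one chunk of the corpus.
    id: str
    contents: str


def build_corpus(raw_docs, chunk_size=50):
    """Split each raw document into fixed-size word chunks at corpus-build time."""
    corpus = []
    for doc_id, text in raw_docs.items():
        words = text.split()
        for start in range(0, len(words), chunk_size):
            chunk = " ".join(words[start:start + chunk_size])
            # The chunk index is encoded in the ID; no chunking metadata is stored.
            corpus.append(Doc(id=f"{doc_id}_{start // chunk_size}", contents=chunk))
    return corpus


def retrieve(query, corpus, topk=2):
    """Return the topk chunks (ID and text only), ranked by word overlap with the query."""
    query_words = set(query.lower().split())

    def overlap(doc):
        return len(query_words & set(doc.contents.lower().split()))

    # sorted() is stable, so ties keep their corpus order.
    return sorted(corpus, key=overlap, reverse=True)[:topk]
```

In this sketch the caller never sees how a document was chunked; changing the chunking strategy means rebuilding the corpus, not changing the retriever.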
Thanks for the reply, looking forward to new updates.
I hope I can use this to chunk my HTML, haha.
To the best of my understanding, the retriever only returns the doc ID, without the chunking method used for each document.
I would also suggest adding API support for ChatGPT, Gemini, Claude, etc. in the generator.