Closed (iknoorjobs closed 11 months ago)
Hi @iknoorjobs, you might have duplicate ids in your set of documents. Since the output of encode_documents is a dict mapping document_id to embedding, it drops duplicates: documents that share an id collapse into a single entry.
You should avoid duplicate ids in your documents and in your queries.
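To check whether this is what is happening, here is a minimal sketch in plain Python. It does not call the library. The `id` and `text` field names and the sample documents are assumptions for illustration, and the dict comprehension only stands in for an output keyed by document id.

```python
from collections import Counter

# Hypothetical documents; the "id" and "text" field names are assumptions.
documents = [
    {"id": "doc-1", "text": "first document"},
    {"id": "doc-2", "text": "second document"},
    {"id": "doc-1", "text": "another document reusing doc-1"},
]

# Stand-in for an output keyed by document id (not the real encoder):
# a later entry with the same id overwrites the earlier one.
embeddings = {doc["id"]: len(doc["text"]) for doc in documents}

# Collect every id that appears more than once in the input.
id_counts = Counter(doc["id"] for doc in documents)
duplicate_ids = [doc_id for doc_id, count in id_counts.items() if count > 1]

print(f"{len(documents)} documents, {len(embeddings)} entries")
print("duplicate ids:", duplicate_ids)
```

If `duplicate_ids` is non-empty on your dataset, giving each document a unique id should make the number of embeddings match the number of inputs.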
Hi,
Thanks for the great work on the repo.
While attempting to encode documents from my custom dataset, I've encountered a discrepancy between the number of input docs and the number of embeddings produced. For example:
Do you have any insight into what might be causing this issue?
Thanks