Closed davidsbatista closed 1 month ago
This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Totals | |
---|---|
Change from base Build 10592675152: | 0.2% |
Covered Lines: | 7021 |
Relevant Lines: | 7770 |
Since this change is related to how Pinecone handles metadata, I think it would be simpler and more appropriate to intervene on the Pinecone side.
WDYT?
It's a good suggestion, but we can't just blindly convert everything back to integers.
To make this generic, we need to know the original type of each value, i.e. before it was stored in Pinecone, and I don't know where to store that information.
Yes, I understand.
I'm simply suggesting that we move the `_convert_to_int` method to Pinecone with the same keys used here.
Unrelated: I have the impression that having our `DocumentSplitter` create a `page_number` meta field risks overriding user-provided information.
Ah OK, now I understand what you suggested. The only catch is that the tests would need to import Pinecone, so we would have to add it to the CI.
No, I suggest modifying the Pinecone Document Store in core-integrations.
moving to core-integrations
Proposed Changes:
Pinecone converts all numbers in the meta field to float (https://docs.pinecone.io/guides/data/filter-with-metadata). This causes the `SentenceWindowRetriever` to crash completely.
This PR checks whether the metadata values are floats and converts them back to integers, making Pinecone supported by the `SentenceWindowRetriever`.
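A minimal sketch of the conversion described above, for illustration only and not the PR's actual code. The function name `convert_meta_to_int` and the key list are hypothetical: the thread names only `page_number`, so `split_id` and `split_idx_start` are assumptions about which fields the retriever needs as integers. Restricting the conversion to a fixed set of keys reflects the concern raised in the discussion that not every float can safely be turned back into an integer.

```python
from typing import Any, Dict, Iterable

# Hypothetical key list: only page_number is named in the discussion;
# split_id and split_idx_start are assumed for illustration.
INT_META_KEYS = ("page_number", "split_id", "split_idx_start")


def convert_meta_to_int(
    meta: Dict[str, Any], keys: Iterable[str] = INT_META_KEYS
) -> Dict[str, Any]:
    """Return a copy of meta where whole-number floats under the given keys become ints."""
    converted = dict(meta)  # shallow copy, the input dict is left untouched
    for key in keys:
        value = converted.get(key)
        # Convert only genuine floats with no fractional part (e.g. 3.0, not 2.5).
        if isinstance(value, float) and value.is_integer():
            converted[key] = int(value)
    return converted
```

Keys outside the list, such as a user-provided `rating` of `4.0`, are left as floats, since their original type cannot be recovered from what Pinecone returns.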
How did you test it?
Checklist
`fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`.