langchain-ai / langchain-redis

MIT License

similarity_search results missing 'id' value... #27

Open hschmied opened 1 month ago

hschmied commented 1 month ago

Quick question: I noticed that similarity_search (vectorstores.py) returns Document objects without the 'id' being set. Is this intentional or an oversight?

For testing purposes, I patched it locally for my use case like this:

return [
    Document(
        id=result[self.config.id_field],  # added: set the id from the search result
        page_content=doc[self.config.content_field],
        metadata={
            k: v
            for k, v in doc.items()
            if k != self.config.content_field
        },
    )
    for doc, result in zip(full_docs, results)  # pair each doc with its result to get the id
    if doc is not None  # Handle potential missing documents
]

There are a couple of other places with similar behavior (...with_score, ...with_vectors, ...storage_type). I didn't want to rush ahead, change things, and send a PR, because maybe there's a good reason for it.
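For anyone who wants to try the idea outside the library, here is a minimal, self-contained sketch of the same pattern. It is not the langchain-redis code: the Document dataclass is a stand-in for the langchain core class, build_documents is a hypothetical helper, and the field names "id" and "text" are assumed defaults standing in for self.config.id_field and self.config.content_field.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Document:
    """Stand-in for the langchain core Document (assumption: only these fields matter here)."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    id: Optional[str] = None


def build_documents(
    full_docs: List[Optional[Dict[str, Any]]],
    results: List[Dict[str, Any]],
    id_field: str = "id",  # hypothetical default, mirrors config.id_field
    content_field: str = "text",  # hypothetical default, mirrors config.content_field
) -> List[Document]:
    """Pair each fetched doc with its search result so the id can be carried over."""
    documents = []
    for doc, result in zip(full_docs, results):
        if doc is None:  # skip documents that could not be fetched
            continue
        documents.append(
            Document(
                id=result[id_field],
                page_content=doc[content_field],
                metadata={k: v for k, v in doc.items() if k != content_field},
            )
        )
    return documents


# Made-up sample data: the second document is missing.
full_docs = [{"text": "hello", "source": "a"}, None, {"text": "world", "source": "b"}]
results = [{"id": "doc:1"}, {"id": "doc:2"}, {"id": "doc:3"}]
docs = build_documents(full_docs, results)
```

If the sibling methods build their return values the same way, a shared helper along these lines might cover them too, but that is a guess about the internals rather than something confirmed here.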

Thanks!

rbs333 commented 2 weeks ago

@hschmied Hey, thanks for the feedback, we're looking into it! cc: @bsbodden

bsbodden commented 2 weeks ago

Looks like it was added as optional in langchain core:

# The ID field is optional at the moment.
# It will likely become required in a future major release after
# it has been adopted by enough vectorstore implementations.
id: Optional[str] = None
"""An optional identifier for the document.
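Since the field is Optional[str], callers cannot assume it is populated. A minimal sketch of defensive handling on the consuming side follows. The helper document_key and the content-hash fallback are hypothetical, not part of langchain core or langchain-redis.

```python
import hashlib
from typing import Optional


def document_key(doc_id: Optional[str], page_content: str) -> str:
    """Return the document id if set, else a content-hash fallback (hypothetical scheme)."""
    if doc_id is not None:
        return doc_id
    # Fallback: derive a stable key from the content when no id was provided.
    digest = hashlib.sha256(page_content.encode("utf-8")).hexdigest()
    return "sha256:" + digest


key_with_id = document_key("doc:1", "hello")
key_without_id = document_key(None, "hello")
```

A fallback like this only papers over the gap; having the vector store set the id would make it unnecessary.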