langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

[Community] : Added SentenceWindowRetriever #20981

Closed: rsk2327 closed this 1 week ago

rsk2327 commented 2 weeks ago


Description: Updated TextSplitter to accept a new add_chunk_id argument, which adds a chunk_id field to each document's metadata.

This is a prerequisite for implementing the Sentence Window Retriever across all databases, because it makes it easy to identify neighboring chunks of text that come from the same source.
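To illustrate the idea, here is a minimal, self-contained sketch in plain Python. It does not use LangChain and is not the code in this PR. Only the names add_chunk_id and chunk_id come from the description above. The Chunk class, split_text, sentence_window, the fixed-size splitting, and the "source" metadata key are hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a document chunk with metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


def split_text(text, source, chunk_size=20, add_chunk_id=False):
    """Split text into fixed-size chunks; optionally tag each with a chunk_id."""
    chunks = []
    for i, start in enumerate(range(0, len(text), chunk_size)):
        metadata = {"source": source}
        if add_chunk_id:
            # Sequential position of the chunk within its source text.
            metadata["chunk_id"] = i
        chunks.append(Chunk(text[start:start + chunk_size], metadata))
    return chunks


def sentence_window(chunks, hit, window=1):
    """Return the hit plus up to `window` neighbors on each side, same source only."""
    src, cid = hit.metadata["source"], hit.metadata["chunk_id"]
    neighbors = [
        c for c in chunks
        if c.metadata["source"] == src
        and abs(c.metadata["chunk_id"] - cid) <= window
    ]
    return sorted(neighbors, key=lambda c: c.metadata["chunk_id"])


# Four chunks of 20 characters each, tagged with chunk_id 0..3.
chunks = split_text("a" * 20 + "b" * 20 + "c" * 20 + "d" * 20, "doc1", add_chunk_id=True)

# Pretend chunk 2 was the retrieval hit and expand it by one chunk on each side.
window = sentence_window(chunks, chunks[2], window=1)
window_ids = [c.metadata["chunk_id"] for c in window]
```

Because the neighbor lookup relies only on metadata (source plus chunk_id), the same approach should carry over to any store that supports metadata filtering, which is the motivation stated above.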

If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.

rsk2327 commented 2 weeks ago

@eyurtsev @baskaryan This metadata variable is a prerequisite for setting up the new implementation of the Sentence Window Retrieval method.

rsk2327 commented 1 week ago

@hwchase17 @efriis Can I get a review on this?