langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
84.75k stars 13.09k forks

community: DuckDB VS - expose similarity, improve performance of from_texts #20971

Open jaceksan opened 2 weeks ago

jaceksan commented 2 weeks ago

Three fixes to the DuckDB vector store:

Dependencies: added Pandas to speed up from_documents. I considered DuckDB's CSV and JSON loading options, but I expect trouble loading JSON values that way, and both options require writing the data to disk first. In any case, the poetry file for langchain-community already contains a dependency on Pandas.

jaceksan commented 1 week ago

@baskaryan @eyurtsev should I try to fix the failing test? To be honest, I don't know why it started to fail. The import in the notebook is identical to what I use in my code base, except that I import from langchain_community and the notebook imports from langchain. I tried changing the notebook to import from langchain_community, but that did not help.

Any suggestion would be appreciated.