crate-workbench / langchain

⚡ Building applications with LLMs through composability ⚡
https://python.langchain.com
MIT License

delete and delete_collection on CrateDBVectorSearch don't delete the actual embeddings #11

Closed andnig closed 10 months ago

andnig commented 10 months ago

System Info

Langchain 0.0.315

Who can help?

@amotl

Reproduction


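# Imports assume the cratedb branch of this fork, which provides CrateDBVectorSearch.
from langchain.embeddings import OpenAIEmbeddings
from langchain.schema import Document
from langchain.vectorstores import CrateDBVectorSearch

doc = Document(page_content="this is such a nice text")
vector_store = CrateDBVectorSearch.from_documents([doc], OpenAIEmbeddings(), collection_name="wow_such_nice", connection_string="crate://localhost:4200?schema=langchain")
vector_store.delete_collection()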

Expected behavior

The collection should be deleted as well as all embeddings which are part of this collection. While the collection gets deleted from the collection table, the embeddings of this collection in the embedding table are still there.

amotl commented 10 months ago

Hi @andnig,

thanks for your report. @ckurze recently reported GH-5, which sounds a bit similar to me, and I think we made a few improvements after that.

You are reporting that you observed the problem with LangChain 0.0.315. The current cratedb branch is on LangChain version 0.0.338. Can I ask you to try again, using the most recent version here?

With kind regards, Andreas.

amotl commented 10 months ago

Hi @andnig,

thanks for your report.

> While the collection gets deleted from the collection table, the embeddings of this collection in the embedding table are still there.

The cascading delete does not work because CrateDB doesn't know anything about foreign key relationships. The corresponding operation will probably need to be emulated. I will look into how this could be implemented.
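
A minimal sketch of what such an emulation could look like: delete the dependent embedding rows explicitly, then the collection row. This is not the adapter's actual code. It uses an in-memory SQLite database as a stand-in so it runs without a CrateDB server, and the table names, column names, and the helper `delete_collection_emulated` are hypothetical.

```python
import sqlite3


def delete_collection_emulated(conn, name):
    """Emulate ON DELETE CASCADE by hand: child rows first, then the parent row.

    Hypothetical helper; table and column names are assumptions for illustration.
    Returns the number of embedding rows that were removed.
    """
    cur = conn.cursor()
    row = cur.execute(
        "SELECT uuid FROM collection WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return 0  # nothing to delete
    collection_id = row[0]
    # Step 1: remove the embeddings that belong to the collection.
    cur.execute("DELETE FROM embedding WHERE collection_id = ?", (collection_id,))
    removed = cur.rowcount
    # Step 2: remove the collection record itself.
    cur.execute("DELETE FROM collection WHERE uuid = ?", (collection_id,))
    conn.commit()
    return removed


def demo():
    # SQLite stand-in: no foreign key is declared, mirroring the situation
    # where the database does not track the relationship between the tables.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE collection (uuid TEXT PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE embedding (id TEXT PRIMARY KEY, collection_id TEXT, document TEXT)"
    )
    conn.execute("INSERT INTO collection VALUES ('c1', 'wow_such_nice')")
    conn.execute(
        "INSERT INTO embedding VALUES ('e1', 'c1', 'this is such a nice text')"
    )
    removed = delete_collection_emulated(conn, "wow_such_nice")
    remaining = conn.execute("SELECT COUNT(*) FROM embedding").fetchone()[0]
    return removed, remaining
```

Deleting the child rows before the parent keeps the collection identifier available for the lookup until the embeddings are gone.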

With kind regards, Andreas.

hlcianfagna commented 10 months ago

Related to https://github.com/crate/crate/issues/1376

ckurze commented 10 months ago

Thanks for reporting - as soon as we slightly change the logic of how we handle tables and embeddings (https://github.com/crate-workbench/langchain/issues/12), this operation will translate into a DELETE FROM <collection_name> or DROP TABLE <collection_name>, respectively. We should prioritize https://github.com/crate-workbench/langchain/issues/12 over this issue.

amotl commented 10 months ago

Dear @andnig,

GH-14 fixed this problem. Thanks again for the report.

With kind regards, Andreas.