langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
94.72k stars 15.33k forks

Python kernel crashes when using Chroma's from_texts from langchain #26859

Open Kviilen opened 1 month ago

Kviilen commented 1 month ago

Example Code

from langchain_community.vectorstores import Chroma
embed_model_path = '.././AI-ModelScope/bge-small-en-v1___5'
from langchain_huggingface import HuggingFaceEmbeddings
embedding = HuggingFaceEmbeddings(model_name=embed_model_path)
texts = [
    "Test"
]
try:
    smalldb_chinese = Chroma.from_texts(texts, embedding=embedding)
except Exception as r:
    print('%s' %(r))

Error Message and Stack Trace (if applicable)

Process finished with exit code -1073741819 (0xC0000005)

Description

When I call Chroma's from_texts method from langchain, the Python kernel crashes without any error message. The process finishes with exit code -1073741819 (0xC0000005). The crash occurs consistently and makes the method unusable. I am using the latest version of Chroma from langchain and have tried different environments, but the problem persists. Any help or suggestions would be greatly appreciated.
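A note for anyone trying to reproduce this: exit code 0xC0000005 is the Windows access-violation status, which means the process dies in native code, so the try/except in the example code never gets a chance to run. A minimal sketch of one way to observe the crash without losing the calling session is to run the reproduction in a child interpreter with the standard-library faulthandler enabled, which may print the Python-level traceback at the point of the fault. The helper name run_isolated and the timeout value are hypothetical and not part of langchain or Chroma.

```python
import subprocess
import sys


def run_isolated(source: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run `source` in a fresh interpreter with faulthandler enabled.

    Hypothetical helper: a native crash in the child cannot take down the
    calling process, and the child's exit code and stderr stay available.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Harmless stand-in for the real reproduction script.
result = run_isolated("import sys; print('child ran'); sys.exit(3)")
print("exit code:", result.returncode)
print("stdout:", result.stdout.strip())
print("stderr:", result.stderr.strip())
```

To use it on this issue, pass the example code above as the source string; a crashing child should then report the access-violation exit code in result.returncode, with any faulthandler output in result.stderr.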

System Info

System Information

OS: Windows
OS Version: 10.0.22631
Python Version: 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 15:03:56) [MSC v.1929 64 bit (AMD64)]

Package Information

langchain_core: 0.3.5
langchain: 0.3.0
langchain_community: 0.3.0
langsmith: 0.1.125
langchain_experimental: 0.3.0
langchain_huggingface: 0.1.0
langchain_text_splitters: 0.3.0

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.9.5
async-timeout: Installed. No version info available.
dataclasses-json: 0.6.7
httpx: 0.27.0
huggingface-hub: 0.24.5
jsonpatch: 1.33
numpy: 1.26.4
orjson: 3.10.6
packaging: 23.2
pydantic: 2.8.2
pydantic-settings: 2.5.2
PyYAML: 6.0.1
requests: 2.32.2
sentence-transformers: 3.0.1
SQLAlchemy: 2.0.30
tenacity: 8.5.0
tokenizers: 0.19.1
transformers: 4.44.0
typing-extensions: 4.11.0

air-kyi commented 5 days ago

My kernel crashes with from_documents too.

Example Code

self.embed_model = AzureOpenAIEmbeddings(api_key=self.api_key,
                                         azure_endpoint=self.endpoint,
                                         model="text-embedding-ada-002",
                                         chunk_size=1)

loader = DataFrameLoader(pandas_dataframe, page_content_column='column')
documents = loader.load()

vectorstore = Chroma.from_documents(documents=documents,
                                    embedding=self.embed_model,
                                    persist_directory='my-dir')

vectorstore.persist()

Error Message and Stack Trace (if applicable)

11:02:02.230 [info] Generated code for 27 = <ipython-input-27-bf686321592c> with 20 lines
11:02:18.176 [error] Disposing session as kernel process died ExitCode: 3221225477, Reason: 
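The exit code 3221225477 in this log and the -1073741819 (0xC0000005) in the original report look different, but they are the same 32-bit value read as unsigned and signed respectively. 0xC0000005 is the Windows access-violation status. A small sketch of the conversion, with hypothetical helper names:

```python
STATUS_ACCESS_VIOLATION = 0xC0000005  # Windows NTSTATUS code for an access violation


def to_unsigned32(code: int) -> int:
    """Reinterpret an exit code as an unsigned 32-bit integer."""
    return code & 0xFFFFFFFF


def to_signed32(code: int) -> int:
    """Reinterpret an exit code as a signed 32-bit integer."""
    code &= 0xFFFFFFFF
    return code - 0x100000000 if code & 0x80000000 else code


for reported in (-1073741819, 3221225477):
    unsigned = to_unsigned32(reported)
    print(reported, hex(unsigned), unsigned == STATUS_ACCESS_VIOLATION)
```

Both reports therefore describe the same kind of failure: a crash in native code rather than a Python exception.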

Description

I had been using Chroma with no issues for months, but I accidentally deleted Conda and am now using a pip-only venv. After recreating the venv, the Python kernel crashes without any error message when I call Chroma's from_documents method from langchain. The process finishes with exit code 3221225477. I am using the latest version of Chroma from langchain_chroma and have tried:

System Info

OS: Windows Version 10.0.19045 Build 19045
Python 3.11.1

Package Information

langchain-chroma==0.1.4
├── chromadb [required: >=0.4.0,<0.6.0,!=0.5.5,!=0.5.4, installed: 0.5.18]
├── fastapi [required: >=0.95.2,<1, installed: 0.115.4]
├── langchain-core [required: >=0.1.40,<0.4, installed: 0.3.15]
└── numpy [required: >=1,<2, installed: 1.26.4]
langchain-community==0.3.5
├── aiohttp [required: >=3.8.3,<4.0.0, installed: 3.10.10]
├── dataclasses-json [required: >=0.5.7,<0.7, installed: 0.6.7]
├── httpx-sse [required: >=0.4.0,<0.5.0, installed: 0.4.0]
├── langchain [required: >=0.3.6,<0.4.0, installed: 0.3.7]
├── langchain-core [required: >=0.3.15,<0.4.0, installed: 0.3.15]
├── langsmith [required: >=0.1.125,<0.2.0, installed: 0.1.140]
├── numpy [required: >=1,<2, installed: 1.26.4]
├── pydantic-settings [required: >=2.4.0,<3.0.0, installed: 2.6.1]
├── PyYAML [required: >=5.3, installed: 6.0.2]
├── requests [required: >=2,<3, installed: 2.32.3]
├── SQLAlchemy [required: >=1.4,<2.0.36, installed: 2.0.35]
└── tenacity [required: >=8.1.0,<10,!=8.4.0, installed: 8.5.0]
langchain-openai==0.2.6
├── langchain-core [required: >=0.3.15,<0.4.0, installed: 0.3.15]
├── openai [required: >=1.54.0,<2.0.0, installed: 1.54.2]
└── tiktoken [required: >=0.7,<1, installed: 0.8.0]
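Since the crash appeared only after the environment was recreated, it may help to compare the versions that actually got installed against a known-good environment. The following is a minimal sketch using the standard-library importlib.metadata; the helper name installed_versions is hypothetical, and the package names are simply taken from the listing above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


packages = ["chromadb", "langchain-chroma", "langchain-core", "numpy"]
for name, ver in installed_versions(packages).items():
    print(f"{name}: {ver or 'not installed'}")
```

Running this in both the working and the crashing environment gives two short lists that can be compared line by line.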

Similar issues

https://github.com/chroma-core/chroma/issues/2513
https://github.com/chroma-core/chroma/issues/3058