PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
Apache License 2.0
19.54k stars 2.19k forks

problem when ingesting (just CPU) #783

Open alexmc6 opened 3 months ago

alexmc6 commented 3 months ago

I have set things up on an Ubuntu 22.04 box as per the instructions: Anaconda installed, new Python 3.10 env created. I get partway through python ingest.py --device_type cpu and then it fails with:

  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/__init__.py", line 143, in Client
    api = system.instance(API)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 195, in instance
    impl = type(self)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/api/segment.py", line 82, in __init__
    self._manager = self.require(SegmentManager)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 134, in require
    inst = self._system.instance(type)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 192, in instance
    type = get_class(fqn, type)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 239, in get_class
    module = importlib.import_module(module_name)
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/manager/local.py", line 13, in <module>
    from chromadb.segment.impl.vector.local_persistent_hnsw import (
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_persistent_hnsw.py", line 9, in <module>
    from chromadb.segment.impl.vector.local_hnsw import (
  File "/home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_hnsw.py", line 21, in <module>
    import hnswlib
ImportError: /home/alex2/anaconda3/envs/localGPT/lib/python3.10/site-packages/hnswlib.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr10_M_releaseEv

This looks like it might be the same problem as

https://github.com/langchain-ai/langchain/issues/3017

The discussion there suggests I may have the wrong version of hnswlib and recommends

pip install hnswlib --user --no-build-isolation
pip install chromadb --user

However, I can't install hnswlib that way as it crashes too - in a different way.

Is there anything more I should check? Has anyone used the CPU-only option recently?

(PS: this machine has no proper GPU, just the onboard graphics, so I am using the CPU-only setting for now. 32 GB RAM, i5, Intel motherboard, Ubuntu 22.04.)

I am just ingesting the one pdf which came with the git repo.
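
For anyone hitting the same ImportError, a minimal sketch of a check that may help narrow it down: it tries the import on its own (outside ingest.py) and pulls the missing symbol name out of the error message. The helper names here are hypothetical, and the prefix test is only a heuristic: names starting with _ZNSt, _ZSt or __cxa_ usually come from the C++ runtime (libstdc++), which would suggest a runtime library mismatch rather than a problem in localGPT itself.

```python
import importlib
import re

# Matches the "undefined symbol: NAME" tail that the dynamic loader appends
# to the ImportError message, as in the traceback above.
_UNDEFINED_SYMBOL = re.compile(r"undefined symbol: (\S+)")


def undefined_symbol(message):
    """Return the missing symbol named in an ImportError message, or None."""
    match = _UNDEFINED_SYMBOL.search(message)
    return match.group(1) if match else None


def looks_like_cxx_runtime_symbol(symbol):
    """Heuristic only (an assumption, not a rule): mangled std:: names start
    with _ZNSt or _ZSt, and C++ ABI helpers start with __cxa_."""
    return symbol.startswith(("_ZNSt", "_ZSt", "__cxa_"))


def check_import(module_name):
    """Try to import a module; return (ok, missing_symbol_or_None)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        return False, undefined_symbol(str(exc))
    return True, None


if __name__ == "__main__":
    ok, symbol = check_import("hnswlib")
    if ok:
        print("hnswlib imported fine")
    elif symbol is None:
        print("hnswlib failed to import, but not because of an undefined symbol")
    else:
        print("missing symbol:", symbol)
        print("C++ runtime symbol (heuristic):", looks_like_cxx_runtime_symbol(symbol))
```

Running this inside the activated localGPT env reproduces the failure without loading any documents, which makes it quicker to retest after reinstalling hnswlib or chromadb.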

kirkchongushazard commented 1 month ago

man oh man, this is also not working for me on pc only @PromtEngineer

python ingest.py --device_type opencl (localGPT)
2024-05-15 04:59:51,314 - INFO - ingest.py:147 - Loading documents from /home/rob/localGPT/SOURCE_DOCUMENTS
Importing: Orca_paper.pdf
2024-05-15 04:59:51,334 - INFO - ingest.py:47 - Loading document batch
/home/rob/localGPT/SOURCE_DOCUMENTS/Orca_paper.pdf loaded.

2024-05-15 05:00:24,027 - INFO - :241 - pikepdf C++ to Python logger bridge initialized
2024-05-15 05:00:55,455 - INFO - ingest.py:156 - Loaded 1 documents from /home/rob/localGPT/SOURCE_DOCUMENTS
2024-05-15 05:00:55,455 - INFO - ingest.py:157 - Split into 193 chunks of text
2024-05-15 05:01:02,291 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: colbert-ir/colbertv2.0
2024-05-15 05:01:03,468 - WARNING - SentenceTransformer.py:805 - No sentence-transformers model found with name /home/rob/.cache/torch/sentence_transformers/colbert-ir_colbertv2.0. Creating a new one with MEAN pooling.
2024-05-15 05:01:06,047 - INFO - ingest.py:168 - Loaded embeddings from colbert-ir/colbertv2.0
Traceback (most recent call last):
  File "/home/rob/localGPT/ingest.py", line 182, in <module>
    main()
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/rob/localGPT/ingest.py", line 170, in main
    db = Chroma.from_documents(
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 613, in from_documents
    return cls.from_texts(
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 568, in from_texts
    chroma_collection = cls(
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/langchain/vectorstores/chroma.py", line 120, in __init__
    self._client = chromadb.Client(_client_settings)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/__init__.py", line 143, in Client
    api = system.instance(API)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 195, in instance
    impl = type(self)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/api/segment.py", line 82, in __init__
    self._manager = self.require(SegmentManager)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 134, in require
    inst = self._system.instance(type)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 192, in instance
    type = get_class(fqn, type)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/config.py", line 239, in get_class
    module = importlib.import_module(module_name)
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/manager/local.py", line 13, in <module>
    from chromadb.segment.impl.vector.local_persistent_hnsw import (
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_persistent_hnsw.py", line 9, in <module>
    from chromadb.segment.impl.vector.local_hnsw import (
  File "/home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_hnsw.py", line 21, in <module>
    import hnswlib
ImportError: /home/rob/miniconda3/envs/localGPT/lib/python3.10/site-packages/hnswlib.cpython-310-x86_64-linux-gnu.so: undefined symbol: __cxa_call_terminate
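
Both tracebacks in this thread end with hnswlib's compiled extension failing on an undefined symbol. One commonly reported cause of that kind of error is that the conda env ships its own libstdc++ which differs from the one the extension was built against; the thread does not confirm that this is what is happening here, so treat it as an assumption. A minimal sketch to see which libstdc++ files the active env carries (the function name is hypothetical):

```python
import sys
from pathlib import Path


def libstdcxx_candidates(prefix):
    """List libstdc++ shared objects directly under <prefix>/lib, sorted by name."""
    lib_dir = Path(prefix) / "lib"
    if not lib_dir.is_dir():
        return []
    return sorted(p.name for p in lib_dir.glob("libstdc++.so*"))


if __name__ == "__main__":
    # sys.prefix is the root of the active environment,
    # e.g. .../envs/localGPT when that conda env is activated.
    names = libstdcxx_candidates(sys.prefix)
    if not names:
        print("no libstdc++ found under", Path(sys.prefix) / "lib")
    for name in names:
        print(name)
```

If the env does carry its own copy, comparing its version suffix with the system one gives a rough idea of whether the two could disagree. An empty list only means the env has no private copy under lib, not that the system library is the right one.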