zylon-ai / private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks
https://privategpt.dev
Apache License 2.0
53.66k stars 7.21k forks

Error running Ingest and PGPT_PROFILES=local make run #1131

Open jtedsmith opened 10 months ago

jtedsmith commented 10 months ago

Last week I decided to go from the primordial version to the new version of privateGPT. While it took some time to get it running right (more user error than the directions or anything), I got it up and running. The server runs and it ingests in batch. The issue is that I am trying to load around 2,000 documents into it. When I start to ingest, some PDF files produce errors and some are accepted, but every once in a while I receive the same error output shown below. Usually when it happens during ingest, the localhost:8001 server won't run anymore and reports a similar error; sometimes it happens on the local `make run` first and then the ingest errors begin. I have rebuilt the setup multiple times and it works for a while, but then the problem becomes that I have to start ingesting from scratch. I've tried copying the local_data folder's contents into a fresh clone of the directory, but that just reproduces the error. That makes me wonder if I am sucking up a bad file that is causing the issue, but it's almost impossible to determine which one.

I am running the following:

- Server: Ubuntu 22.04
- CPU: AMD Ryzen 7 5700X 8-Core Processor
- RAM: 32 GB
- GPU: GeForce GTX 1660 SUPER, GeForce RTX 3060 Lite Hash Rate
- pyenv: 3.11.3 (but have tried 3.11.5 and 3.11.6 with the same results)

Errors that I keep receiving are as follows:

    ~/privateGPT$ PGPT_PROFILES=local make run
    poetry run python -m private_gpt
    Starting application with profiles: ['default', 'local']
    ggml_init_cublas: found 2 CUDA devices:
      Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6
      Device 1: NVIDIA GeForce GTX 1660 SUPER, compute capability 7.5
    llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from /home/user/privateGPT/models/mistral-7b-instruct-v0.1.Q4_K_M.gguf (version GGUF V2 (latest))
    llama_model_loader: - tensor 0: token_embd.weight q4_K [ 4096, 32000, 1, 1 ]
    llama_model_loader: - tensor 1: blk.0.attn_q.weight q4_K [ 4096, 4096, 1, 1 ...

    ...............................................................................................
    llama_new_context_with_model: n_ctx      = 3900
    llama_new_context_with_model: freq_base  = 10000.0
    llama_new_context_with_model: freq_scale = 1
    llama_kv_cache_init: offloading v cache to GPU
    llama_kv_cache_init: offloading k cache to GPU
    llama_kv_cache_init: VRAM kv self = 487.50 MB
    llama_new_context_with_model: kv self size = 487.50 MB
    llama_new_context_with_model: compute buffer total size = 281.25 MB
    llama_new_context_with_model: VRAM scratch buffer: 275.37 MB
    llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
    AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
    Traceback (most recent call last):
      File "<frozen runpy>", line 198, in _run_module_as_main
      File "<frozen runpy>", line 88, in _run_code
      File "/home/user/privateGPT/private_gpt/__main__.py", line 5, in <module>
        from private_gpt.main import app
      File "/home/user/privateGPT/private_gpt/main.py", line 123, in <module>
        from private_gpt.ui.ui import mount_in_app
      File "/home/user/privateGPT/private_gpt/ui/ui.py", line 19, in <module>
        ingest_service = root_injector.get(IngestService)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 974, in get
        provider_instance = scope_instance.get(interface, binding.provider)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 800, in get
        instance = self._get_instance(key, provider, self.injector)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 811, in _get_instance
        return provider.get(injector)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 264, in get
        return injector.create_object(self._cls)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 998, in create_object
        self.call_with_injection(init, self_=instance, kwargs=additional_kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 1031, in call_with_injection
        dependencies = self.args_to_inject(
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 1079, in args_to_inject
        instance: Any = self.get(interface)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 974, in get
        provider_instance = scope_instance.get(interface, binding.provider)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 800, in get
        instance = self._get_instance(key, provider, self.injector)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 811, in _get_instance
        return provider.get(injector)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 264, in get
        return injector.create_object(self._cls)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 998, in create_object
        self.call_with_injection(init, self_=instance, kwargs=additional_kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 1040, in call_with_injection
        return callable(*full_args, **dependencies)
      File "/home/user/privateGPT/private_gpt/components/node_store/node_store_component.py", line 24, in __init__
        self.doc_store = SimpleDocumentStore.from_persist_dir(
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/llama_index/storage/docstore/simple_docstore.py", line 56, in from_persist_dir
        return cls.from_persist_path(persist_path, namespace=namespace, fs=fs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/llama_index/storage/docstore/simple_docstore.py", line 73, in from_persist_path
        simple_kvstore = SimpleKVStore.from_persist_path(persist_path, fs=fs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/llama_index/storage/kvstore/simple_kvstore.py", line 76, in from_persist_path
        data = json.load(f)
      File "/home/user/.pyenv/versions/3.11.3/lib/python3.11/json/__init__.py", line 293, in load
        return loads(fp.read(),
      File "/home/user/.pyenv/versions/3.11.3/lib/python3.11/json/__init__.py", line 346, in loads
        return _default_decoder.decode(s)
      File "/home/user/.pyenv/versions/3.11.3/lib/python3.11/json/decoder.py", line 337, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/home/user/.pyenv/versions/3.11.3/lib/python3.11/json/decoder.py", line 355, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    make: *** [Makefile:36: run] Error 1


If I try to ingest files using the batch method, I get a similar error:

    ~/privateGPT$ make ingest /home/user/Documents/Ingest -- --watch
    Starting application with profiles: ['default']
    Traceback (most recent call last):
      File "/home/user/privateGPT/scripts/ingest_folder.py", line 9, in <module>
        ingest_service = root_injector.get(IngestService)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 974, in get
        provider_instance = scope_instance.get(interface, binding.provider)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 800, in get
        instance = self._get_instance(key, provider, self.injector)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 811, in _get_instance
        return provider.get(injector)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 264, in get
        return injector.create_object(self._cls)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 998, in create_object
        self.call_with_injection(init, self_=instance, kwargs=additional_kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 1031, in call_with_injection
        dependencies = self.args_to_inject(
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 1079, in args_to_inject
        instance: Any = self.get(interface)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 974, in get
        provider_instance = scope_instance.get(interface, binding.provider)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 91, in wrapper
        return function(*args, **kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 800, in get
        instance = self._get_instance(key, provider, self.injector)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 811, in _get_instance
        return provider.get(injector)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 264, in get
        return injector.create_object(self._cls)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 998, in create_object
        self.call_with_injection(init, self_=instance, kwargs=additional_kwargs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/injector/__init__.py", line 1040, in call_with_injection
        return callable(*full_args, **dependencies)
      File "/home/user/privateGPT/private_gpt/components/node_store/node_store_component.py", line 24, in __init__
        self.doc_store = SimpleDocumentStore.from_persist_dir(
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/llama_index/storage/docstore/simple_docstore.py", line 56, in from_persist_dir
        return cls.from_persist_path(persist_path, namespace=namespace, fs=fs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/llama_index/storage/docstore/simple_docstore.py", line 73, in from_persist_path
        simple_kvstore = SimpleKVStore.from_persist_path(persist_path, fs=fs)
      File "/home/user/.pyenv/versions/3.11.3/envs/privateGPT/lib/python3.11/site-packages/llama_index/storage/kvstore/simple_kvstore.py", line 76, in from_persist_path
        data = json.load(f)
      File "/home/user/.pyenv/versions/3.11.3/lib/python3.11/json/__init__.py", line 293, in load
        return loads(fp.read(),
      File "/home/user/.pyenv/versions/3.11.3/lib/python3.11/json/__init__.py", line 346, in loads
        return _default_decoder.decode(s)
      File "/home/user/.pyenv/versions/3.11.3/lib/python3.11/json/decoder.py", line 337, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/home/user/.pyenv/versions/3.11.3/lib/python3.11/json/decoder.py", line 355, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    make: *** [Makefile:52: ingest] Error 1

If anyone has thoughts on why this keeps happening, I would be very interested to hear them. I am starting to venture into the definition of insanity.

imartinez commented 10 months ago

Apparently the error is being thrown by the SimpleDocumentStore, which is a file-system storage of a JSON representation of all nodes, effectively representing all ingested document chunks and their relations. It is stored in local_data/private_gpt/docstore.json (alongside two other JSON files).

Apparently the stored files get corrupted at some point, and when privateGPT tries to load them back from disk as JSON, the decoder fails. If you are ingesting lots of documents, maybe the file is getting too large... not sure.
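A quick way to check whether one of these persisted stores is the culprit is to try to parse each JSON file before starting the app. A minimal sketch, assuming the default local_data/private_gpt layout and assuming the sibling files are named index_store.json and graph_store.json:

```python
import json
from pathlib import Path


def find_corrupt_stores(data_dir: str) -> list[str]:
    """Return the persisted store files that fail to parse as JSON."""
    corrupt = []
    # File names assumed from a default local_data/private_gpt setup.
    for name in ("docstore.json", "index_store.json", "graph_store.json"):
        path = Path(data_dir) / name
        if not path.exists():
            continue  # nothing persisted yet for this store
        try:
            json.loads(path.read_text())
        except json.JSONDecodeError:
            corrupt.append(str(path))
    return corrupt


if __name__ == "__main__":
    for f in find_corrupt_stores("local_data/private_gpt"):
        print(f"corrupt store: {f}")
```

This would at least tell you which file to inspect (or restore from a backup) before rebuilding from scratch.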

That approach (file-system storage) is fine for trying PrivateGPT out, but for more serious use cases you may want to use an actual storage backend such as a structured database (Postgres, for example). We'll be sharing more about production setups in the coming days.

jtedsmith commented 10 months ago

Thank you, @imartinez! I'll take a look at adding my own data store. Appreciate you looking into it.

Jawn78 commented 10 months ago

@jtedsmith would love to hear more about what you implement

lopagela commented 10 months ago

@jtedsmith Could you give us some details on the JSON file that represents the document store? Like its size (using ls -lh or du -hs) and the first 10 or 20 characters (head -c 20) 🙏

The JSON decoder failing on the very first character (line 1 column 1 (char 0)) leaves me perplexed 🤔
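For reference, the checks asked for above can be run like this (the docstore path is assumed from the default local_data layout; adjust if yours differs):

```shell
# Inspect the persisted document store file.
# Path is an assumption based on the default local_data setup.
STORE="local_data/private_gpt/docstore.json"
if [ -f "$STORE" ]; then
  ls -lh "$STORE"            # file size, human readable
  du -hs "$STORE"            # disk usage
  head -c 20 "$STORE"; echo  # first 20 bytes; a healthy store starts with '{'
else
  echo "no docstore found at $STORE"
fi
```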

Could you also share the DEBUG logs, please? (To enable DEBUG logging, change the log level in private_gpt/__init__.py.)

jtedsmith commented 10 months ago

@lopagela I would, but I have done three rebuilds since then and don't have any record of the previous builds. Today I am in a more perplexing spot (I can create a new ticket if it comes to that): I've lost connectivity to my GPUs. The CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python command fails with this error:

    Downloading llama_cpp_python-0.2.14.tar.gz (7.2 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.2/7.2 MB 31.9 MB/s eta 0:00:00
      Installing build dependencies ... done
      Getting requirements to build wheel ... done
      Installing backend dependencies ... error
      error: subprocess-exited-with-error

      × pip subprocess to install backend dependencies did not run successfully.
      │ exit code: 2
      ╰─> [70 lines of output]
          Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com, https://pypi.ngc.nvidia.com
          Collecting cmake>=3.21
            Obtaining dependency information for cmake>=3.21 from https://files.pythonhosted.org/packages/5a/e1/001da8b79b5d336d42aee95aae4cb934348ffa8925a6280fcd81859d8734/cmake-3.27.7-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata
            Downloading cmake-3.27.7-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.7 kB)
          Collecting ninja>=1.5
            Obtaining dependency information for ninja>=1.5 from https://files.pythonhosted.org/packages/6d/92/8d7aebd4430ab5ff65df2bfee6d5745f95c004284db2d8ca76dcbfd9de47/ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl.metadata
            Downloading ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl.metadata (5.3 kB)
          Downloading cmake-3.27.7-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.0 MB)
             ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 21.9/26.0 MB 8.3 MB/s eta 0:00:01


  ERROR: Exception:
      Traceback (most recent call last):
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_vendor/urllib3/response.py", line 438, in _error_catcher
          yield
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_vendor/urllib3/response.py", line 561, in read
          data = self._fp_read(amt) if not fp_closed else b""
                 ^^^^^^^^^^^^^^^^^^
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_vendor/urllib3/response.py", line 527, in _fp_read
          return self._fp.read(amt) if amt is not None else self._fp.read()
                 ^^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3.11/http/client.py", line 466, in read
          s = self.fp.read(amt)
              ^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3.11/socket.py", line 706, in readinto
          return self._sock.recv_into(b)
                 ^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3.11/ssl.py", line 1311, in recv_into
          return self.read(nbytes, buffer)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3.11/ssl.py", line 1167, in read
          return self._sslobj.read(len, buffer)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      ssl.SSLError: [SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2580)

      During handling of the above exception, another exception occurred:

      Traceback (most recent call last):
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_internal/cli/base_command.py", line 180, in exc_logging_wrapper
          status = run_func(*args)
                   ^^^^^^^^^^^^^^^
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_internal/cli/req_command.py", line 248, in wrapper
          return func(self, options, args)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_internal/commands/install.py", line 377, in run
          requirement_set = resolver.resolve(
                            ^^^^^^^^^^^^^^^^^
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 161, in resolve
          self.factory.preparer.prepare_linked_requirements_more(reqs)
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_internal/operations/prepare.py", line 565, in prepare_linked_requirements_more
          self._complete_partial_requirements(
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_internal/operations/prepare.py", line 479, in _complete_partial_requirements
          for link, (filepath, _) in batch_download:
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_internal/network/download.py", line 183, in __call__
          for chunk in chunks:
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_internal/cli/progress_bars.py", line 53, in _rich_progress_bar
          for chunk in iterable:
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_internal/network/utils.py", line 63, in response_chunks
          for chunk in response.raw.stream(
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_vendor/urllib3/response.py", line 622, in stream
          data = self.read(amt=amt, decode_content=decode_content)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_vendor/urllib3/response.py", line 560, in read
          with self._error_catcher():
        File "/usr/lib/python3.11/contextlib.py", line 155, in __exit__
          self.gen.throw(typ, value, traceback)
        File "/home/jtedsmith/.cache/pypoetry/virtualenvs/private-gpt-2fJe892j-py3.11/lib/python3.11/site-packages/pip/_vendor/urllib3/response.py", line 449, in _error_catcher
          raise SSLError(e)
      pip._vendor.urllib3.exceptions.SSLError: [SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2580)

      [notice] A new release of pip is available: 23.2.1 -> 23.3.1
      [notice] To update, run: pip install --upgrade pip
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× pip subprocess to install backend dependencies did not run successfully.
│ exit code: 2
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
    jtedsmith@gpuai:~/privateGPT$ /mnt/c/dev/git/github/privateGPT$ export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
    -bash: /mnt/c/dev/git/github/privateGPT$: No such file or directory

lopagela commented 10 months ago

It looks like it is solely related to your pip and your internet connectivity... I can't help you with this...

jtedsmith commented 10 months ago

Yeah, nothing has changed with my pip or internet connectivity, but for some reason my GPUs are no longer recognized and this error is new. If I am the only one seeing this, then so be it, but since nothing has changed on my side, it's odd that this just started happening. In addition, the error specifically says "note: This error originates from a subprocess, and is likely not a problem with pip.", so I find it odd that you are pointing to pip, and I have no clue what this has to do with my internet connection.

lopagela commented 10 months ago

@jtedsmith that conclusion is based solely on your stack trace. If you inspect it, you can see the failure is coming purely from pip trying to download something.

Did you try running pip in verbose mode (pip -vvv ...)? It will show you everything it is doing, including the downloads and wheel construction (compilation).

If you cannot find the root cause with pip's verbose mode, please share the verbose output.

Jawn78 commented 10 months ago

@jtedsmith @lopagela

When decoding JSON, a failure at the very first character often indicates a mismatch between the expected and actual format of the data. For instance, the JSON might not start with a { (indicating an object) as expected, but with a [ (indicating an array). This can happen if the JSON object is unexpectedly nested inside another object or an array. This is just a thought about the JSONDecodeError.
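For what it's worth, the exact message in the logs ("Expecting value: line 1 column 1 (char 0)") is what the standard decoder raises when there is no JSON token at all, which points more toward a zero-byte or truncated docstore.json than a shape mismatch. A minimal repro:

```python
import json

# A completely empty document reproduces the exact error from the logs:
# there is no JSON token at all, so decoding fails at char 0.
try:
    json.loads("")
except json.JSONDecodeError as err:
    print(err)  # Expecting value: line 1 column 1 (char 0)

# By contrast, a top-level array parses fine, so "[ instead of {" alone
# would not fail at char 0 -- json.loads accepts any top-level value.
print(json.loads("[1, 2]"))  # [1, 2]
```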

However, it could also be a dependency injection error, a missing or malfunctioning dependency, a missing or corrupted file, permissions, etc.

fotijr commented 10 months ago

In case it helps others, I saw the same JSONDecodeError when setting up PrivateGPT on my Windows machine. It was working for a time, then I got the error and couldn't ingest documents or even run make run. Thanks to @imartinez's tip about the JSON doc storage, I opened the docstore.json file and saw it was empty. I changed the file contents to {}, and it started working again.

Now that I'm learning how things actually work, it seems pretty clear that ingesting thousands of documents isn't going to work with SimpleDocumentStore, since it stores all documents in a single file 😂 But I'm still leaving this here so others know where to look to clean things up and get their install running again.
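The manual fix above can be scripted. A small sketch, assuming the default local_data layout (it keeps a backup copy first, and note that previously ingested documents will need re-ingesting afterwards):

```shell
# Reset an empty or corrupted docstore to a valid empty JSON object.
# The path is an assumption based on the default local_data setup.
STORE="local_data/private_gpt/docstore.json"
mkdir -p "$(dirname "$STORE")"
[ -f "$STORE" ] && cp "$STORE" "$STORE.bak"  # keep a backup of the old file
printf '{}' > "$STORE"
echo "reset $STORE"
```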

github-actions[bot] commented 9 months ago

Stale issue