mrepetto-certx opened 8 months ago
Repeating the steps by simply cloning the repo and following #1445 causes the same problem, plus the following:
There was a problem when trying to write in your cache folder (/nonexistent/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
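The error points at the Hugging Face cache path not being writable inside the container. A minimal sketch of one possible workaround, assuming the private-gpt service from the repo's docker-compose.yaml (the cache path chosen here is an assumption, not something from this thread):

```yaml
services:
  private-gpt:
    environment:
      # Assumption: point the Hugging Face cache at a directory the container user can write to.
      # HF_HOME is the newer umbrella variable; TRANSFORMERS_CACHE is the one named in the error.
      HF_HOME: /home/worker/app/local_data/hf_cache
      TRANSFORMERS_CACHE: /home/worker/app/local_data/hf_cache
```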
@mrepetto-certx, can you be more specific please? I have the same issue
Well. To reproduce:

```bash
git clone https://github.com/imartinez/privateGPT
cd privateGPT
docker compose build
docker compose run --rm --entrypoint="bash -c '[ -f scripts/setup ] && scripts/setup'" private-gpt
```

I do not know how to be more specific than that.
I think `local` should be substituted with `ollama`: https://github.com/imartinez/privateGPT/commit/45f05711eb71ffccdedb26f37e680ced55795d44
Indeed,

```yaml
services:
  private-gpt:
    build:
      dockerfile: Dockerfile.local
    volumes:
      - ./local_data/:/home/worker/app/local_data
      - ./models/:/home/worker/app/models
    ports:
      - 8001:8080
    environment:
      PORT: 8080
      PGPT_PROFILES: docker
      PGPT_MODE: llamacpp
```

mostly works, but it still requires an embedding mode, which is different from llamacpp.
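For context, the embedding mode is declared separately from the llm mode in the settings profile; a minimal sketch of what the docker profile might need (the huggingface embedding mode here is an assumption, not confirmed in this thread):

```yaml
# settings-docker.yaml (sketch, not the repo's actual file)
llm:
  mode: llamacpp
embedding:
  mode: huggingface   # assumption: the embedding side is configured independently of the llm side
```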
I am still getting the same error even when I change to llamacpp. Are there any prerequisites before running docker-compose build, such as setting any environment variables or downloading any modules?
Unfortunately I got the same result as you. The problem comes from the split between the llm and embedding sections in the local settings file. I suggest using Ollama and adding it as an additional container in the compose file.
I think I can help a little. If you are trying to use Ollama, which you will need to get installed and running first, then make these changes.

In settings.yaml, change localhost to host.docker.internal here:

```yaml
ollama:
  llm_model: llama2
  embedding_model: nomic-embed-text
  api_base: http://host.docker.internal:11434
```

In docker-compose.yaml, change dockerfile: Dockerfile.local to dockerfile: Dockerfile.external.

In Dockerfile.external, add these extras:

```dockerfile
RUN poetry install --extras "ui vector-stores-qdrant llms-ollama embeddings-ollama"
```

Then do a docker compose build and then docker compose up.

You will probably need to run ollama pull nomic-embed-text if you get the error about not having nomic.
I hope this helps. I was able to finally get it running on my M2 MacBook Air.
I made these changes:

- Added `RUN poetry install --extras "ui embeddings-huggingface llms-llama-cpp vector-stores-qdrant llms-ollama embeddings-ollama"` in my Dockerfile.local
- Set `PGPT_MODE: ollama` in my docker-compose
- Downloaded the ollama docker image and ran it separately (see the sketch after this list)
- Ran `ollama pull nomic-embed-text` in my ollama docker container
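For the "ran it separately" step, roughly this (a sketch based on Ollama's published Docker instructions; the container name and named volume are assumptions):

```bash
# Run the Ollama server container, persisting models in a named volume (assumption: CPU-only).
docker run -d --name ollama -p 11434:11434 -v ollama:/root/.ollama ollama/ollama

# Pull the embedding model inside that container.
docker exec -it ollama ollama pull nomic-embed-text
```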
I am still facing this issue:

```
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xffff4cd571d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
```

My ollama server is running; however, when I GET http://localhost:11434/api/embeddings, I get a 404. Any ideas on this? @makeSmartio
What about step 1, changing localhost to api_base: http://host.docker.internal:11434/ in settings.yaml? The problem with localhost is that the container resolves localhost to itself; host.docker.internal is the host's address from the container's point of view.
I also get a 404 for http://localhost:11434/api/embeddings, so no issue there.
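One caveat worth noting (an assumption on my part, relevant only on Linux hosts): host.docker.internal is not defined there by default, so the compose service may need an explicit mapping, for example:

```yaml
services:
  private-gpt:
    extra_hosts:
      # Assumption: map host.docker.internal to the Docker host gateway on Linux (Docker 20.10+).
      - "host.docker.internal:host-gateway"
```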
What is your take on decoupling it so that ollama is used as a microservice? Something like:
```yaml
services:
  private-gpt:
    build:
      dockerfile: Dockerfile.local
    volumes:
      - ./local_data/:/home/worker/app/local_data
    ports:
      - 8001:8080
    environment:
      PORT: 8080
      PGPT_PROFILES: docker
      PGPT_MODE: ollama
  ollama:
    image: ollama/ollama
    command: ollama pull nomic-embed-text
```
With the settings-ollama.yaml:

```yaml
server:
  env_name: ${APP_ENV:ollama}

llm:
  mode: ollama
  max_new_tokens: 512
  context_window: 3900
  temperature: 0.1       # The temperature of the model. Increasing the temperature will make the model answer more creatively. A value of 0.1 would be more factual. (Default: 0.1)

embedding:
  mode: ollama

ollama:
  llm_model: mistral
  embedding_model: nomic-embed-text
  api_base: http://ollama:11434
  tfs_z: 1.0             # Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting.
  top_k: 40              # Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
  top_p: 0.9             # Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
  repeat_last_n: 64      # Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
  repeat_penalty: 1.2    # Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
  request_timeout: 120.0 # Time elapsed until ollama times out the request. Default is 120s. Format is float.

vectorstore:
  database: qdrant

qdrant:
  path: local_data/private_gpt/qdrant
```
@mrepetto-certx Makes sense to me. Even if people already have Ollama installed this would just be another instance. You'd still need to tackle the addressing problem, though - it would either need to be http://host.docker.internal:11434/ for host installations or http://ollama:11434/ for Dockerized.
Edit: It would also take quite a bit of testing to add the llm and embedding models for the dockerized method.
Thanks @makeSmartio. I'm experimenting now, with the caveat of having:

```yaml
ollama:
  image: ollama/ollama:latest
  volumes:
    - ./ollama:/root/.ollama
```

to avoid the problem of pulling a new model on every docker compose run. I'll keep you posted.
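A named volume would work as well (a sketch, not from the thread; the volume name is an assumption), the trade-off being that models live in Docker-managed storage rather than in the repo directory:

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama-models:/root/.ollama   # assumption: named volume so pulled models persist across runs

volumes:
  ollama-models:
```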
No way; I keep getting:

```
[WARNING ] llama_index.core.chat_engine.types - Encountered exception writing response to history: [Errno 99] Cannot assign requested address
```

What is puzzling is that running:

```python
from llama_index.llms.ollama import Ollama

model = Ollama(model="mistral", base_url="http://ollama:11434", request_timeout=120.0)
resp = model.complete("Who is Paul Graham?")
print(resp)
```

inside the container works.
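If it helps narrow it down, a quick connectivity check from inside the running container (a sketch; it assumes the service names used in the compose snippets above and only Python's standard library):

```bash
# Assumption: services are named private-gpt and ollama as in the compose sketches above.
# /api/tags lists the models currently available on the Ollama server.
docker compose exec private-gpt python -c \
  "import urllib.request; print(urllib.request.urlopen('http://ollama:11434/api/tags').read())"
```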
Ok, I managed to make it work and pushed pull request #1812. The only thing to remember is to run ollama pull the first time to load the models; after that they stay in the host environment, similar to the previous behavior.
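That first-time pull might look like this (an assumption based on the compose sketches earlier in the thread, not on the contents of #1812):

```bash
# Assumption: an `ollama` service as defined in the compose file above.
docker compose up -d ollama
docker compose exec ollama ollama pull mistral
docker compose exec ollama ollama pull nomic-embed-text
```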
I tried to run:

```bash
docker compose run --rm --entrypoint="bash -c '[ -f scripts/setup ] && scripts/setup'" private-gpt
```

with a compose file somewhat similar to the repo's, but I got the following error: