weaviate / Verba

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

Vectorization failed 404 "/api/embed" Not Found #315

Open Kieran-Sears opened 2 weeks ago

Kieran-Sears commented 2 weeks ago

Description

URL values given in the configuration cause issues if trailing forward slashes are present. See the forum discussion for more details.
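
To illustrate, here's a minimal sketch of the failure mode, assuming the endpoint is built by naive string concatenation (the names below are made up for illustration, not taken from the Verba codebase):

base = "http://ollama:11434/"   # OLLAMA_URL as set in .env, with a trailing slash

# Naive concatenation produces a double slash, which the Ollama router 404s:
print(base + "/api/embed")              # http://ollama:11434//api/embed

# Stripping the trailing slash before joining avoids the problem:
print(base.rstrip("/") + "/api/embed")  # http://ollama:11434/api/embed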

Installation

Weaviate Deployment

Configuration

• Reader: default
• Chunker: default
• Embedder: Ollama (llama 3.1)
• Retriever: default
• Generator: Ollama (llama 3.1)

Steps to Reproduce

  1. set docker compose environment variables:

    • OLLAMA_URL=http://ollama:11434/
    • OLLAMA_MODEL=llama3.1:latest
    • OLLAMA_EMBED_MODEL=llama3.1:latest
  2. ensure Ollama is running with a valid model:

    • docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
    • docker exec -it ollama ollama run llama3.1
  3. start verba from the verba project root directory, pointing to the location of your .env file

    • docker compose --env-file <verba_env_path> up -d --build
  4. connect Ollama to the verba network, since it was started as a separate process:

    • docker network connect "verba_default" "ollama"
  5. open the UI at http://localhost:8000, choose the Docker deployment option, go to the "Import Data" tab, and attempt to import a file.

  6. check the verba container logs with docker logs verba-verba-1 and expect to find:

    Batch vectorization failed: Vectorization
    failed for some batches: 404, message='Not Found',
    url=URL('http://host.docker.internal:11434/api/embed')
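
The same 404 can be reproduced outside Verba with a few lines of Python (a sketch, assuming aiohttp is installed and Ollama's port is published on localhost as in step 2; the misconfigured trailing slash is kept on purpose):

import asyncio
import aiohttp

async def probe() -> None:
    base = "http://localhost:11434/"  # trailing slash, as in the failing .env
    async with aiohttp.ClientSession() as session:
        # The concatenated URL keeps the double slash, so Ollama returns 404.
        async with session.post(base + "/api/embed",
                                json={"model": "llama3.1:latest", "input": "hello"}) as resp:
            print(resp.status, resp.url)

asyncio.run(probe())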

Additional context

I've run Ollama separately in its own container rather than making it part of the docker compose setup, which requires connecting it to the verba network once it's up and running. This is how I'm able to set OLLAMA_URL to the ollama domain rather than host.docker.internal as stated in the documentation. This just looked cleaner to me and lets me keep my Ollama instance running for other uses on my machine (e.g. IntelliSense, a personal AI assistant, etc.), but the exact same issue occurs when using host.docker.internal as well: a trailing slash will cause a failure when the frontend attempts to connect to an endpoint.

dgonzo commented 1 week ago

Same issue for me.

Batch vectorization failed: Vectorization failed for some batches: 404, message='Not Found', url=URL('http://localhost:11434/api/embed')

For reference, when I run inference against my local Ollama the endpoint is /api/embeddings, e.g.:

curl -X POST http://localhost:11434/api/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-embed-text",
    "prompt": "Hello world"
  }'
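
A quick way to check which embedding routes a local Ollama build actually serves (a sketch using requests; if I understand correctly, /api/embed was only added in newer Ollama releases, while /api/embeddings is the older route, so a 404 can also mean the route doesn't exist on that version):

import requests

BASE = "http://localhost:11434"

# Each route expects a slightly different payload: /api/embed takes "input",
# /api/embeddings takes "prompt". A 404 means the route is absent entirely.
payloads = {
    "/api/embed": {"model": "nomic-embed-text", "input": "Hello world"},
    "/api/embeddings": {"model": "nomic-embed-text", "prompt": "Hello world"},
}

for path, payload in payloads.items():
    resp = requests.post(BASE + path, json=payload, timeout=30)
    print(path, resp.status_code)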