langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Meta endpoint! Unexpected status code: 404 using weaviate vdb #10398

Open Derican opened 3 weeks ago

Derican commented 3 weeks ago


Dify version

v0.11.0

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Use docker-compose to deploy Dify.
  2. Add documents to a knowledge base, selecting High Quality indexing and an OpenAI embedding model.

✔️ Expected Behavior

The documents are indexed.

❌ Actual Behavior

Screenshot 2024-11-06 224858

dosubot[bot] commented 3 weeks ago

I found a similar unsolved discussion regarding a 404 error with Weaviate: Error: Response output is missing or does not contain embeddings [1].

Additionally, here are some potential solutions and considerations for resolving the 404 error when using Dify with Weaviate:

  1. Configuration Settings: Ensure that your Weaviate configuration settings are correct. You should have the following environment variables set:

    • VECTOR_STORE set to weaviate.
    • WEAVIATE_ENDPOINT pointing to the correct Weaviate endpoint (e.g., http://weaviate:8080).
    • Ensure that API key authentication is correctly configured if enabled [2].
  2. Network Configuration: Verify that your Docker network settings allow communication between Dify and Weaviate. Dify uses a custom Docker network for internal communication, so ensure that Weaviate is accessible on the default network [3].

  3. Error Handling: Dify tolerates 404 errors during certain operations, such as deleting non-existent objects. However, if you're encountering a 404 during indexing, it might indicate a misconfiguration or connectivity issue [4].

  4. Logs and Debugging: Check the logs for any error messages that might provide more context on the issue. Dify logs errors at the INFO level and above, which can help diagnose connectivity issues [5].

If these steps do not resolve the issue, consider reviewing the detailed configuration and network settings to ensure everything is set up correctly.
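A quick way to test the configuration and network points above is to request Weaviate's meta endpoint from inside the api or worker container, once through the proxy settings in the environment and once directly. The following is a minimal sketch using only the Python standard library. It assumes the endpoint is exposed at /v1/meta, that API key authentication uses a Bearer token, and that the key is available in a WEAVIATE_API_KEY environment variable; the helper names build_meta_request and check_meta are hypothetical and not part of Dify or the Weaviate client.

```python
import os
import urllib.error
import urllib.request


def build_meta_request(endpoint, api_key=None):
    """Build a GET request for the Weaviate meta endpoint (assumed to be /v1/meta)."""
    url = endpoint.rstrip("/") + "/v1/meta"
    headers = {}
    if api_key:
        # Assumption: API key authentication is sent as a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, headers=headers, method="GET")


def check_meta(endpoint, api_key=None, use_env_proxy=True, timeout=5):
    """Return the HTTP status code of the meta endpoint, or None if unreachable."""
    # An empty ProxyHandler disables any proxy taken from the environment.
    handlers = [] if use_env_proxy else [urllib.request.ProxyHandler({})]
    opener = urllib.request.build_opener(*handlers)
    try:
        with opener.open(build_meta_request(endpoint, api_key), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server (or a proxy in between) answered with an error status.
        return exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


if __name__ == "__main__":
    endpoint = os.environ.get("WEAVIATE_ENDPOINT", "http://weaviate:8080")
    api_key = os.environ.get("WEAVIATE_API_KEY")
    print("via environment proxy:", check_meta(endpoint, api_key, use_env_proxy=True))
    print("direct connection:    ", check_meta(endpoint, api_key, use_env_proxy=False))
```

If the direct request succeeds while the proxied one returns 404 or fails, that would suggest the request is being routed through a proxy that cannot reach Weaviate, rather than a problem in Weaviate itself.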


Derican commented 3 weeks ago

I have looked at the information in issues #3216 and #734, but neither of the solutions worked. The network connection to OpenAI has been tested and works; the proxy settings are described in #9960. Logs of the worker and weaviate containers are pasted below: image image

crazywoola commented 3 weeks ago

Can you attach the results of docker ps?

Derican commented 3 weeks ago

> Can you attach the results of docker ps.

image

Derican commented 3 weeks ago

I figured it out by adding networks to the weaviate service:

weaviate:
    image: semitechnologies/weaviate:1.19.0
    profiles:
      - ''
      - weaviate
    restart: always
    volumes:
      # Mount the Weaviate data directory to the container.
      - ./volumes/weaviate:/var/lib/weaviate
    environment:
      # The Weaviate configurations
      # You can refer to the [Weaviate](https://weaviate.io/developers/weaviate/config-refs/env-vars) documentation for more information.
      PERSISTENCE_DATA_PATH: ${WEAVIATE_PERSISTENCE_DATA_PATH:-/var/lib/weaviate}
      QUERY_DEFAULTS_LIMIT: ${WEAVIATE_QUERY_DEFAULTS_LIMIT:-25}
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: ${WEAVIATE_AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED:-false}
      DEFAULT_VECTORIZER_MODULE: ${WEAVIATE_DEFAULT_VECTORIZER_MODULE:-none}
      CLUSTER_HOSTNAME: ${WEAVIATE_CLUSTER_HOSTNAME:-node1}
      AUTHENTICATION_APIKEY_ENABLED: ${WEAVIATE_AUTHENTICATION_APIKEY_ENABLED:-true}
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: ${WEAVIATE_AUTHENTICATION_APIKEY_ALLOWED_KEYS:-WVF5YThaHlkYwhGUSmCRgsX3tD5ngdN8pkih}
      AUTHENTICATION_APIKEY_USERS: ${WEAVIATE_AUTHENTICATION_APIKEY_USERS:-hello@dify.ai}
      AUTHORIZATION_ADMINLIST_ENABLED: ${WEAVIATE_AUTHORIZATION_ADMINLIST_ENABLED:-true}
      AUTHORIZATION_ADMINLIST_USERS: ${WEAVIATE_AUTHORIZATION_ADMINLIST_USERS:-hello@dify.ai}
    networks:
      - ssrf_proxy_network
      - default

and adding no_proxy to the api and worker services (the worker is shown here):

worker:
    image: langgenius/dify-api:0.11.0
    restart: always
    environment:
      # Use the shared environment variables.
      <<: *shared-api-worker-env
      # Startup mode, 'worker' starts the Celery worker for processing the queue.
      MODE: worker
      HTTP_PROXY: ${SANDBOX_HTTP_PROXY:-http://ssrf_proxy:3128}
      HTTPS_PROXY: ${SANDBOX_HTTPS_PROXY:-http://ssrf_proxy:3128}
      no_proxy: 'localhost,127.0.0.1,172.*.*.*,.local,weaviate'

Be aware that the knowledge base must be re-created: the index of a previously created knowledge base will not work when indexing is retried.
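One plausible reading of the no_proxy change is that, with HTTP_PROXY set, requests to the weaviate host were being sent to ssrf_proxy until the host name was listed as an exception. The sketch below illustrates that idea with a deliberately simplified matcher: it handles exact host names and domain suffixes only. The function bypasses_proxy and the "original" list are hypothetical, and real HTTP clients differ in the details, including whether wildcard entries such as 172.*.*.* are honored at all.

```python
def bypasses_proxy(host, no_proxy):
    """Return True if host matches an entry in a comma-separated no_proxy list.

    Simplified model: exact host names and domain suffixes only.
    Wildcards and CIDR ranges are deliberately not interpreted.
    """
    hostname = host.lower().split(":", 1)[0]  # drop an optional :port
    for entry in no_proxy.split(","):
        entry = entry.strip().lower().lstrip(".")
        if not entry:
            continue
        if hostname == entry or hostname.endswith("." + entry):
            return True
    return False


# Hypothetical list without the weaviate entry, for comparison.
original = "localhost,127.0.0.1,.local"
# The list from the worker service in this thread.
fixed = "localhost,127.0.0.1,172.*.*.*,.local,weaviate"

print(bypasses_proxy("weaviate:8080", original))  # False: request would go to the proxy
print(bypasses_proxy("weaviate:8080", fixed))     # True: request goes directly to weaviate
```

Under this model it is the explicit weaviate entry that makes the difference, so listing the service name is a safer choice than relying on an IP wildcard.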