Closed gz-simon closed 1 month ago
The Full-Text Search feature in Dify version 0.6.14 may be returning empty results when self-hosted using Docker due to a potential issue with the Weaviate container. The deployment logs show that the Weaviate container is running, but there might be a configuration or connectivity issue specific to the self-hosted environment.
To troubleshoot, ensure that the Weaviate container is properly configured and accessible. You can check the logs of the Weaviate container for any errors:
docker logs docker-weaviate-1
Additionally, verify that the Weaviate service is correctly integrated with the Dify application and that it has access to the necessary data for indexing and searching [1].
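When scanning those logs, it can help to filter out the routine entries. Here is a minimal sketch, assuming the container writes one JSON object per line with a `level` field, which is the format of the Weaviate output pasted later in this thread. The function name `non_info_entries` is hypothetical and not part of Dify or Weaviate.

```python
import json


def non_info_entries(log_text):
    """Return log entries whose level is not 'info'.

    Lines that are not JSON objects are kept too, tagged with the
    level 'unparsed', so that plain-text errors are not silently lost.
    """
    flagged = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            entry = None
        if not isinstance(entry, dict):
            flagged.append({"level": "unparsed", "msg": line})
            continue
        if entry.get("level") != "info":
            flagged.append(entry)
    return flagged
```

Pass it the captured output of `docker logs docker-weaviate-1` (both stdout and stderr). An empty list means every line was an info-level entry, so the cause probably lies elsewhere.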
To continue talking to Dosu, mention @dosu.
@dosu
PS E:\dify2\dify\docker> docker logs docker-weaviate-1
{"action":"startup","default_vectorizer_module":"none","level":"info","msg":"the default vectorizer modules is set to \"none\", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer","time":"2024-07-19T10:12:41Z"}
{"action":"startup","auto_schema_enabled":true,"level":"info","msg":"auto schema enabled setting is set to \"true\"","time":"2024-07-19T10:12:41Z"}
{"action":"grpc_startup","level":"info","msg":"grpc server listening at [::]:50051","time":"2024-07-19T10:12:42Z"}
{"action":"restapi_management","level":"info","msg":"Serving weaviate at http://[::]:8080","time":"2024-07-19T10:12:42Z"}
{"action":"hnsw_vector_cache_prefill","count":25000,"index_id":"vector_index_64401738_1db4_4ee0_8aac_9f156568f022_node_QgLe6uhrio1J","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2024-07-19T10:18:02Z","took":960441}
{"action":"restapi_management","level":"info","msg":"Shutting down... ","time":"2024-07-22T01:33:46Z"}
{"action":"restapi_management","level":"info","msg":"Stopped serving weaviate at http://[::]:8080","time":"2024-07-22T01:33:46Z"}
{"action":"startup","default_vectorizer_module":"none","level":"info","msg":"the default vectorizer modules is set to \"none\", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer","time":"2024-07-22T01:36:47Z"}
{"action":"startup","auto_schema_enabled":true,"level":"info","msg":"auto schema enabled setting is set to \"true\"","time":"2024-07-22T01:36:47Z"}
{"action":"hnsw_vector_cache_prefill","count":50000,"index_id":"vector_index_64401738_1db4_4ee0_8aac_9f156568f022_node_QgLe6uhrio1J","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2024-07-22T01:36:49Z","took":2111166}
{"action":"grpc_startup","level":"info","msg":"grpc server listening at [::]:50051","time":"2024-07-22T01:36:49Z"}
{"action":"restapi_management","level":"info","msg":"Serving weaviate at http://[::]:8080","time":"2024-07-22T01:36:49Z"}
PS E:\dify2\dify\docker>
The issue with the Full-Text Search feature in Dify version 0.6.14 when self-hosted using Docker is likely due to the default vectorizer module being set to "none" in the Weaviate container. This means that new schema classes without an explicit vectorizer setting will not use any vectorizer, which is essential for full-text search functionality.
To resolve this issue, you need to configure Weaviate to use an appropriate vectorizer module. Here is how you can do it:
Edit the Weaviate Configuration: Update the Weaviate configuration to use a vectorizer module. You can do this by modifying the docker-compose.yml file or the Weaviate configuration file.
Set the Vectorizer Module: Ensure that the vectorizer module is set to a suitable value, such as text2vec-contextionary.
Here is an example of how you might update the docker-compose.yml file to include the vectorizer module:
version: '3.7'
services:
  weaviate:
    image: semitechnologies/weaviate:1.18.4
    environment:
      - QUERY_DEFAULTS_LIMIT=20
      - AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true
      - PERSISTENCE_DATA_PATH=/var/lib/weaviate
      - DEFAULT_VECTORIZER_MODULE=text2vec-contextionary
      - ENABLE_MODULES=text2vec-contextionary
    ports:
      - "8080:8080"
    volumes:
      - weaviate_data:/var/lib/weaviate
volumes:
  weaviate_data:
docker compose down
docker compose up -d
By setting the DEFAULT_VECTORIZER_MODULE to text2vec-contextionary and enabling the module, Weaviate will use the specified vectorizer for new schema classes, which should resolve the issue with the full-text search returning empty results [1].
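A setup like this can be sanity-checked before restarting. The sketch below is an illustration under stated assumptions, not Weaviate's own validation: it assumes that the default vectorizer must also be listed in ENABLE_MODULES, and that text2vec-contextionary needs a CONTEXTIONARY_URL pointing at a separate contextionary service. `check_vectorizer_env` is a hypothetical helper name.

```python
def check_vectorizer_env(env):
    """Return a list of likely problems in a Weaviate environment mapping.

    `env` maps variable names to string values, e.g. the `environment`
    section of a compose file loaded into a dict.
    """
    problems = []
    default = env.get("DEFAULT_VECTORIZER_MODULE", "none")
    # ENABLE_MODULES is a comma-separated list of module names.
    enabled = [m.strip() for m in env.get("ENABLE_MODULES", "").split(",") if m.strip()]

    # Assumption: a non-"none" default vectorizer has to be enabled as a module.
    if default != "none" and default not in enabled:
        problems.append(
            f"DEFAULT_VECTORIZER_MODULE={default} is not listed in ENABLE_MODULES"
        )
    # Assumption: the contextionary module talks to a separate service.
    if "text2vec-contextionary" in enabled and not env.get("CONTEXTIONARY_URL"):
        problems.append("text2vec-contextionary is enabled but CONTEXTIONARY_URL is not set")
    return problems
```

Run against the environment from the example above, this check would report the missing CONTEXTIONARY_URL, which suggests the example also needs a contextionary service alongside Weaviate.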
Please confirm that you are using Qdrant as your vector DB, with the image tag langgenius/qdrant. @gz-simon
@dosu
weaviate:
  image: semitechnologies/weaviate:1.25.8
  ports:
  PERSISTENCE_DATA_PATH: ${WEAVIATE_PERSISTENCE_DATA_PATH:-/var/lib/weaviate}
  QUERY_DEFAULTS_LIMIT: ${WEAVIATE_QUERY_DEFAULTS_LIMIT:-25}
  CONTEXTIONARY_URL: contextionary:9999
  AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
  ENABLE_MODULES: 'text2vec-contextionary'
  DEFAULT_VECTORIZER_MODULE: 'text2vec-contextionary'
  CLUSTER_HOSTNAME: ${WEAVIATE_CLUSTER_HOSTNAME:-node1}
  CLUSTER_GOSSIP_BIND_PORT: '7100'
  CLUSTER_DATA_BIND_PORT: '7101'
  AUTHENTICATION_APIKEY_ENABLED: ${WEAVIATE_AUTHENTICATION_APIKEY_ENABLED:-true}
  AUTHENTICATION_APIKEY_ALLOWED_KEYS: ${WEAVIATE_AUTHENTICATION_APIKEY_ALLOWED_KEYS:-WVF5YThaHlkYwhGUSmCRgsX3tD5ngdN8pkih}
  AUTHENTICATION_APIKEY_USERS: ${WEAVIATE_AUTHENTICATION_APIKEY_USERS:-hello@dify.ai}
  AUTHORIZATION_ADMINLIST_ENABLED: ${WEAVIATE_AUTHORIZATION_ADMINLIST_ENABLED:-true}
  AUTHORIZATION_ADMINLIST_USERS: ${WEAVIATE_AUTHORIZATION_ADMINLIST_USERS:-hello@dify.ai}
contextionary:
  image: semitechnologies/contextionary:en0.16.0-v1.2.1
  ports:
The issue with the Full-Text Search feature in Dify version 0.6.14 when self-hosted using Docker is likely due to the DEFAULT_VECTORIZER_MODULE being set to "none" in the Weaviate container. However, based on your provided configuration, it seems you have already set the DEFAULT_VECTORIZER_MODULE to text2vec-contextionary.
To further troubleshoot, please ensure the following:
Confirm that Weaviate is loading the text2vec-contextionary module by checking the logs for any errors or warnings related to the vectorizer module.
docker logs docker-weaviate-1
Schema Configuration: Ensure that the schema classes in Weaviate are explicitly set to use the text2vec-contextionary vectorizer. You can do this by checking the schema configuration in Weaviate.
Data Indexing: Confirm that the data has been correctly indexed using the text2vec-contextionary vectorizer. You can re-index the data if necessary.
Network Configuration: Ensure that the contextionary service is accessible from the Weaviate container. You can test this by running a network check from within the Weaviate container.
docker exec -it docker-weaviate-1 curl http://contextionary:9999
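The schema check in the list above can be scripted against Weaviate's REST endpoint `GET /v1/schema`, which returns a `classes` list where each entry carries a `class` name and a `vectorizer` setting. This is a rough sketch: the base URL and API key are placeholders for your own deployment, and both function names are hypothetical.

```python
import json
import urllib.request


def fetch_schema(base_url, api_key=None):
    """Fetch the schema from a Weaviate instance (base_url is your own host and port)."""
    request = urllib.request.Request(base_url.rstrip("/") + "/v1/schema")
    if api_key:
        # API-key authentication is sent as a bearer token.
        request.add_header("Authorization", "Bearer " + api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def vectorizers_by_class(schema):
    """Map each class name in a schema response to its configured vectorizer."""
    return {
        cls["class"]: cls.get("vectorizer")
        for cls in schema.get("classes", [])
    }
```

For example, `vectorizers_by_class(fetch_schema("http://localhost:8080", api_key="..."))` lists what each existing class uses, assuming the REST port is published on localhost:8080. Classes created before the configuration change keep the vectorizer they were created with, which is why re-indexing may be needed.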
If the issue persists, please provide the logs from the Weaviate container after restarting it with the updated configuration. This will help identify any specific errors or misconfigurations.
Weaviate doesn't support full-text search; please use Qdrant instead. Thanks, @gz-simon
Qdrant does not support full-text search for numerical data.
Self Checks
Dify version
0.6.14
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
When I deployed Dify with Docker, embedded a document, and then ran the recall test, the full-text search returned empty results. The same operation works normally in the cloud version.
✔️ Expected Behavior
No response
❌ Actual Behavior
No response