opea-project / GenAIExamples

Generative AI Examples is a collection of GenAI examples such as ChatQnA, Copilot, which illustrate the pipeline capabilities of the Open Platform for Enterprise AI (OPEA) project.
https://opea.dev
Apache License 2.0

[Bug] chatqna-retriever-usvc status is CrashLoopBackOff when deploying ProductivitySuite #889

Open shaohef opened 3 weeks ago

shaohef commented 3 weeks ago

Priority

P1-Stopper

OS type

Ubuntu

Hardware type

Xeon-GNR

Installation method

Deploy method

Running nodes

Single Node

What's the version?

latest

Description

The chatqna-retriever-usvc pod goes into CrashLoopBackOff when deploying ProductivitySuite.

Reproduce steps

Follow the official deployment guide.

Raw log

kubectl get pods
NAME                                           READY   STATUS             RESTARTS         AGE
chat-history-58586b84bc-h75l9                  1/1     Running            0                3h13m
chatqna-55d944bdc9-br68s                       1/1     Running            0                3h11m
chatqna-data-prep-7888b6fccc-nf9vw             1/1     Running            0                3h11m
chatqna-embedding-usvc-6556d4bbd7-9f7xk        1/1     Running            0                3h11m
chatqna-llm-uservice-589d8f9f86-mshsv          0/1     Running            28 (7m38s ago)   3h11m
chatqna-redis-vector-db-798f474769-47dfb       1/1     Running            0                3h11m
chatqna-reranking-usvc-776b485f7c-7jfb5        1/1     Running            0                3h11m
chatqna-retriever-usvc-5dd8c69cf9-p4hrt        0/1     CrashLoopBackOff   5 (119s ago)     9m23s

kubectl logs chatqna-retriever-usvc-5dd8c69cf9-p4hrt
/home/user/.local/lib/python3.11/site-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_name" in HuggingFaceInferenceAPIEmbeddings has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
/home/user/.local/lib/python3.11/site-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_name_or_path" in Audio2TextDoc has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
[2024-09-29 19:46:58,119] [    INFO] - Base service - CORS is enabled.
[2024-09-29 19:46:58,120] [    INFO] - Base service - Setting up HTTP server
[2024-09-29 19:46:58,121] [    INFO] - Base service - Uvicorn server setup on port 7000
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:7000 (Press CTRL+C to quit)

[2024-09-29 19:46:58,126] [    INFO] - Base service - HTTP server setup successful
/home/user/comps/retrievers/redis/langchain/retriever_redis.py:106: LangChainDeprecationWarning: The class `HuggingFaceHubEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 1.0. An updated version of the class exists in the langchain-huggingface package and should be used instead. To use it run `pip install -U langchain-huggingface` and import as `from langchain_huggingface import HuggingFaceEndpointEmbeddings`.
  embeddings = HuggingFaceHubEmbeddings(model=tei_embedding_endpoint)
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.11/site-packages/redis/connection.py", line 277, in connect
    sock = self.retry.call_with_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/redis/retry.py", line 62, in call_with_retry
    return do()
           ^^^^
  File "/home/user/.local/lib/python3.11/site-packages/redis/connection.py", line 278, in <lambda>
    lambda: self._connect(), lambda error: self.disconnect(error)
            ^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/redis/connection.py", line 607, in _connect
    for res in socket.getaddrinfo(
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/socket.py", line 974, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/comps/retrievers/redis/langchain/retriever_redis.py", line 111, in <module>
    vector_db = Redis(embedding=embeddings, index_name=INDEX_NAME, redis_url=REDIS_URL)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/langchain_community/vectorstores/redis/base.py", line 309, in __init__
    check_redis_module_exist(redis_client, REDIS_REQUIRED_MODULES)
  File "/home/user/.local/lib/python3.11/site-packages/langchain_community/utilities/redis.py", line 55, in check_redis_module_exist
    installed_modules = client.module_list()
                        ^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/redis/commands/core.py", line 6250, in module_list
    return self.execute_command("MODULE LIST")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/redis/client.py", line 545, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/redis/connection.py", line 1074, in get_connection
    connection.connect()
  File "/home/user/.local/lib/python3.11/site-packages/redis/connection.py", line 283, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error -3 connecting to chatqna-redis-vector-db:6379. Temporary failure in name resolution.
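
The first traceback above fails inside `socket.getaddrinfo`, before any Redis command is sent, so one way to narrow this down is to separate "the name does not resolve" from "the name resolves but the port is unreachable". The following is a minimal sketch using only the Python standard library, meant to be run from inside the affected pod or another pod in the same namespace. The function name `check_redis_host` and the `redis://` URL form are assumptions made for illustration; only the host `chatqna-redis-vector-db` and port `6379` come from the log.

```python
import socket
from urllib.parse import urlparse


def check_redis_host(redis_url: str, timeout: float = 3.0) -> str:
    """Classify reachability of the host in redis_url.

    Returns "dns-failure", "connect-failure", or "ok".
    Hypothetical helper for illustration; not part of the retriever service.
    """
    parsed = urlparse(redis_url)
    host = parsed.hostname
    port = parsed.port or 6379  # 6379 is the port shown in the error message
    if host is None:
        raise ValueError(f"no host found in URL: {redis_url!r}")

    try:
        # Same standard-library call that raises socket.gaierror in the traceback.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"

    # The name resolved; try a plain TCP connect to each returned address.
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            continue
    return "connect-failure"
```

Calling `check_redis_host("redis://chatqna-redis-vector-db:6379")` from the pod's Python interpreter would indicate whether the failure is at the name-resolution step, as the log suggests, or at the TCP connection step.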

devpramod commented 1 week ago

Hi @shaohef, this issue is being tracked in #890.