opea-project / GenAIExamples

Generative AI Examples is a collection of GenAI examples such as ChatQnA and Copilot, which illustrate the pipeline capabilities of the Open Platform for Enterprise AI (OPEA) project.
https://opea.dev
Apache License 2.0
172 stars · 81 forks

ChatQnA: Internal Server Error #322

Open eero-t opened 3 days ago

eero-t commented 3 days ago

Built & ran v0.6 of Xeon ChatQnA, following these instructions: https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/kubernetes/manifests/README.md

After running the verification query, changed one letter in the query message (2023 -> 2022): $ curl http://${chatqna_svc_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{"messages": "What is the revenue of Nike in 2022?"}'

And got: Internal Server Error

ChatQnA service log shows:

INFO:     10.7.106.43:47876 - "POST /v1/chatqna HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/requests/models.py", line 974, in json
    return complexjson.loads(self.text, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
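The `JSONDecodeError` at "line 1 column 1" is the classic symptom of calling `.json()` on a plain-text error body (such as a bare "Internal Server Error" page) rather than a JSON document. A minimal sketch of a defensive decode, using a hypothetical helper name (not the actual ChatQnA code):

```python
import json

def parse_upstream_body(text: str):
    """Decode an upstream microservice response body.

    A non-JSON payload such as "Internal Server Error" makes json.loads()
    raise JSONDecodeError at line 1 column 1, exactly as in the traceback
    above; convert that into a descriptive error instead.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"Upstream returned non-JSON body: {text[:200]!r}"
        ) from exc
```

With a guard like this, the gateway could report *which* upstream body failed to decode instead of surfacing a generic 500.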

Reranking service log:

  File "/home/user/.local/lib/python3.11/site-packages/langsmith/run_helpers.py", line 562, in wrapper
    function_result = run_container["context"].run(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/comps/reranks/langchain/reranking_tei_xeon.py", line 43, in reranking
    best_response = max(response_data, key=lambda response: response["score"])
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/comps/reranks/langchain/reranking_tei_xeon.py", line 43, in <lambda>
    best_response = max(response_data, key=lambda response: response["score"])
                                                            ~~~~~~~~^^^^^^^^^
TypeError: string indices must be integers, not 'str'

tei-reranking: 2024-06-25T18:14:59.914279Z ERROR rerank:predict{inputs=("What is the revenue of Nike in 2022?", ... }: text_embeddings_core::infer: core/src/infer.rs:364: Input validation error: inputs must have less than 512 tokens. Given: 545
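The two tracebacks chain together: TEI rejects the over-long input and returns an error payload instead of a score list, and the reranking code then does `response["score"]` on what is now a string, producing the `TypeError`. A hedged sketch of a shape check before picking the best result (payload shapes inferred from the logs above, not from the TEI spec):

```python
def pick_best(response_data):
    """Pick the highest-scoring rerank result, tolerating error payloads.

    On success the reranker is assumed to return a list of dicts like
    [{"index": 0, "score": 0.97}, ...]; on a validation error it returns
    something else (e.g. an error string), which the original
    max(..., key=lambda r: r["score"]) cannot index.
    """
    if not isinstance(response_data, list) or not all(
        isinstance(r, dict) and "score" in r for r in response_data
    ):
        raise ValueError(f"Unexpected rerank payload: {response_data!r}")
    return max(response_data, key=lambda r: r["score"])
```

This would turn the opaque `TypeError: string indices must be integers` into an error naming the offending payload.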

eero-t commented 2 days ago

Same error also with the Git HEAD version of everything.

lvliang-intel commented 2 days ago

TEI's limitation requires inputs to be less than 512 tokens. This issue occurs when the length of retrieved documents exceeds this limit. To address this, we can implement a workaround in the retrieval microservice to ensure that the length of retrieved documents is limited to under 512 tokens.
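One possible shape for that workaround is trimming each retrieved document to a token budget before it reaches the reranker. A rough sketch (the function name and fallback are hypothetical; a real implementation should use the reranking model's own tokenizer, since whitespace words only approximate token counts):

```python
def truncate_to_token_budget(text: str, max_tokens: int = 512, tokenizer=None) -> str:
    """Trim a retrieved document toward the reranker's 512-token limit.

    `tokenizer` is assumed to be the reranking model's tokenizer (e.g. a
    Hugging Face tokenizer with encode/decode); when none is given, fall
    back to a crude whitespace split as a stand-in.
    """
    if tokenizer is not None:
        ids = tokenizer.encode(text)[:max_tokens]
        return tokenizer.decode(ids)
    words = text.split()
    return " ".join(words[:max_tokens])
```

Note the TEI log above counts the query and document together (545 tokens), so the document budget presumably needs to subtract the query's token length from 512.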

eero-t commented 1 day ago

> TEI's limitation requires inputs to be less than 512 tokens. This issue occurs when the length of retrieved documents exceeds this limit.

Yes, that was clear from the error message.

> To address this, we can implement a workaround in the retrieval microservice to ensure that the length of retrieved documents is limited to under 512 tokens.

Thanks, I think that's acceptable, assuming it's really "... limited to the configured max number of tokens".

PS. Seeing "Internal Server Error" for trivial input variations like this indicates that testing for potential errors is not yet at the level where I would expect it to be in my own projects. :-)