opea-project / GenAIExamples

Generative AI Examples is a collection of GenAI examples, such as ChatQnA and Copilot, which illustrate the pipeline capabilities of the Open Platform for Enterprise AI (OPEA) project.
https://opea.dev
Apache License 2.0

[Bug] InternalServerError by Dataprep Redis service #723

Closed arun-gupta closed 1 month ago

arun-gupta commented 2 months ago

Priority

Undecided

OS type

Other (Please let us know in description)

Hardware type

Xeon-SPR

Installation method

Deploy method

Running nodes

Single Node

What's the version?

Amazon Linux 2023 AMI; v0.9 of the Docker images

Description

Following the steps at https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/docker/xeon

The DataPrep microservice returns an Internal Server Error when the PDF is uploaded.

The file exists on the local file system.

Reproduce steps

Get all the services running by following the steps at https://gist.github.com/arun-gupta/7e9f080feff664fbab878b26d13d83d7

Raw log

[ec2-user@ip-172-31-77-194 ~]$ ls -la
total 204
drwx------. 4 ec2-user ec2-user    179 Sep  3 18:25 .
drwxr-xr-x. 3 root     root         22 Sep  3 17:18 ..
-rw-------. 1 ec2-user ec2-user    543 Sep  3 17:29 .bash_history
-rw-r--r--. 1 ec2-user ec2-user     18 Jan 28  2023 .bash_logout
-rw-r--r--. 1 ec2-user ec2-user    141 Jan 28  2023 .bash_profile
-rw-r--r--. 1 ec2-user ec2-user    492 Jan 28  2023 .bashrc
-rw-r--r--. 1 ec2-user ec2-user    930 Sep  3 17:22 .env
drwx------. 2 ec2-user ec2-user     29 Sep  3 17:18 .ssh
-rw-------. 1 ec2-user ec2-user    917 Sep  3 17:22 .viminfo
-rw-r--r--. 1 ec2-user ec2-user   5523 Sep  3 17:22 compose.yaml
drwxr-xr-x. 7 root     root        169 Sep  3 17:27 data
-rw-r--r--. 1 ec2-user ec2-user 173976 Sep  3 18:25 nke-10k-2023.pdf
[ec2-user@ip-172-31-77-194 ~]$ curl -X POST "http://${host_ip}:6007/v1/dataprep"      -H "Content-Type: multipart/form-data"      -F "files=@./nke-10k-2023.pdf"
Internal Server Error
[ec2-user@ip-172-31-77-194 ~]$ sudo docker logs dataprep-redis-server
/home/user/.local/lib/python3.11/site-packages/pydantic/_internal/_fields.py:161: UserWarning: Field "model_name_or_path" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
/home/user/.local/lib/python3.11/site-packages/langchain/__init__.py:30: UserWarning: Importing LLMChain from langchain root module is no longer supported. Please use langchain.chains.LLMChain instead.
  warnings.warn(
/home/user/.local/lib/python3.11/site-packages/langchain/__init__.py:30: UserWarning: Importing PromptTemplate from langchain root module is no longer supported. Please use langchain_core.prompts.PromptTemplate instead.
  warnings.warn(
[2024-09-03 17:27:52,246] [    INFO] - Base service - CORS is enabled.
[2024-09-03 17:27:52,247] [    INFO] - Base service - Setting up HTTP server
[2024-09-03 17:27:52,247] [    INFO] - Base service - Uvicorn server setup on port 6007
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:6007 (Press CTRL+C to quit)
[2024-09-03 17:27:52,251] [    INFO] - Base service - HTTP server setup successful
[2024-09-03 18:30:12,244] [    INFO] - prepare_doc_redis - [ upload ] File nke-10k-2023.pdf does not exist.
INFO:     172.31.77.194:43302 - "POST /v1/dataprep HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.11/site-packages/pymupdf/__init__.py", line 2806, in __init__
    doc = mupdf.fz_open_document(filename)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/pymupdf/mupdf.py", line 44273, in fz_open_document
    return _mupdf.fz_open_document(filename)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pymupdf.mupdf.FzErrorFormat: code=7: no objects found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/.local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/user/.local/lib/python3.11/site-packages/prometheus_fastapi_instrumentator/middleware.py", line 174, in __call__
    raise exc
  File "/home/user/.local/lib/python3.11/site-packages/prometheus_fastapi_instrumentator/middleware.py", line 172, in __call__
    await self.app(scope, receive, send_wrapper)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 754, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 774, in app
    await route.handle(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 295, in handle
    await self.app(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/comps/dataprep/redis/langchain/prepare_doc_redis.py", line 262, in ingest_documents
    ingest_data_to_redis(
  File "/home/user/comps/dataprep/redis/langchain/prepare_doc_redis.py", line 197, in ingest_data_to_redis
    content = document_loader(path)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/comps/dataprep/utils.py", line 332, in document_loader
    return load_pdf(doc_path)
           ^^^^^^^^^^^^^^^^^^
  File "/home/user/comps/dataprep/utils.py", line 111, in load_pdf
    doc = fitz.open(pdf_path)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/pymupdf/__init__.py", line 2809, in __init__
    raise FileDataError(f'Failed to open file {filename!r}.') from e
pymupdf.FileDataError: Failed to open file './uploaded_files/nke-10k-2023.pdf'.
[2024-09-03 18:43:24,642] [    INFO] - prepare_doc_redis - [ upload ] File nke-10k-2023.pdf does not exist.
INFO:     172.31.77.194:36982 - "POST /v1/dataprep HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.11/site-packages/pymupdf/__init__.py", line 2806, in __init__
    doc = mupdf.fz_open_document(filename)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/pymupdf/mupdf.py", line 44273, in fz_open_document
    return _mupdf.fz_open_document(filename)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pymupdf.mupdf.FzErrorFormat: code=7: no objects found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/.local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/home/user/.local/lib/python3.11/site-packages/prometheus_fastapi_instrumentator/middleware.py", line 174, in __call__
    raise exc
  File "/home/user/.local/lib/python3.11/site-packages/prometheus_fastapi_instrumentator/middleware.py", line 172, in __call__
    await self.app(scope, receive, send_wrapper)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 754, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 774, in app
    await route.handle(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 295, in handle
    await self.app(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/home/user/.local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/home/user/.local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
    response = await f(request)
               ^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/fastapi/routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/comps/dataprep/redis/langchain/prepare_doc_redis.py", line 262, in ingest_documents
    ingest_data_to_redis(
  File "/home/user/comps/dataprep/redis/langchain/prepare_doc_redis.py", line 197, in ingest_data_to_redis
    content = document_loader(path)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/comps/dataprep/utils.py", line 332, in document_loader
    return load_pdf(doc_path)
           ^^^^^^^^^^^^^^^^^^
  File "/home/user/comps/dataprep/utils.py", line 111, in load_pdf
    doc = fitz.open(pdf_path)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/pymupdf/__init__.py", line 2809, in __init__
    raise FileDataError(f'Failed to open file {filename!r}.') from e
pymupdf.FileDataError: Failed to open file './uploaded_files/nke-10k-2023.pdf'.

feng-intel commented 2 months ago

pymupdf.FileDataError: Failed to open file './uploaded_files/nke-10k-2023.pdf'.

Can you check whether the path './uploaded_files/nke-10k-2023.pdf' exists in your Docker container?

arun-gupta commented 2 months ago

Can you provide the command for checking this path? @feng-intel

feng-intel commented 2 months ago

I suppose your Docker image was built with:

docker build --no-cache -t opea/dataprep-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/redis/langchain/docker/Dockerfile .

The Dockerfile is the one passed with -f above (comps/dataprep/redis/langchain/docker/Dockerfile). According to this Dockerfile, the workdir is "/home/user/comps/dataprep/redis/langchain". You can run

$ docker exec -it the_id_of_container_opea/dataprep-redis:latest /bin/bash

and then check whether './uploaded_files/nke-10k-2023.pdf' exists under '/home/user/comps/dataprep/redis/langchain'.
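
Beyond checking that the path exists, it may help to confirm that the stored bytes are really a PDF. MuPDF's `code=7: no objects found` can appear when the file is present but its content is not PDF data (for example an empty file or a saved HTML page), although that is only one possible reading of the log above. The following is a minimal sketch using only the Python standard library. The helper name `describe_upload` is hypothetical and not part of OPEA or PyMuPDF.

```python
from pathlib import Path

# A well-formed PDF file begins with this marker.
PDF_MAGIC = b"%PDF-"


def describe_upload(path):
    """Return a short diagnosis for the file at `path`.

    Hypothetical helper for illustration only; it inspects the first
    bytes of the file and does not parse the PDF structure.
    """
    p = Path(path)
    if not p.is_file():
        return "missing"
    with p.open("rb") as fh:
        head = fh.read(len(PDF_MAGIC))
    if not head:
        return "empty"
    if head.startswith(PDF_MAGIC):
        return "looks like a PDF"
    return "not a PDF"
```

Run from the container's workdir, `describe_upload("./uploaded_files/nke-10k-2023.pdf")` would distinguish a missing file from one that exists but holds non-PDF content. The same check can be run on the host copy before uploading it.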

yinghu5 commented 1 month ago

Hi Arun, we reorganized the OPEA paths in recent days. Could you please try the latest main branch and let us know if it works now? The nke-10k-2023.pdf path issue was fixed by #804.

feng-intel commented 1 month ago

Closing the issue. @arun-gupta, please reopen it if you still have the problem. Thanks.