cvpaperchallenge / Crux

Crux is a suite of LLM-empowered summarization and retrieval services for academic activities. Crux is developed by the XCCV group of cvpaper.challenge.
MIT License

Enhancement/add rag basics #29

Closed YoshikiKubotani closed 4 months ago

YoshikiKubotani commented 5 months ago

Issue URL

NA

Change overview

How to test

1. Prerequisite

  1. Prepare the API keys for OpenAI API, Mathpix OCR, and Qdrant Cloud
  2. Create your own environments/.env by referring to environments/.env.sample, and specify the keys above as environment variables
  3. Download the OCR-ed paper data from here and place it as below.

    ├── data/
    │    │
    │    ├── test_papers/
    │    │    │
    │    │    ├── 01_Action-Conditioned_3D_Human_Motion_Synthesis_with_Transformer_VAE/
    │    │    │    │
    │    │    │    ├── 01_Action-Conditioned_3D_Human_Motion_Synthesis_with_Transformer_VAE_mathpix.txt
    │    │    │    │
    │    │    │    └── 01_Action-Conditioned_3D_Human_Motion_Synthesis_with_Transformer_VAE.pdf
    │    │    │
    │    │    └── 03_Generative_Adversarial_Graph_Convolutional_Networks_for_Human_Action_Synthesis/
    │    │         │
    │    │         ├── 03_Generative_Adversarial_Graph_Convolutional_Networks_for_Human_Action_Synthesis_mathpix.txt
    │    │         │
    │    │         └── 03_Generative_Adversarial_Graph_Convolutional_Networks_for_Human_Action_Synthesis.pdf
    │    │
    │    └── README.md
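For step 2 of the prerequisites, the following is a hypothetical sketch of what environments/.env might look like. The actual variable names are assumptions and should be copied from environments/.env.sample.

```shell
# Hypothetical environments/.env — variable names are illustrative only;
# take the real ones from environments/.env.sample.
OPENAI_API_KEY=sk-...
MATHPIX_APP_ID=your-mathpix-app-id
MATHPIX_APP_KEY=your-mathpix-app-key
QDRANT_CLOUD_URL=https://xxxx.cloud.qdrant.io
QDRANT_API_KEY=your-qdrant-api-key
```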

2. Local Test

Because the Dockerfile has been modified, the containers need to be rebuilt and restarted once.

  1. Remove the existing containers
# Move to the directory that has `docker-compose.yaml`
~/Crux$ cd environments/cpu

# Remove the existing docker containers
~/Crux/environments/cpu$ docker compose down
  2. Boot up the containers without using the cache
# Re-build docker images without using cache
~/Crux/environments/cpu$ docker compose build --no-cache

# Boot up docker containers
~/Crux/environments/cpu$ docker compose up -d

All the steps below should be executed inside the container.

  3. Update Python libraries inside the container
# Update the poetry.lock and install the updated python libraries
~/crux-backend$ poetry update
  4. Run the backend server
~/crux-backend$ poetry run make run-backend
poetry run gunicorn 'src.main:app' -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8000 -t 6000 --log-level info
[2024-04-01 02:51:06 +0000] [39523] [INFO] Starting gunicorn 21.2.0
[2024-04-01 02:51:06 +0000] [39523] [INFO] Listening at: http://0.0.0.0:8000 (39523)
[2024-04-01 02:51:06 +0000] [39523] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2024-04-01 02:51:06 +0000] [39539] [INFO] Booting worker with pid: 39539
[2024-04-01 02:51:06 +0000] [39539] [INFO] Started server process [39539]
[2024-04-01 02:51:06 +0000] [39539] [INFO] Waiting for application startup.
[2024-04-01 02:51:06 +0000] [39539] [INFO] Application startup complete.
  5. Check that the RAG-based chat logic works properly at http://localhost:8000/docs

    The path parameters (user_id and conversation_id) are used to treat each user's conversations separately and to provide distinct chat histories.

    If you get responses like the ones below, the logic works correctly.

3. Test on AWS

I deployed it as a serverless application to AWS Lambda. You can verify its operation by sending a request to the Lambda function's URL. Calling the Lambda through a function URL requires a signature generated with AWS Signature Version 4, but generating the signature by hand is complicated and time-consuming, so using a tool like Postman is recommended (link).

The command below is one example of calling a Lambda function with the basic curl command (link). Note: I did NOT verify that the command below works without problems, as I used Postman.

curl --location '<LAMBDA_FUNCTION_URL>/chat/1/1' \
--header 'Content-Type: application/json' \
--aws-sigv4 "aws:amz:<REGION>:lambda" \
--user "<AWS_ACCESS_KEY_ID>:<AWS_SECRET_ACCESS_KEY>" \
--data '{
  "query": "motion synthesisとはどのようなタスクを指すのでしょうか?",
  "qdrant_type": "cloud",
  "openai_api_key": "<OPENAI_API_KEY>",
  "qdrant_cloud_url": "<QDRANT_CLOUD_URL>",
  "qdrant_api_key": "<QDRANT_API_KEY>"
}'
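As background on the `--aws-sigv4` option above: curl computes the request signature following the AWS Signature Version 4 scheme. Below is a minimal, stdlib-only sketch of the signing-key derivation and final signature step, for illustration only; the function and variable names are mine, and real clients should rely on curl's built-in support or an AWS SDK rather than this sketch.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step in the SigV4 key-derivation chain."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """SigV4 key derivation: chained HMACs over date, region, and service."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)  # date as YYYYMMDD
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(secret_key: str, date: str, region: str, service: str, string_to_sign: str) -> str:
    """Hex-encoded HMAC of the string-to-sign with the derived signing key."""
    key = derive_signing_key(secret_key, date, region, service)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

The string-to-sign itself is built from a hash of the canonical request (HTTP method, path, headers, payload), which is the part that tools like Postman automate for you.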

The following summarizes the steps to deploy the serverless application using AWS Lambda. Testing with the command above is sufficient; I've written the steps down here for sharing and record-keeping purposes.

  1. Install the AWS CLI on the host PC (link). Note: all commands in the following steps should be executed on your host PC, not inside the container.

  2. Authenticate the Docker client against the Amazon ECR registry (this issues a token valid for 12 hours and performs docker login; it must be executed again after the token expires).

~$ aws ecr get-login-password --region <REGION> | docker login --username AWS --password-stdin <AWS_ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<ECR_REPOSITORY_NAME>
  3. Build environments/deploy/Dockerfile.backend to create a Docker image.
~/Crux$ docker build -f environments/deploy/Dockerfile.backend -t <AWS_ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<ECR_REPOSITORY_NAME>:<ARBITRARY_TAG> .
  4. Upload the created image to Amazon ECR.
~/Crux$ docker push <AWS_ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/<ECR_REPOSITORY_NAME>:<ARBITRARY_TAG>
  5. On the AWS Lambda page, create a function from the container image uploaded to Amazon ECR (link).

  6. Change the Lambda function's timeout setting and the amount of memory to be provisioned as shown in the screenshot.

    (Screenshot 2024-04-03 164241: Lambda timeout and memory settings)

  7. Generate the function URL for the created function (link). Note: since I wanted only authorized persons to access it, I set the Auth Type to AWS_IAM.

Note for reviewers

[!Note] The CD process using GitHub Actions has not been implemented yet, but there are plans to introduce it in the future.

[!Important] See the "Note for reviewers" section of #30

YoshikiKubotani commented 5 months ago

I conducted an additional experiment comparing the responses with and without the chat history (LLM model: gpt-3.5-turbo-0125).

Without the chat history

query: "What kind of task does motion synthesis refer to?"
response: "Motion synthesis refers to the generation of comprehensible spatial and temporal motion of the human body skeleton. Current methods extract structural and dynamic patterns from manually structured sequences or graph-based structures. These representations are used to synthesize actions via autoregressive models (to learn time-series dependencies) or generative models (to learn probability distributions)."

query: "What other methods are there?"
response: "There are also methods such as self-supervised learning and reinforcement learning. In self-supervised learning, a model learns from data it has generated itself. In reinforcement learning, on the other hand, the model learns through interaction with an environment so as to maximize a reward. These methods can also help with motion generation and improving sample quality."

query: "Thank you. Could you summarize your answers so far?"
response: "Of course. Based on the provided information, it can be summarized as follows:\n\n1. Additional experiments (Section A): Additional experiments were conducted and their results are reported.\n2. Additional qualitative results (Section B): Additional qualitative results are provided.\n3. Implementation details (Section C): The implementation details are explained.\n\nPlease let me know if you have any other questions."

With the chat history

query: "What kind of task does motion synthesis refer to?"
response: "Motion synthesis refers to the generation of spatial and temporal motion of the human body skeleton. Current methods extract structural and dynamic patterns from manually structured sequences or graph-based structures and use these representations to synthesize actions. This is done with autoregressive models (to learn time-series dependencies) or generative models (to learn probability distributions)."

query: "What other methods are there?"
response: "There are various methods for motion synthesis. As autoregressive methods, recurrent neural networks such as LSTMs and GRUs are widely used. As generative models, Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) are also employed. In addition, newer methods such as Transformer models and neural radiance fields are sometimes adopted. These methods are used to address the various challenges in motion synthesis."

query: "Thank you. Could you summarize your answers so far?"
response: "Motion synthesis refers to the generation of spatial and temporal motion of the human body skeleton. Current methods extract structural and dynamic patterns from manually structured sequences or graph-based structures and use these representations to synthesize actions, with autoregressive models (to learn time-series dependencies) or generative models (to learn probability distributions). Methods such as recurrent neural networks (LSTMs, GRUs), Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Transformer models, and neural radiance fields are used in motion synthesis, and are employed to address its various challenges."
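The difference above comes from threading prior turns back into the prompt for each conversation. A minimal, hypothetical sketch of that idea (not the actual Crux implementation) with an in-memory history keyed the same way as the chat endpoint's path parameters, user_id and conversation_id:

```python
from collections import defaultdict

# Hypothetical in-memory store; a real service would persist history elsewhere.
_histories: dict[tuple[int, int], list[dict[str, str]]] = defaultdict(list)


def build_prompt(user_id: int, conversation_id: int, query: str, context: str) -> str:
    """Assemble a RAG prompt that includes this conversation's prior turns."""
    history = _histories[(user_id, conversation_id)]
    lines = [f"{turn['role']}: {turn['content']}" for turn in history]
    lines.append(f"user: {query}")
    return f"Context:\n{context}\n\nConversation:\n" + "\n".join(lines)


def record_turn(user_id: int, conversation_id: int, query: str, response: str) -> None:
    """Append the latest exchange so follow-up questions can refer back to it."""
    key = (user_id, conversation_id)
    _histories[key].append({"role": "user", "content": query})
    _histories[key].append({"role": "assistant", "content": response})
```

With this structure, "What other methods are there?" is preceded in the prompt by the earlier exchange, so the model can resolve what "other" refers to; without record_turn, each query is answered in isolation.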
gatheluck commented 4 months ago

I faced the following error during the local test. Could you double-check whether the file data/templates/question_making_template.txt is needed or not?

error detail

    [2024-04-13 06:42:41 +0000] [58] [ERROR] Exception in ASGI application
    Traceback (most recent call last):
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
        result = await app(  # type: ignore[func-returns-value]
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
        return await self.app(scope, receive, send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
        await super().__call__(scope, receive, send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
        await self.middleware_stack(scope, receive, send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
        raise exc
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
        await self.app(scope, receive, _send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/middleware/cors.py", line 91, in __call__
        await self.simple_response(scope, receive, send, request_headers=headers)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/middleware/cors.py", line 146, in simple_response
        await self.app(scope, receive, send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
        await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
        raise exc
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
        await app(scope, receive, sender)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/routing.py", line 758, in __call__
        await self.middleware_stack(scope, receive, send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/routing.py", line 778, in app
        await route.handle(scope, receive, send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/routing.py", line 299, in handle
        await self.app(scope, receive, send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/routing.py", line 79, in app
        await wrap_app_handling_exceptions(app, request)(scope, receive, send)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
        raise exc
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
        await app(scope, receive, sender)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/starlette/routing.py", line 74, in app
        response = await func(request)
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/fastapi/routing.py", line 278, in app
        raw_response = await run_endpoint_function(
      File "/home/challenger/crux-backend/.venv/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
        return await dependant.call(**values)
      File "/home/challenger/crux-backend/src/routers.py", line 47, in chat
        response = await chat_service.generate_response_with_rag_based_chat_service(
      File "/home/challenger/crux-backend/src/domain/services/chat_service.py", line 125, in generate_response_with_rag_based_chat_service
        self._render_question_making_prompt(self.template_dir)
      File "/home/challenger/crux-backend/src/domain/services/chat_service.py", line 292, in _render_question_making_prompt
        with open(template_dir / "question_making_template.txt", "r") as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'data/templates/question_making_template.txt'