Marker-Inc-Korea / AutoRAG

RAG AutoML Tool - Find optimal RAG pipeline for your own data.

vLLM terminated unexpectedly #512

Open gnekt opened 2 weeks ago

gnekt commented 2 weeks ago

Hello,

While using the ELI5 and TriviaQA datasets from the Hugging Face library, I encountered errors about documents referenced in the QA data that are not present in the corpus. I ran into a similar issue with the HotpotQA dataset but managed to resolve it by cleaning out the mismatched documents.
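
For reference, the cleaning step looked roughly like this (a sketch; it assumes the standard AutoRAG parquet layout, where qa.parquet has a retrieval_gt column of doc_id lists and corpus.parquet has a doc_id column, and the file names here are placeholders):

import pandas as pd

# Placeholder paths; point these at your actual dataset files.
qa = pd.read_parquet("qa.parquet")
corpus = pd.read_parquet("corpus.parquet")

valid_ids = set(corpus["doc_id"])

def all_ids_present(retrieval_gt):
    # retrieval_gt is a list of doc_id lists in the AutoRAG schema.
    return all(doc_id in valid_ids for group in retrieval_gt for doc_id in group)

# Keep only QA rows whose ground-truth doc_ids all exist in the corpus.
qa_clean = qa[qa["retrieval_gt"].apply(all_ids_present)].reset_index(drop=True)
qa_clean.to_parquet("qa_cleaned.parquet")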

However, when I then ran the cleaned HotpotQA dataset, I observed some strange behavior with the vLLM module. Specifically, the process is terminated without any error messages.

Thank you in advance.

vkehfdl1 commented 2 weeks ago

Hello @gnekt, first of all, thanks for the report. We'll check the ELI5 and TriviaQA datasets first. @bwook00 @Eastsidegunn will help with this as well.

As for the vLLM module terminating: it could be an OOM error. Did you check your VRAM status? Also, what config file did you use, and what system did you run each dataset on?

CristianCosci commented 2 weeks ago

What config file did you use, and what system did you run each dataset on?

CPU: AMD EPYC 7402P (48) @ 2.800GHz
GPU0: NVIDIA GeForce RTX 3090
GPU1: NVIDIA GeForce RTX 3090
Memory: 3042MiB / 128667MiB

As for the vLLM module terminating: it could be an OOM error. Did you check your VRAM status?

This is our config.yaml:

# This config YAML file does not contain any optimization.
node_lines:
- node_line_name: retrieve_node_line  # Arbitrary node line name
  nodes:
    - node_type: retrieval
      strategy:
        metrics: [retrieval_f1, retrieval_recall, retrieval_precision]
      top_k: 3
      modules:
        - module_type: vectordb
          embedding_model: huggingface_baai_bge_small
- node_line_name: post_retrieve_node_line  # Arbitrary node line name
  nodes:
    - node_type: prompt_maker
      strategy:
        metrics: [bleu, meteor, rouge]
      modules:
        - module_type: fstring
          prompt: "Read the passages and answer the given question. \n Question: {query} \n Passage: {retrieved_contents} \n Answer : "
    - node_type: generator
      strategy:
        metrics: [bleu, meteor, rouge]
      modules:
        - module_type: vllm
          llm: mistralai/Mistral-7B-Instruct-v0.2
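
We launch it through the usual Evaluator entry point (a sketch; the paths below are placeholders for our local parquet files):

from autorag.evaluator import Evaluator

# Placeholder paths; our real QA/corpus parquet files live elsewhere.
evaluator = Evaluator(qa_data_path="qa.parquet", corpus_data_path="corpus.parquet")
evaluator.start_trial("config.yaml")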

We checked the VRAM status and the embedding computation takes only about 1 GB, so an OOM error seems unlikely.
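
For reference, we spot-checked it with a small PyTorch snippet (nvidia-smi reports the same numbers):

import torch

# Print free/total VRAM for each GPU, in GiB.
for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    print(f"GPU{i}: {free / 2**30:.1f} GiB free / {total / 2**30:.1f} GiB total")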

CristianCosci commented 2 weeks ago

This is the last logger output when executing autorag:

UserWarning: This pandas object has duplicate indices, and swifter may not be able to improve performance. Consider resetting the indices with `df.reset_index(drop=True)`.
  warnings.warn(
[06/20/24 08:39:54] INFO     [evaluator.py:97] >> Running node line post_retrieve_node_line...                                    evaluator.py:97
                    INFO     [node.py:55] >> Running node prompt_maker...                                                              node.py:55
                    INFO     [base.py:20] >> Running prompt maker node - fstring module...                                             base.py:20
                    INFO     [node.py:55] >> Running node generator...                                                                 node.py:55
                    INFO     [base.py:34] >> Running generator node - vllm module...                                                   base.py:34
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
    - Avoid using `tokenizers` before the fork if possible
    - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
    - Avoid using `tokenizers` before the fork if possible
    - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
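
The tokenizers warning itself seems benign; it goes away if the variable is set before anything imports tokenizers, e.g.:

import os

# Must be set before transformers/tokenizers is imported anywhere in the process.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
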
vkehfdl1 commented 2 weeks ago

Hi @gnekt, I checked the ELI5 and TriviaQA datasets, and neither had missing doc_ids in the corpus dataset. Unfortunately, I can't reproduce the error you mentioned.

@gnekt @CristianCosci Also, about the vLLM termination: I can't reproduce this error either. Could it be caused by your vLLM install environment? (It worked fine on our system.)

Maybe reinstalling vLLM at the latest version would help. My vLLM version is 0.4.3.
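
For example, after upgrading (pip install -U vllm) you can confirm what is installed:

import vllm

# Print the installed vLLM version (mine is 0.4.3).
print(vllm.__version__)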