NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

MMLU script raise TypeError: list indices must be integers or slices, not tuple #1822

Open DefTruth opened 1 week ago

DefTruth commented 1 week ago

System Info

8x NVIDIA L20 GPUs

Who can help?

@byshiue


Reproduction

cd /app/tensorrt_llm/examples
mkdir data; wget https://people.eecs.berkeley.edu/~hendrycks/data.tar -O data/mmlu.tar
tar -xf data/mmlu.tar -C data && mv data/data data/mmlu

mpirun --allow-run-as-root -n 8 python3 mmlu.py \
                --hf_model_dir $HF_MODELS/Qwen1.5-72B-Chat \
                --engine_dir $HF_MODELS/engine/Qwen1.5-72B-Chat/fp16/8-gpu/ \
                --data_dir "./data/mmlu" --test_trt_llm

mpirun --allow-run-as-root -n 8 python3 mmlu.py \
                --hf_model_dir $HF_MODELS/Qwen1.5-72B-Chat \
                --engine_dir $HF_MODELS/engine/Qwen1.5-72B-Chat/fp16/8-gpu/ \
                --data_dir "./data/mmlu" --test_hf

Expected behavior

no error

actual behavior

Traceback (most recent call last):
  File "/app/tensorrt_llm/examples/mmlu.py", line 427, in <module>
    main()
  File "/app/tensorrt_llm/examples/mmlu.py", line 402, in main
    cors, acc, probs = evaluate(args, subject, pipeline, dev_df, test_df)
  File "/app/tensorrt_llm/examples/mmlu.py", line 214, in evaluate
    pred = pipeline(prompt)
  File "/app/tensorrt_llm/examples/mmlu.py", line 299, in __call__
    output_ids = outputs[0, 0, input_lengths[0]:]
TypeError: list indices must be integers or slices, not tuple
  0%|          | 0/57 [00:00<?, ?it/s]

additional notes

[TensorRT-LLM] TensorRT-LLM version: 0.11.0.dev2024061100
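For context, the error itself is plain Python behavior: a torch.Tensor accepts tuple (multi-dimensional) indexing such as outputs[0, 0, n:], but a Python list does not, so the line at mmlu.py:299 only works when generate() returns a Tensor. A minimal illustration:

```python
import torch

# A Tensor accepts tuple (multi-dimensional) indexing, as mmlu.py expects:
outputs = torch.zeros(1, 1, 10)
print(outputs[0, 0, 3:].shape)  # torch.Size([7])

# A plain Python list does not, which reproduces the reported error:
try:
    [][0, 0, 3:]
except TypeError as err:
    print(err)  # list indices must be integers or slices, not tuple
```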

DefTruth commented 1 week ago

I found that right at the start some results come back as a list instead of a Tensor, which causes the indexing error:

<class 'list'>
input_lengths[0]: 343
<class 'list'>
TypeError: list indices must be integers or slices, not tuple
  0%|          | 0/57 [00:00<?, ?it/s]
  0%|          | 0/57 [00:00<?, ?it/s]
input_lengths[0]: 343
<class 'list'>
  0%|          | 0/57 [00:00<?, ?it/s]
input_lengths[0]: 343
<class 'list'>
# then the error is raised ....

# when the result is a Tensor, no error is raised
input_lengths[0]: 343
<class 'torch.Tensor'>
input_lengths[0]: 358
<class 'torch.Tensor'>
input_lengths[0]: 362
<class 'torch.Tensor'>
input_lengths[0]: 372
<class 'torch.Tensor'>
input_lengths[0]: 385
<class 'torch.Tensor'>
input_lengths[0]: 385
<class 'torch.Tensor'>
input_lengths[0]: 373
<class 'torch.Tensor'>

Sometimes trtllm returns an empty list:

input_lengths[0]: 577
<class 'list'>
[]

In the multi-GPU case, ModelRunnerCpp returns [] directly on ranks other than rank 0, which is what triggers this error:

# If we are in a multi-gpu scenario, only rank 0 continues
if not self.session.can_enqueue_requests():
    return []
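One possible workaround (a sketch only, not the repository's fix) is to guard the post-processing in mmlu.py so that ranks which receive the empty list simply skip decoding; the helper name below is hypothetical:

```python
import torch

def extract_output_ids(outputs, input_length):
    """Hypothetical guard for mmlu.py's __call__: ModelRunnerCpp.generate()
    returns [] on ranks other than 0 in a multi-GPU run, so only index into
    the result when a Tensor actually came back."""
    if not isinstance(outputs, torch.Tensor):
        return None  # non-rank-0 ranks (or an empty result) have nothing to decode
    return outputs[0, 0, input_length:]
```

The caller would then skip scoring when None comes back, leaving rank 0 to compute the MMLU accuracy as before.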
DefTruth commented 1 week ago

Running with ModelRunner instead does not raise this error, but it is much slower.
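For reference, a minimal sketch of driving generation through the pure-Python ModelRunner instead of ModelRunnerCpp; the engine path and generation arguments below are illustrative, not copied from mmlu.py:

```python
import torch
import tensorrt_llm
from tensorrt_llm.runtime import ModelRunner

# Sketch only: build the Python runner instead of ModelRunnerCpp.
runtime_rank = tensorrt_llm.mpi_rank()
runner = ModelRunner.from_dir(
    engine_dir="engine/Qwen1.5-72B-Chat/fp16/8-gpu",  # illustrative path
    rank=runtime_rank,
)

batch_input_ids = [torch.tensor([1, 2, 3], dtype=torch.int32)]  # stand-in for a tokenized prompt
outputs = runner.generate(
    batch_input_ids,
    max_new_tokens=2,  # only a short answer (the option letter) is needed for MMLU
    end_id=2,          # stand-in for tokenizer.eos_token_id
    pad_id=2,          # stand-in for tokenizer.pad_token_id
)
# As reported above, this path avoids the TypeError but is noticeably slower.
```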

DefTruth commented 6 days ago

In addition, when running the HF MMLU evaluation (--test_hf), there is another error:

Traceback (most recent call last):
  File "/app/tensorrt_llm/examples/mmlu.py", line 427, in <module>
    main()
  File "/app/tensorrt_llm/examples/mmlu.py", line 382, in main
    torch_dtype=DTYPE_STR_MAPPING[args.data_type],
AttributeError: 'Namespace' object has no attribute 'data_type'. Did you mean: 'hf_data_type'?

It needs to be changed to args.hf_data_type.
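For clarity, the fix only changes which parsed argument the dtype lookup uses; a self-contained sketch (the mapping contents are an assumption, mirroring what DTYPE_STR_MAPPING is expected to hold in examples/mmlu.py):

```python
import torch

# Assumed stand-in for DTYPE_STR_MAPPING in examples/mmlu.py.
DTYPE_STR_MAPPING = {
    "fp16": torch.float16,
    "bf16": torch.bfloat16,
    "fp32": torch.float32,
}

hf_data_type = "fp16"                          # stands in for args.hf_data_type
torch_dtype = DTYPE_STR_MAPPING[hf_data_type]  # was: DTYPE_STR_MAPPING[args.data_type]
print(torch_dtype)                             # torch.float16
```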

hijkzzz commented 6 days ago

This is a bug.

DefTruth commented 5 days ago

> This is a bug.

Any plan to fix it?