NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

Wrong output when input is packed in Whisper with C++ runtime #2272

Open sasikr2 opened 3 weeks ago

sasikr2 commented 3 weeks ago

System Info

CPU Architecture: x86_64
GPU: NVIDIA A100-SXM4-40GB

TensorRT-LLM version: 0.14.0.dev2024091700

Who can help?

No response

Reproduction

Steps to reproduce:

  1. Build the encoder and decoder with the same command mentioned in the repo, using trtllm-build:

     trtllm-build --checkpoint_dir /stream_whisper/latest_build_dir/models/trtllm_checkpoint_v12/encoder \
         --output_dir /stream_whisper/latest_build_dir/models/whisper_large_v3/encoder \
         --input_timing_cache /stream_whisper/latest_build_dir/encoder_whisper.cache \
         --moe_plugin disable \
         --enable_xqa disable \
         --max_batch_size 4 \
         --gemm_plugin disable \
         --bert_attention_plugin float16 \
         --max_input_len 3000 --max_seq_len=3000

  2. Run the test script test_run.txt. Audio sample: a 12-second English file.
  3. Padded input: change line number 509 to mels, mels_input_len = prepare_inputs(files, input_type="padded").
  4. Packed input: change line number 509 to mels, mels_input_len = prepare_inputs(files, input_type="packed") (see the sketch after this list).
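
For reference, here is a rough sketch of what the two layouts in steps 3 and 4 might look like. This is not the code from test_run.txt; the tensor shapes, the target_frames value, and the internals of prepare_inputs are assumptions for illustration only.

```python
import torch

def prepare_inputs(mel_list, input_type="padded", target_frames=3000):
    """mel_list: per-file log-mel tensors of shape (n_mels, n_frames) (assumed)."""
    # Per-file frame counts; for packed input the runtime needs these to know
    # where each audio segment ends.
    mels_input_len = torch.tensor([m.shape[1] for m in mel_list], dtype=torch.int32)
    if input_type == "padded":
        # One batch row per file, zero-padded on the time axis to a common length.
        padded = [torch.nn.functional.pad(m, (0, target_frames - m.shape[1]))
                  for m in mel_list]
        mels = torch.stack(padded)                      # (batch, n_mels, target_frames)
    else:  # "packed"
        # All files concatenated along the time axis with no padding in between.
        mels = torch.cat(mel_list, dim=1).unsqueeze(0)  # (1, n_mels, sum of frames)
    return mels, mels_input_len
```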

Expected behavior

Expected output should be: Output: ['So basically what I observed is that word error rate are very high for Chinese language but character error rate seems to be good. Higher amplitude the WR is degrading and']

Actual behavior

When passing packed audio segments, the output comes out empty, while it should match the output for the padded input.

Additional notes

Could you check the script once, specifically the way the packed input is sent? Or is it an issue in the C++ binding?

yuekaizhang commented 1 week ago

@sasikr2 Would you mind trying again with today's (10/15/2024) commit? There are updates under whisper/readme.md about a different padding strategy. However, for the official Whisper, you can't remove the 30 s padding; otherwise you would lose accuracy.
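
For context, the 30 s constraint comes from how the official Whisper checkpoints were trained: audio is resampled to 16 kHz and every window covers exactly 30 seconds, which is where the 3000 mel frames in the build command above come from. A minimal sketch of that padding, using a hypothetical helper (not code from this repo):

```python
import numpy as np

SAMPLE_RATE = 16000      # Whisper expects 16 kHz mono audio
CHUNK_SECONDS = 30       # the official checkpoints were trained on 30 s windows

def pad_to_30s(audio: np.ndarray) -> np.ndarray:
    """Zero-pad (or truncate) a mono waveform to exactly 30 s before
    computing the log-mel features."""
    target = SAMPLE_RATE * CHUNK_SECONDS
    if audio.shape[0] >= target:
        return audio[:target]
    return np.pad(audio, (0, target - audio.shape[0]))
```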

yuekaizhang commented 1 week ago

See https://pypi.org/project/tensorrt-llm/0.15.0.dev2024101500/.
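
If it helps, a quick way to confirm which build is actually installed after upgrading (assuming the package exposes __version__, as recent releases do):

```python
import tensorrt_llm

# Should print something like 0.15.0.dev2024101500 after upgrading.
print(tensorrt_llm.__version__)
```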

sasikr2 commented 1 week ago

Okay, I will try today with the updated code.