k2-fsa / sherpa

Speech-to-text server framework with next-gen Kaldi
https://k2-fsa.github.io/sherpa
Apache License 2.0

Whisper model cannot build with tensorrt llm #584

Open billlyzhaoyh opened 2 months ago

billlyzhaoyh commented 2 months ago

I have been following the instructions at https://github.com/k2-fsa/sherpa/tree/master/triton/whisper but running the command

python3 build.py --output_dir whisper_large_v3 --use_gpt_attention_plugin --use_gemm_plugin --use_bert_attention_plugin --enable_context_fmha

results in the following error:

TypeError: DecoderModel.__init__() missing 7 required positional arguments: 'num_heads', 'hidden_size', 'ffn_hidden_size', 'encoder_num_heads', 'encoder_hidden_size', 'vocab_size', and 'dtype'
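
For readers unfamiliar with this class of error, here is a minimal sketch of how it arises. The `DecoderModel` below is a hypothetical stand-in, not the real TensorRT-LLM class, and the helper `missing_required_args` is invented for illustration. It assumes the failure comes from calling a constructor without the arguments it requires, which can happen when a script is written against a different library version than the one installed. The thread does not confirm that this is the cause.

```python
import inspect


# Hypothetical stand-in, NOT the real TensorRT-LLM class. Its constructor
# requires the seven arguments named in the error message above.
class DecoderModel:
    def __init__(self, num_heads, hidden_size, ffn_hidden_size,
                 encoder_num_heads, encoder_hidden_size, vocab_size, dtype):
        self.config = {
            "num_heads": num_heads,
            "hidden_size": hidden_size,
            "ffn_hidden_size": ffn_hidden_size,
            "encoder_num_heads": encoder_num_heads,
            "encoder_hidden_size": encoder_hidden_size,
            "vocab_size": vocab_size,
            "dtype": dtype,
        }


def missing_required_args(cls, **kwargs):
    """Return the required __init__ parameters that kwargs does not supply."""
    params = inspect.signature(cls.__init__).parameters
    return [
        name
        for name, p in params.items()
        if name != "self"
        and p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)
        and name not in kwargs
    ]


# Calling the constructor with no arguments reproduces the same kind of error.
try:
    DecoderModel()
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Inspecting the signature first shows which arguments the caller must pass.
missing = missing_required_args(DecoderModel)
print(missing)
```

Running the same kind of signature check against the installed library's class, before calling the build script, would show whether the script and the library agree on the constructor arguments.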

saad946 commented 1 month ago

Any update on this? We are facing a similar issue.

yuekaizhang commented 1 month ago

@billlyzhaoyh @saad946

Would you mind trying the pre-built image rather than building from scratch? I will fix the Dockerfile and post an update here. Then you can build the image yourself.