shashikg / WhisperS2T

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
MIT License
318 stars 32 forks

Can't find decoder_config.json when using tensorrt-llm large-v3 model #70

Closed Nyralei closed 3 months ago

Nyralei commented 3 months ago

I'm trying to run:

import whisper_s2t

model = whisper_s2t.load_model(model_identifier="large-v3", backend='TensorRT-LLM', asr_options={'word_timestamps': True})

files = ['1.wav']
lang_codes = ['ru']
tasks = ['transcribe']
initial_prompts = [None]

out = model.transcribe_with_vad(files,
                                lang_codes=lang_codes, # pass lang_codes for each file
                                tasks=tasks, # pass transcribe/translate 
                                initial_prompts=initial_prompts, # to do prompting (currently only supported for CTranslate2 backend)
                                batch_size=48)

whisper_s2t.write_outputs(out, format='json', ip_files=files, save_dir="./save_dir")

Then I get an error:

root@0ef49e9f5255:/audiofiles# python3 transcribe.py
[TensorRT-LLM] TensorRT-LLM version: 0.8.0.dev2024012301
'trt_build_args' not provided in model_kwargs, using default configs.
100%|█████████████████████████████████████| 2.37M/2.37M [00:01<00:00, 2.07MiB/s]
100%|█████████████████████████████████████| 2.88G/2.88G [10:18<00:00, 4.99MiB/s]
⠹ Exporting Model To TensorRT Engine (3-6 mins) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0:02:25
Killed
⠸ Exporting Model To TensorRT Engine (3-6 mins) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0:02:25
Traceback (most recent call last):
  File "/audiofiles/transcribe.py", line 3, in <module>
    model = whisper_s2t.load_model(model_identifier="large-v3", backend='TensorRT-LLM', asr_options={'word_timestamps': True})
  File "/usr/local/lib/python3.10/dist-packages/whisper_s2t/__init__.py", line 44, in load_model
    return WhisperModel(model_identifier, **model_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/whisper_s2t/backends/tensorrt/model.py", line 108, in __init__
    self.model = WhisperTRT(self.model_path)
  File "/usr/local/lib/python3.10/dist-packages/whisper_s2t/backends/tensorrt/trt_model.py", line 169, in __init__
    self.decoder = WhisperDecoding(engine_dir, runtime_mapping)
  File "/usr/local/lib/python3.10/dist-packages/whisper_s2t/backends/tensorrt/trt_model.py", line 77, in __init__
    self.decoder_config = self.get_config(engine_dir)
  File "/usr/local/lib/python3.10/dist-packages/whisper_s2t/backends/tensorrt/trt_model.py", line 83, in get_config
    with open(config_path, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/whisper_s2t/models/trt/large-v3/4944040cf145a1cd25ae346958b40fc9/decoder_config.json'

Directory contents:

root@0ef49e9f5255:/audiofiles# ls -lah /root/.cache/whisper_s2t/models/trt/large-v3/4944040cf145a1cd25ae346958b40fc9/
total 1.2G
drwxr-xr-x 1 root root 4.0K Aug  9 13:31 .
drwxr-xr-x 1 root root 4.0K Aug  9 13:30 ..
-rw-r--r-- 1 root root 1.2G Aug  9 13:31 encoder.engine
-rw-r--r-- 1 root root 1.3K Aug  9 13:31 encoder_config.json
-rw-r--r-- 1 root root 2.4M Aug  9 13:30 tokenizer.json
-rw-r--r-- 1 root root  726 Aug  9 13:30 trt_build_args.json
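For what it's worth, the listing shows only the encoder artifacts, so the export was evidently killed before decoder.engine and decoder_config.json were written, leaving the cache in a partial state. One workaround (a sketch, not an official API; the path is taken from the traceback above, minus the hash subdirectory) is to clear that cache so the next load_model() call rebuilds both engines from scratch:

```python
import os
import shutil

# Path observed in the traceback; the hash subdirectory underneath it
# holds the partially built engine files.
cache_dir = os.path.expanduser("~/.cache/whisper_s2t/models/trt/large-v3")

# Remove the partial build so whisper_s2t re-exports encoder and decoder
# on the next load_model(..., backend='TensorRT-LLM') call.
shutil.rmtree(cache_dir, ignore_errors=True)
```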

I tried with large-v2 as well, but I get the same error.

Nyralei commented 3 months ago

Never mind, the process gets OOM-killed by Docker Desktop during the engine export (the container isn't allocated enough memory).
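For anyone hitting the same thing: a quick way to check how much RAM the container actually sees is the sketch below (Linux-only, and a rough heuristic; the exact memory the large-v3 engine build needs isn't stated in this thread). If the number is small, raise the memory limit in Docker Desktop's resource settings before rebuilding:

```python
import os

# Total physical RAM visible inside the container (Linux sysconf names).
# The TensorRT engine export is memory-hungry, so a low Docker Desktop
# memory limit gets the build killed mid-export, as seen above.
page_size = os.sysconf("SC_PAGE_SIZE")
phys_pages = os.sysconf("SC_PHYS_PAGES")
total_gb = page_size * phys_pages / 1e9
print(f"Total RAM visible to this container: {total_gb:.1f} GB")
```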