Closed matuszelenak closed 1 month ago
same..
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.107.02 Driver Version: 550.107.02 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4090 On | 00000000:82:00.0 Off | Off |
| 30% 25C P8 8W / 450W | 1MiB / 24564MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
https://github.com/collabora/WhisperLive/pull/276 should resolve this.
Can you please update ghcr.io/collabora/whisperlive-tensorrt:latest? The same old problems are still there.
> #276 should resolve this.

Unfortunately, it does not seem like it.
(base) whiskas@debian-gpu:~$ docker run -p 9090:9090 --runtime=nvidia --gpus all --entrypoint /bin/bash -it whisper-live-trt:latest
root@7f39b90ea7e6:/app# nvidia-smi
Thu Sep 19 11:45:35 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03 Driver Version: 560.35.03 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3090 On | 00000000:01:00.0 Off | N/A |
| 0% 36C P8 12W / 420W | 4MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
root@7f39b90ea7e6:/app# bash build_whisper_tensorrt.sh /app/TensorRT-LLM-examples small.en
Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
Requirement already satisfied: tensorrt_llm==0.10.0 in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 2)) (0.10.0)
Requirement already satisfied: tiktoken in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 3)) (0.3.3)
Requirement already satisfied: datasets in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 4)) (3.0.0)
Requirement already satisfied: kaldialign in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 5)) (0.9.1)
Requirement already satisfied: openai-whisper in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 6)) (20231117)
Collecting librosa
Downloading librosa-0.10.2.post1-py3-none-any.whl (260 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 260.1/260.1 KB 1.9 MB/s eta 0:00:00
Requirement already satisfied: soundfile in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 8)) (0.12.1)
Requirement already satisfied: safetensors in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 9)) (0.4.5)
Requirement already satisfied: transformers in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 10)) (4.40.2)
Requirement already satisfied: janus in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 11)) (1.0.0)
Installing collected packages: librosa
Successfully installed librosa-0.10.2.post1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Downloading small.en...
--2024-09-19 11:46:00-- https://openaipublic.azureedge.net/main/whisper/models/f953ad0fd29cacd07d5a9eda5624af0f6bcf2258be67c92b79389873d91e0872/small.en.pt
Resolving openaipublic.azureedge.net (openaipublic.azureedge.net)... 13.107.253.67, 2620:1ec:29:1::67
Connecting to openaipublic.azureedge.net (openaipublic.azureedge.net)|13.107.253.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 483615683 (461M) [application/octet-stream]
Saving to: 'assets/small.en.pt'
small.en.pt 100%[=======================================================================================================>] 461.21M 16.6MB/s in 20s
2024-09-19 11:46:20 (23.0 MB/s) - 'assets/small.en.pt' saved [483615683/483615683]
Download completed: small.en.pt
whisper_small_en
Running build script for small.en with output directory whisper_small_en
[TensorRT-LLM] TensorRT-LLM version: 0.10.0
[09/19/2024-11:46:22] [TRT-LLM] [I] plugin_arg is None, setting it as float16 automatically.
[09/19/2024-11:46:22] [TRT-LLM] [I] plugin_arg is None, setting it as float16 automatically.
[09/19/2024-11:46:22] [TRT-LLM] [I] plugin_arg is None, setting it as float16 automatically.
[09/19/2024-11:46:23] [TRT] [I] [MemUsageChange] Init CUDA: CPU +14, GPU +0, now: CPU 594, GPU 263 (MiB)
[09/19/2024-11:46:24] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +2132, GPU +396, now: CPU 2882, GPU 659 (MiB)
[09/19/2024-11:46:24] [TRT] [W] profileSharing0806 is on by default in TensorRT 10.0. This flag is deprecated and has no effect.
...
[09/19/2024-11:46:53] [TRT] [I] Total Weights Memory: 386860032 bytes
[09/19/2024-11:46:53] [TRT] [I] Compiler backend is used during engine execution.
[09/19/2024-11:46:53] [TRT] [I] Engine generation completed in 7.84096 seconds.
[09/19/2024-11:46:53] [TRT] [I] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 153 MiB, GPU 1126 MiB
[09/19/2024-11:46:53] [TRT] [I] [MemUsageStats] Peak memory usage during Engine building and serialization: CPU: 5786 MiB
[09/19/2024-11:46:53] [TRT-LLM] [I] Total time of building Unnamed Network 0: 00:00:07
[09/19/2024-11:46:53] [TRT-LLM] [I] Config saved to whisper_small_en/decoder_config.json.
[09/19/2024-11:46:53] [TRT-LLM] [I] Serializing engine to whisper_small_en/whisper_decoder_float16_tp1_rank0.engine...
[09/19/2024-11:46:53] [TRT-LLM] [I] Engine serialized. Total time: 00:00:00
Whisper small.en TensorRT engine built.
=========================================
Model is located at: /app/TensorRT-LLM-examples/whisper/whisper_small_en
root@7f39b90ea7e6:/app# python3 run_server.py --port 9090 \
--backend tensorrt \
--trt_model_path "/app/TensorRT-LLM-examples/whisper/whisper_small_en"
[TensorRT-LLM] TensorRT-LLM version: 0.10.0
--2024-09-19 11:47:44-- https://github.com/snakers4/silero-vad/raw/v4.0/files/silero_vad.onnx
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/snakers4/silero-vad/v4.0/files/silero_vad.onnx [following]
--2024-09-19 11:47:44-- https://raw.githubusercontent.com/snakers4/silero-vad/v4.0/files/silero_vad.onnx
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.108.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1807522 (1.7M) [application/octet-stream]
Saving to: ‘/root/.cache/whisper-live/silero_vad.onnx’
/root/.cache/whisper-live/silero_vad.onnx 100%[=======================================================================================================>] 1.72M --.-KB/s in 0.09s
2024-09-19 11:47:45 (19.3 MB/s) - ‘/root/.cache/whisper-live/silero_vad.onnx’ saved [1807522/1807522]
[7f39b90ea7e6:00362] *** Process received signal ***
[7f39b90ea7e6:00362] Signal: Segmentation fault (11)
[7f39b90ea7e6:00362] Signal code: Address not mapped (1)
[7f39b90ea7e6:00362] Failing at address: 0x18
[7f39b90ea7e6:00362] [ 0] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f27331b6520]
[7f39b90ea7e6:00362] [ 1] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libth_common.so(_ZN12tensorrt_llm4thop14TorchAllocator6mallocEmb+0x88)[0x7f251c570d58]
[7f39b90ea7e6:00362] [ 2] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libtensorrt_llm.so(_ZN12tensorrt_llm6layers18DynamicDecodeLayerI6__halfE14allocateBufferEv+0xd4)[0x7f253570f434]
[7f39b90ea7e6:00362] [ 3] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libtensorrt_llm.so(_ZN12tensorrt_llm6layers18DynamicDecodeLayerI6__halfE10initializeEv+0x128)[0x7f2535713108]
[7f39b90ea7e6:00362] [ 4] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libtensorrt_llm.so(_ZN12tensorrt_llm6layers18DynamicDecodeLayerI6__halfEC2ERKNS_7runtime12DecodingModeERKNS0_13DecoderDomainEP11CUstream_stSt10shared_ptrINS_6common10IAllocatorEE+0xb1)[0x7f2535713311]
[7f39b90ea7e6:00362] [ 5] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libth_common.so(_ZN9torch_ext15FtDynamicDecodeI6__halfEC1Emmmmii+0x270)[0x7f251c550c70]
[7f39b90ea7e6:00362] [ 6] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libth_common.so(_ZN9torch_ext15DynamicDecodeOp14createInstanceEv+0x8a)[0x7f251c5340ca]
[7f39b90ea7e6:00362] [ 7] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libth_common.so(_ZN9torch_ext15DynamicDecodeOpC1EllllllN3c1010ScalarTypeE+0x84)[0x7f251c534214]
[7f39b90ea7e6:00362] [ 8] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libth_common.so(_ZNSt17_Function_handlerIFvRSt6vectorIN3c106IValueESaIS2_EEEZN5torch6class_IN9torch_ext15DynamicDecodeOpEE12defineMethodIZNSB_3defIJllllllNS1_10ScalarTypeEEEERSB_NS7_6detail5typesIvJDpT_EEESsSt16initializer_listINS7_3argEEEUlNS1_14tagged_capsuleISA_EEllllllSE_E_EEPNS7_3jit8FunctionESsT_SsSN_EUlS5_E_E9_M_invokeERKSt9_Any_dataS5_+0xf8)[0x7f251c551058]
[7f39b90ea7e6:00362] [ 9] /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_python.so(+0xa0f34e)[0x7f27312de34e]
[7f39b90ea7e6:00362] [10] /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_python.so(+0xa0c8df)[0x7f27312db8df]
[7f39b90ea7e6:00362] [11] /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_python.so(+0xa0e929)[0x7f27312dd929]
[7f39b90ea7e6:00362] [12] /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch_python.so(+0x47de04)[0x7f2730d4ce04]
[7f39b90ea7e6:00362] [13] python3(+0x15cb2e)[0x55af3c364b2e]
[7f39b90ea7e6:00362] [14] python3(_PyObject_MakeTpCall+0x25b)[0x55af3c35b2db]
[7f39b90ea7e6:00362] [15] python3(+0x16b6b0)[0x55af3c3736b0]
[7f39b90ea7e6:00362] [16] python3(+0x2826fb)[0x55af3c48a6fb]
[7f39b90ea7e6:00362] [17] python3(_PyObject_MakeTpCall+0x25b)[0x55af3c35b2db]
[7f39b90ea7e6:00362] [18] python3(_PyEval_EvalFrameDefault+0x6b17)[0x55af3c353d27]
[7f39b90ea7e6:00362] [19] python3(_PyFunction_Vectorcall+0x7c)[0x55af3c36542c]
[7f39b90ea7e6:00362] [20] python3(_PyObject_FastCallDictTstate+0x16d)[0x55af3c35a51d]
[7f39b90ea7e6:00362] [21] python3(+0x1674b4)[0x55af3c36f4b4]
[7f39b90ea7e6:00362] [22] python3(_PyObject_MakeTpCall+0x1fc)[0x55af3c35b27c]
[7f39b90ea7e6:00362] [23] python3(_PyEval_EvalFrameDefault+0x72ea)[0x55af3c3544fa]
[7f39b90ea7e6:00362] [24] python3(_PyFunction_Vectorcall+0x7c)[0x55af3c36542c]
[7f39b90ea7e6:00362] [25] python3(_PyEval_EvalFrameDefault+0x8ab)[0x55af3c34dabb]
[7f39b90ea7e6:00362] [26] python3(_PyFunction_Vectorcall+0x7c)[0x55af3c36542c]
[7f39b90ea7e6:00362] [27] python3(_PyObject_FastCallDictTstate+0x16d)[0x55af3c35a51d]
[7f39b90ea7e6:00362] [28] python3(+0x1674b4)[0x55af3c36f4b4]
[7f39b90ea7e6:00362] [29] python3(_PyObject_MakeTpCall+0x1fc)[0x55af3c35b27c]
[7f39b90ea7e6:00362] *** End of error message ***
Segmentation fault (core dumped)
Docker image updated on ghcr. Let us know if the issue still persists.
it works! thank you :)
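For anyone hitting the same segfault: once the rebuilt image is on ghcr, re-pulling and re-running it is enough to pick up the fix. A minimal sketch, reusing the same flags shown in the logs above (the `latest` tag is an assumption; adjust if you pin a specific tag):

```shell
# Pull the rebuilt image from GitHub Container Registry
docker pull ghcr.io/collabora/whisperlive-tensorrt:latest

# Re-run with GPU access, exposing the server port (same invocation as above)
docker run -p 9090:9090 --runtime=nvidia --gpus all \
  -it ghcr.io/collabora/whisperlive-tensorrt:latest
```

Note that a previously pulled `latest` tag is not refreshed automatically; without the explicit `docker pull`, `docker run` keeps using the stale local image.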
I'm trying to run the TensorRT version of the Docker container according to the instructions, but I get a segfault whenever I attempt to transcribe any audio. The same audio works fine with the faster_whisper backend. This happens both for live transcription and for file submission.
System info: Debian 12 VM with an RTX 3090 passed through to it. Driver version 545.23.06.
Full log: