-
### System Info
- TensorRT-LLM v0.8.0 (pinned to release commit)
- Nvidia A100
- Mistral-7B-Instruct-v0.2
- Using the CPP runner
- Installed with `pip install tensorrt_llm==0.8.0 --extra-index-ur…
iibw updated
2 months ago
-
I followed the README to build trtllm and ran into the issue below. Please help me check it. Thank you!
triton/whisper/README.md
It seems the process is being killed unexpectedly while converting the encoder che…
-
I wanted to test WhisperFusion on an RTX 3090.
I went through build.sh and then:
`docker run --gpus all --shm-size 64G -p 6006:6006 -p 8888:8888 -it ghcr.io/collabora/whisperfusion:latest`
and I ru…
rvsh2 updated
4 months ago
-
I installed the project on Colab using the following commands:
```
!git clone https://github.com/MahmoudAshraf97/whisper-diarization.git
!sudo apt update && sudo apt install cython3
!sudo apt upda…
```
-
Could you please explain how to add a Hugging Face pretrained model to work with WhisperLive?
-
This is the complete log output after starting with `bash launch_server.sh`; nothing in it looks wrong.
```
I0409 02:55:34.488607 25157 pinned_memory_manager.cc:275] Pinned memory pool is created at '0x7f8b7c000000' with size 2048000000
I0409 02:55:34.491989…
```
-
I ran the MSDD model on an Nvidia A10 (24 GB), but inference is too slow. Looking at the code, there is a lot of data transfer back and forth between the CPU and the GPU.
Most of the time GPU utilization…
-
Hi All,
We are trying to run the `resnet18` model faster than the plain torchvision version on the GPU, so we planned to convert and quantize the model using TensorRT. However, we did not witness a pe…
-
### Expected Behavior
Two places are missing their corresponding icons.
### Actual Behavior
![QQ20241024-092053](https://github.com/user-attachments/assets/0cdc5043-a659-4589-a592-ff2558b3e265)
![QQ20241024-092041](https://github.com/…
-
## Overview
In the future, we want to support **multiple ML backends** for each _endpoint_.
Example:
`chat/completions` can use:
- llama.cpp
- candle
- tensorrt
`audio/transcriptions` can us…
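
One way to picture the endpoint-to-backend mapping described in this Overview is a small registry. This is only a sketch under assumptions: `BackendRegistry`, `register`, and `resolve` are hypothetical names invented for illustration and are not part of the project, and only the `chat/completions` backends listed above are used.

```python
from collections import defaultdict


class BackendRegistry:
    """Hypothetical helper: maps an endpoint name to the backends that can serve it."""

    def __init__(self):
        # endpoint name -> backend names, kept in registration order
        self._backends = defaultdict(list)

    def register(self, endpoint, backend):
        """Add a backend for an endpoint, ignoring duplicates."""
        if backend not in self._backends[endpoint]:
            self._backends[endpoint].append(backend)

    def resolve(self, endpoint, preferred=None):
        """Return the preferred backend, or the first registered one by default."""
        candidates = self._backends.get(endpoint, [])
        if not candidates:
            raise KeyError(f"no backend registered for endpoint {endpoint!r}")
        if preferred is None:
            return candidates[0]
        if preferred not in candidates:
            raise ValueError(f"{preferred!r} cannot serve endpoint {endpoint!r}")
        return preferred


# Backend names taken from the list above; the default order is an assumption.
registry = BackendRegistry()
for name in ("llama.cpp", "candle", "tensorrt"):
    registry.register("chat/completions", name)

print(registry.resolve("chat/completions"))
print(registry.resolve("chat/completions", preferred="tensorrt"))
```

Keeping the mapping in one place would let each endpoint fall back to a default backend while still allowing a caller to request a specific one.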