-
Hi, when I use Medusa decoding on trtllm-090 with profiling, the following error occurs. Could you please take a look? Thanks!
If I do not use `--run_profiling`, the inference process is nor…
-
### Describe the issue
We converted a 7B translation LLM to ONNX format using Hugging Face Optimum and then quantized it to 8 bits using dynamic quantization. Ho…
-
### 🐛 Describe the bug
```python
import numpy as np
import torch
# use numpy to generate data
input = torch.from_numpy(np.random.uniform(np.finfo(np.float32).min, np.finfo(np.float32).max, (2…
-
1. Please describe the issue:
Floating-point compute capacity does not match Orin-x's datasheet
2. Detailed steps on how to reproduce the issue:
Run the CUDA sample `cudaTensorCoreGemm`
```
Initi…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTor…
-
I am getting the following error on compilation:
```
In file included from src/nnet/conv_bc01_matmul.cpp:1:0:
./include/cudarray/common.hpp:8:30: fatal error: cuda_runtime_api.h: No such file or dire…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
/usr/local/lib/python3.10/dist-packages/vllm/connections.py:8: RuntimeWarning…
-
Environment:
H100
Base image:
docker pull pytorch/pytorch:2.3.0-cuda11.8-cudnn8-devel
python3.10
Steps:
Followed the instructions at https://hf-mirror.com/Tencent-Hunyuan/TensorRT-libs/blob/main/README_zh.md
Error:
![image](https://gi…
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A…
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N…