cuda-runtime-api Search Results

1000+ results
for cuda-runtime-api

NVIDIA/TensorRT-LLM #1575

Error occurred when running medusa inference.

Hi, when I use Medusa decoding on trtllm-090 with profiling, an error occurred as follows. Could you please take a look? Thanks! If I do not use `--run_profiling`, the inference process is nor…

littletomatodonkey updated 18 hours ago
3
microsoft/onnxruntime #21635

[Performance]

### Describe the issue We converted the translation LLM 7B model to ONNX format using Hugging Face Optimum and then quantized it to 8-bit using dynamic quantization. Ho…

chakka12345677 updated 3 months ago
1
pytorch/pytorch #117546

interpolate::trilinear has different result between 1.8.0 an…

### 🐛 Describe the bug ```python import numpy as np import torch # use numpy to generate data input = torch.from_numpy(np.random.uniform(np.finfo(np.float32).min, np.finfo(np.float32).max, (2…

justiceeem updated 3 weeks ago
1
lix19937/tensorrt-insight #27

Floating point computing capacity not match with Orin-x's d…

1. Please describe the issue: Floating-point computing capacity does not match Orin-x's datasheet 2. Detailed steps on how to reproduce the issue: Run cuda sample `cudaTensorCoreGemm` ``` Initi…

lix19937 updated 4 months ago
1
vllm-project/vllm #9283

[Bug]: Simultaneous mm calls lead to permanently degraded pe…

### Your current environment The output of `python collect_env.py` ```text Collecting environment information... PyTorch version: 2.4.0+cu121 Is debug build: False CUDA used to build PyTor…

SeanIsYoung updated 3 weeks ago
19
andersbll/cudarray #58

Possible source of <cuda_runtime_api.h> error on Windows

I am getting the following error on compilation: ``` In file included from src/nnet/conv_bc01_matmul.cpp:1:0: ./include/cudarray/common.hpp:8:30: fatal error: cuda_runtime_api.h: No such file or dire…

N2ITN updated 7 years ago
6
vllm-project/vllm #9253

[Bug]: new beam search implementation ignores stop condition…

### Your current environment The output of `python collect_env.py` ```text Collecting environment information... /usr/local/lib/python3.10/dist-packages/vllm/connections.py:8: RuntimeWarning…

nFunctor updated 1 month ago
1
Tencent/HunyuanDiT #65

Error when building the TRT engine myself

Environment: H100. Base image: docker pull pytorch/pytorch:2.3.0-cuda11.8-cudnn8-devel, python3.10. Steps: following the instructions at https://hf-mirror.com/Tencent-Hunyuan/TensorRT-libs/blob/main/README_zh.md produces an error: ![image](https://gi…

flysssss updated 4 months ago
4
vllm-project/vllm #8242

[Bug]: GPU Memory Utilization Lower Than Expected with --ena…

### Your current environment The output of `python collect_env.py` ```text PyTorch version: 2.4.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A…

hxer7963 updated 2 months ago
5
vllm-project/vllm #10102

[Bug]: Engine loop has died for Meta-Llama-3.1-8B-Instruct T…

### Your current environment The output of `python collect_env.py` ```text PyTorch version: 2.4.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N…

HaoyuWang4188 updated 23 hours ago
11
