cuda-kernels Search Results

1000+ results
for cuda-kernels


mys007/ecc #6

cuda_kernels help

Hey, I am working deeply on your code. I would like to ask you a favor: could you please help me understand the CUDA kernels? My email address is thomasc@helix.re. I have benchmark you…

tchaton updated 5 years ago
5
openxla/xla #16711

Build for GPU fails due to nccl error

I'm trying to build the XLA for GPU according to this guide: https://openxla.org/xla/developer_guide. Configuration goes just fine: ``` $ docker exec xla ./configure.py --backend=CUDA INFO:root:Try…

juuso-oskari updated 1 day ago
8
casper-hansen/AutoAWQ_kernels #16

cuda 12.2 AssertionError: AWQ kernels could not be loaded.

I am facing the error `AWQ kernels could not be loaded.` with autoawq==0.2.4. - image nvcr.io/nvidia/pytorch:23.10-py3 - Python 3.10 - cuda 12.2.2 - torch 2.1.0a0+32f93b1 Build and install f…

s-natsubori updated 1 month ago
2
NVIDIA/TensorRT-Model-Optimizer #67

Unable to load extension modelopt_cuda_ext and falling back …

I have tried to quantize a model by following the guide ([PyTorch Quantization — Model Optimizer 0.15.0](https://nvidia.github.io/TensorRT-Model-Optimizer/guides/_pytorch_quantization.html)), and I ca…

relaxtheo updated 2 days ago
2
CliMA/ClimaCore.jl #1343

Better names for CUDA kernels

**Is your feature request related to a problem? Please describe.** It's difficult to match CUDA kernel names in profiles with locations in the code: **Describe the solution you'd like** You c…

simonbyrne updated 7 months ago
1
pytorch/pytorch #134308

Segfault in Torch profiler when CUDA Graph Conditional Nodes…

### 🐛 Describe the bug This mostly follows NVIDIA's guide for conditional nodes from [here](https://developer.nvidia.com/blog/dynamic-control-flow-in-cuda-graphs-with-conditional-nodes/). It does a…

leijurv updated 1 week ago
6
vllm-project/vllm #5781

[Feature]: Use 64-bit integers as indices in cuda kernels

### 🚀 The feature, motivation and pitch I found that some kernels use 32-bit integers as indices, which can easily lead to overflow. I think changing them to int64_t (or other 64-bit types) will be sa…

courage17340 updated 2 months ago
1
torch-points3d/torch-points-kernels #101

Fails to build

OS: Windows 11 22H2; C++ compiler: MSVC 2022; Python: 3.8; CUDA Toolkit: release 11.8, V11.8.89; PyTorch: 2.0.1. I am trying to install torch points kernels 0.6.10 via pip and got this: ``` No CUDA ru…

Timo-AL updated 1 week ago
2
andy-yang-1/DoubleSparse #3

The method transfer the KV cache from cpu memory to gpu memo…

Wonderful work! I have the following questions and look forward to your reply. 1) I am curious about the method in your paper that copies the KV cache from CPU memory to GPU memory. Since I have tested the following…

digbangbang updated 4 days ago
1
ahmetoner/whisper-asr-webservice #234

Missing CUDA toolkit

UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation...

thb10086 updated 3 weeks ago
5
