-
### 🐛 Describe the bug
I keep seeing comparisons between JAX / TF / Keras and `torch.compile` that benchmark the default XLA settings against the default `torch.compile` settings, only to find that XLA frontends…
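Defaults-only comparisons can also be skewed by one-time tracing/compilation cost on the first call. A minimal, framework-agnostic timing harness that excludes warm-up might look like this (the `bench` helper is a hypothetical sketch, not from any of the libraries mentioned):

```python
import time

def bench(fn, *args, warmup=3, iters=10):
    """Time a callable after warm-up runs, so one-time compilation
    cost (XLA tracing, torch.compile's first call) is excluded."""
    for _ in range(warmup):  # let any JIT/compile caches fill
        fn(*args)
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - start) / iters

# Hypothetical usage: compare two configurations of the same model
# t_default = bench(compiled_default, batch)
# t_tuned   = bench(compiled_max_autotune, batch)
```

Measuring steady-state iterations this way at least puts both compilers on an equal footing, whatever flags each one was given.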
-
### Your current environment
The output of `python collect_env.py`
```text
root@newllm201:/workspace# vim collect.py
root@newllm201:/workspace# python3 collect.py
Collecting environment info…
```
-
@jerryzh168 I think it would be beneficial to be able to load a quantized and compiled model and proceed straight to inference.
However, I am not sure which functions to use to make this happen. …
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue y…
-
While using the torch_cluster `knn_graph` function, I get an empty `edge_index` depending on the order in which the GPUs are used.
GPUs other than 0 work only after GPU 0 has been used.
torch.__version__ '2.4.0+cu121…
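One way to tell whether the empty `edge_index` is a device-ordering bug rather than a data problem is to compare against a CPU brute-force reference. The `knn_edges` helper below is a hypothetical, pure-Python sketch that mirrors the `(source, target)` edge layout of `torch_cluster.knn_graph` (exact tie-breaking and edge direction may differ from the CUDA kernel):

```python
def knn_edges(points, k):
    """Brute-force k-nearest-neighbour edges over a list of points.
    Returns (source, target) pairs like torch_cluster.knn_graph's
    edge_index columns, built entirely on CPU so GPU ordering
    cannot affect the result."""
    edges = []
    for i, p in enumerate(points):
        # squared distances to every other point, ties broken by index
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points) if j != i
        )
        for _, j in dists[:k]:
            edges.append((j, i))  # neighbour j -> centre i
    return edges
```

If this reference returns non-empty edges for the same input while `knn_graph` on GPU N returns an empty tensor, the problem is in the device handling, not the data.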
-
Does DeepSpeed support PyTorch code with [CUDA Graphs](https://pytorch.org/blog/accelerating-pytorch-with-cuda-graphs/)? If not, do you think it may be helpful to DeepSpeed users for further speedups?
…
-
Hi! Thanks for your amazing work. I tested several pipelines and the speed of this framework is truly impressive 🔥
However, I have encountered an issue when using the stable-fast setting with the `e…
-
### 🐛 Describe the bug
Dynamo currently captures torch.cuda.stream via https://github.com/pytorch/pytorch/pull/93808. However, for other backends with streams, the capture wouldn't happen. There s…
-
## Description
After building a model with TensorRT, I get an `ICudaEngine` object and query the model's output dimensions with `ICudaEngine.getTensorShape`.
The output dimensions contain -1. Then sometimes there …
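A -1 in the shape returned by the engine marks a dynamic dimension; it only resolves to a concrete value once the runtime input shapes are known (with the execution context, I believe via `set_input_shape` followed by `get_tensor_shape` in recent TensorRT versions). The `resolve_dims` helper below is a hypothetical, pure-Python illustration of that substitution, not part of the TensorRT API:

```python
def resolve_dims(engine_shape, runtime_shape):
    """Replace TensorRT's dynamic-dimension marker (-1) with the
    concrete value observed at runtime. Illustrative only: with the
    real API the execution context performs this resolution after
    the input shapes are set."""
    if len(engine_shape) != len(runtime_shape):
        raise ValueError("rank mismatch")
    return [r if e == -1 else e
            for e, r in zip(engine_shape, runtime_shape)]
```

For example, an engine-level `[-1, 3, 224, 224]` with a runtime batch of 8 resolves to `[8, 3, 224, 224]`.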
-
### vLLM latest
I added some logging in /vllm/model_executor/models/llama.py because I want to print the attention output, like this:
When I start the LLM server, the error is
[rank0]: During handling of the above e…