-
Hi,
Thank you for this amazing work. I wanted to set up a conda environment to run inference with the model you provided but I keep running into issues. I've followed the steps in the Dockerfile a…
-
>https://nvidia.github.io/cuda-python/cuda-bindings/latest/api.html
We have documentation for the Driver, Runtime and NVRTC APIs, but none for nvJitLink. It should be added.
-
### 🐛 Describe the bug
The test case [`test_multi_device_cpu`](https://github.com/pytorch/pytorch/blob/1b95ca904f5020ad8649677cbef683fac9d8e768/test/inductor/test_aot_inductor.py#L304C1-L314C50) actu…
-
The message:
```
python3 deepy.py generate.py -d configs tokenformer/1-5B_eval.yml text_generation.yml --text_gen_type interactive
[2024-11-07 00:21:17,970] [INFO] [real_accelerator.py:161:get_…
-
**Describe the bug**
Any use of `shfl_sync` throws an error saying `shfl_recurse` is a dynamic function.
**To reproduce**
The Minimal Working Example (MWE) for this bug:
Attempting to do a stream…
-
### Describe the issue
Hey everyone,
I was testing a model for face occlusion and I am getting different results between GPU and CPU.
Happy to help if anyone can point me in the right direction? (…
-
Using an explicit SYCL queue instance for Kokkos::SYCL targeting a GPU results in a SYCL (icpx) error:
```
terminate called after throwing an instance of 'sycl::_V1::runtime_error'
what(): Nati…
-
Without the compiler option: https://github.com/mrakgr/Spiral-s-ML-Library/blob/c5d8a529b210f84dc955a017aeff455c2d27affd/game/leduc/fast_compile.py
With --Ofast-compile=max: https://github.com/mrakgr…
-
### Describe the issue
I am trying to run inference on an ONNX model on a Jetson AGX Orin with JetPack 6.1. The CUDA version is 12.6 and the cuDNN version is 9.3. The website says onnxruntime-19.0 supports cuDNN 9.x but w…
-
### 🐛 Describe the bug
The [`AOTIModelPackageLoader::run`](https://github.com/pytorch/pytorch/blob/b4cc5d38b416c8e74a6ba8f537a75571a3cdd563/torch/csrc/inductor/aoti_package/model_package_loader.cpp#L…