-
### Issue type
Bug
### Have you reproduced the bug with TensorFlow Nightly?
No
### Source
binary
### TensorFlow version
v2.17.0-rc1-2-gad6d8cc177d 2.17.0
### Custom code
Yes
### OS platform …
-
Building `cuda.parallel` is quite brittle due to requirements from the C library. Through patient trial and error, I discovered that the following build-time dependencies are required:
- gcc 13+ (due…
-
Extending the work done in https://github.com/pyccel/pyccel-cuda/issues/34 to be able to work with `Unified` memory.
Using `cudaMallocManaged` see [Unified Memory](https://developer.nvidia.com/blog/u…
bauom updated
3 months ago
-
This issue aims to add memory transfer support to the Pyccel CUDA internal library.
API for [cudaMemcpy](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html#group__CUDA…
bauom updated
3 months ago
-
Extending the work done in https://github.com/pyccel/pyccel-cuda/issues/34 to be able to work with `device` memory.
Using [cudaMalloc](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEM…
bauom updated
3 months ago
-
This issue aims to add a `cuda` internal library, importable as `from pyccel.internals import cuda`, that gives access to an array method that can `allocate` an array in `host` memory.
We…
bauom updated
3 months ago
-
### Describe the issue
I'm using the Cascade Mask R-CNN model in detectron2. When exported to ONNX, the model file contains RoiAlign (opset 16 version).
When running on onnxruntime (CUDA EP), it's too slow since…
-
My environment settings are as follows.
```
System: Ubuntu 20.04
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Apr_17_19:19:55_PDT_2024
Cuda compi…
```
-
```
➜ RUST_BACKTRACE=1 cargo run --example gelu -F cuda
Finished `dev` profile [optimized + debuginfo] target(s) in 0.22s
Running `target/debug/examples/gelu`
thread 'main' panicked at /…
```
-
1
![Snipaste_2024-08-31_16-15-46](https://github.com/user-attachments/assets/5442ca84-26ab-4dd6-b376-8acbef26b81b)
I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA…