-
Hi,
I noticed that the nvptx repo was using wrapping_add, which was curious. In looking into it, I noticed that the type of the intrinsics seems to differ from what the CUDA guide states. Is this i…
-
The problem: I can't install giotto-tda, giotto-ph, giotto-time, etc. on aarch64 (NVIDIA Jetson) architectures. I get the error:
ERROR: Could not build wheels for giotto-tda.
The reason: Allowing …
-
Learn-CUDA-Programming/tree/master/Chapter02/02_memory_overview/01_sgemm/sgemm.cu
sgemm_gpu_kernel:
sum += A[i + row * K] * B[col + i * M];
I think it should be M
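For anyone checking the indexing, here is a minimal CPU sketch of the same inner loop. It assumes row-major storage with A of size N x K, B of size K x M, and C of size N x M; that layout, the `sgemm_reference` name, and the omission of the alpha/beta scaling are my assumptions, not something confirmed by the repo. Under that layout the row stride of B is M, which only makes a visible difference when M and K differ.

```python
# Hypothetical CPU reference for the inner loop quoted above.
# Assumed layout (not confirmed by the repo): row-major,
# A is N x K, B is K x M, C is N x M. alpha/beta scaling is omitted.
def sgemm_reference(A, B, N, M, K):
    C = [0.0] * (N * M)
    for row in range(N):
        for col in range(M):
            total = 0.0
            for i in range(K):
                # A[row][i] sits at i + row * K; B[i][col] sits at col + i * M.
                total += A[i + row * K] * B[col + i * M]
            C[col + row * M] = total
    return C


# 2 x 3 matrix times 3 x 2 matrix, so that M != K and the stride matters.
A = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]
B = [7.0, 8.0,
     9.0, 10.0,
     11.0, 12.0]
C = sgemm_reference(A, B, N=2, M=2, K=3)
print(C)
```

With a square test case (M equal to K) a wrong stride would go unnoticed, so a non-square input like the one above is the more useful check.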
-
Hi,
Pascal cards (a GTX 1060 costs about 250 USD...) support unified memory in the CUDA 8 API, so there is no
need for explicit memory transfers, and latency is very low.
Is there any plan to map Cuda 8 functionali…
-
```
Aparapi's current goal is to be yet another
easily-write-a-crappy-GPGPU-implementation framework. These are a dime a dozen,
are useless, ignored by mainstream developers, and shunned by HPC deve…
-
Operating system: Ubuntu workstation
Compiler: gcc (Ubuntu 5.4.1-2ubuntu1~16.04) 5.4.1
Hi~
I ran 'cmake' after installing BLAS.
However, I got an error message when running 'make':
"…
-
The ROCm math library implements all "pi" functions: sinpi, cospi, tanpi, and their inverses (including atan2pi).
HIP seems to provide access only to sinpi and cospi.
see
https://github.com/RadeonOpe…
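
For reference, the "pi" functions are the usual trigonometric functions with the argument or result scaled by π. The sketch below gives naive reference definitions only; it is not how ROCm or HIP implement them (a real implementation reduces the argument more carefully for accuracy), and the names `asinpi`, `acospi`, and `atanpi` are my assumption for what "their inverses" covers.

```python
import math

# Naive reference definitions, for illustration only.
# Forward functions scale the argument by pi.
def sinpi(x):
    return math.sin(math.pi * x)

def cospi(x):
    return math.cos(math.pi * x)

def tanpi(x):
    return math.tan(math.pi * x)

# Inverse functions scale the result by 1/pi (names assumed).
def asinpi(x):
    return math.asin(x) / math.pi

def acospi(x):
    return math.acos(x) / math.pi

def atanpi(x):
    return math.atan(x) / math.pi

def atan2pi(y, x):
    return math.atan2(y, x) / math.pi


# A quarter turn: tanpi(0.25) is close to 1, and atan2pi(1, 1) is close to 0.25.
print(tanpi(0.25), atan2pi(1.0, 1.0))
```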
-
Hi,
Your instructions say "make all", but I don't see any Makefile.
Sorry, I am just a newbie in CUDA programming.
-
### Problem Description
The two programming models provided by ROCm, OpenMP and HIP, leverage the same HSA runtime. HIP holds its own pool of HSA queues controlled by the `GPU_MAX_HW_QUEUES` environment…