-
I can see that the generated Triton code results in inaccurate numerics for both supported backends - i.e., `cuda` and `hip`.
The original, `03-matrix-multiplication.py` compares numerics using `to…
-
Build env: Ubuntu 22.04 + CMake 3.29.3 (the distro's CMake version was too old for the build system)
Process:
```
git clone https://github.com/lamikr/rocm_sdk_builder.git
cd rocm_sdk_builder
git ch…
-
**Summary**
CUDA's kernel launch mechanism requires each kernel's device stub function to have a unique address. When targeting Windows, the linker defaults to performing identical COMDAT folding (…
-
Hello guys,
How do I add HIP support to an existing CMake project?
I followed this issue [#231](https://github.com/ROCm-Developer-Tools/HIP/issues/231) and tried 4 approaches, but only one appr…
-
### 🐛 Describe the bug
I am not going to type everything out here; I will give a summary first and then link to the issues and discussions I opened elsewhere on GitHub.
I tried my gfx906 Radeon VII card with webui …
-
I'm studying scaling of a variety of compute workloads, and I was wondering if there is a way to adjust the number of CUs?
I came across a [similar discussion](https://github.com/RadeonOpenCompute/…
-
I installed ROCm following the tutorial: https://rocmdocs.amd.com/en/latest/deploy/linux/os-native/install.html
It seems to have succeeded. When I run `rocminfo`, this is the output:
```
ROCk module is loaded…
-
The following LLVM IR:
```llvm
; reduced.ll
target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:3…
-
Create a table for words and nouns, and query it when generating usernames.
Reference: https://stackoverflow.com/questions/8674718/best-way-to-select-random-rows-postgresql
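
A rough sketch of what this could look like, not a final design. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so it runs on its own. The table name `username_words`, the `kind` column, and the helper names are all hypothetical. `ORDER BY RANDOM() LIMIT 1` is the simplest approach and also works in PostgreSQL, but it scans the whole table, so a faster method would be needed if the table gets large.

```python
import sqlite3

# Hypothetical schema: a single table holding both kinds of entries.
SCHEMA = """
CREATE TABLE IF NOT EXISTS username_words (
    kind TEXT NOT NULL,  -- 'word' or 'noun' (assumed labels)
    word TEXT NOT NULL
)
"""


def create_word_table(conn, words, nouns):
    """Create the table and load the given words and nouns into it."""
    conn.execute(SCHEMA)
    rows = [("word", w) for w in words] + [("noun", n) for n in nouns]
    conn.executemany(
        "INSERT INTO username_words (kind, word) VALUES (?, ?)", rows
    )
    conn.commit()


def random_entry(conn, kind):
    """Return one random entry of the given kind."""
    row = conn.execute(
        "SELECT word FROM username_words WHERE kind = ? "
        "ORDER BY RANDOM() LIMIT 1",
        (kind,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no entries of kind {kind!r}")
    return row[0]


def generate_username(conn):
    """Combine one random word and one random noun, e.g. 'calm_otter'."""
    return f"{random_entry(conn, 'word')}_{random_entry(conn, 'noun')}"
```

Usage would be along the lines of `conn = sqlite3.connect(":memory:")`, then `create_word_table(conn, ["quick", "calm"], ["otter", "falcon"])`, then `generate_username(conn)`.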
-
The RMS norm implementation below achieves memory bandwidth well below the peak possible on MI300. The results can be reproduced using the code below.
When benchmarking BabelStream on the same dev…