-
Local ROCm version: 5.2.5.1
hipBLASLt version used in build: release/rocm-rel-5.5
Python version: 3.10
CPU: POWER9
GPU: gfx906
The hipBLASLt requirement arose for us re: [bitsandbytes-rocm/ops.…
-
### What is the issue?
Many models, in particular codegemma 1.1 7b q8_0, fail to load for various reasons on versions after 0.1.132. Everything works fine on 0.1.132. I don't have the logs on hand at the moment, but …
-
The occupancy calculator API is an invaluable asset in CUDA. Unfortunately, `hipOccupancyMaxPotentialBlockSize` is only exposed for Nvidia GPUs for the time being. It would be immensely helpful if it is imp…
-
Pretty much what it says in the title. I just need clarification on what your setup is exactly so I can follow that and reproduce your environment. Thanks in advance for your help!
**Fedora 40 Stab…
-
RuntimeError: Failed to find native CUDA module, make sure that you compiled the code with K2_WITH_CUDA.
-
### Problem Description
Error:
Blender crashes with `Memory access fault by GPU node-1 (Agent handle: 0x7c1aea898400) on address 0x5fd000. Reason: Page not present or supervisor privilege.`
…
-
I am building 46bc9bde22056900f18b725776c6f6c660355e9a on a Hades Canyon NUC running Ubuntu 20.02.
I installed ROCm using Apt and it works fine.
I installed LLVM and hipSYCL into `/opt/hipSYCL` using the …
-
**Describe the problem**
I get the standard CUDA errors such as:
```
[10:18:51.765156][warning] Could not determine number of CUDA cards in this system
[10:18:51.765156][warning] No CUDA-Enabled G…
-
For the test in `/triton/python/test/unit/tools/test_aot.py`, there's an error like:
![image](https://github.com/ROCm/triton/assets/147899933/06da6dde-cfba-4a00-b6e9-1785c420dab0)
```
Traceback …
-
### 🐛 Describe the bug
`foreach` is much slower than a for-loop on ROCm.
Tested on an MI250x:
```bash
$ python benchmark_optimizers.py
eager: 789.9033987810321us
foreach: 1289.8528449295554u…