-
The Triton tutorial 03-matrix-multiplication.py started failing after a recent software update to oneAPI 2024, with the error "total scratch space exceeds HW supported limit".
ocloc -spirv_input -file matmul_k…
-
Pip versions used for fetching Tracy profiles:
```
iree-compiler 20231106.574
iree-runtime 20231106.574
```
I captured an end-to-end Tracy profile along with Unet for R…
-
### What happened?
I'm trying to compile a tm_tensor model through iree-compile, and this is the output:
```
Please report issues to https://github.com/openxla/iree/issues and include the crash b…
```
-
Hello,
If I were to have a novel accelerator, I would need a programming model for users to interact with it. From my understanding, AMD wants HIP to be the de facto programming model for the …
-
It seems like Intel Arc support is supposed to be present; it shows up as a Vulkan device, at least! But when I try to run this, I get compile failures. Here's the detailed log from the command prompt on…
-
In the sm_70.cu and sm_80.cu CuTe examples, am I understanding correctly that they are tiling ordinary float multiplication, and not any complex MMA instruction?
https://github.com/NVIDIA/cutlass/…
-
Create an LLVM IR file that uses DPAS, 2D block read/write, and prefetch instructions to get an initial performance estimate for the SIMT path.
One could use a compiler to generate the LLVM IR file, but not…
-
Hi all,
This is an RFC for supporting Intel GPUs in the Triton language.
The toolchain for Intel GPUs is based on the SYCL/SPIR-V specs. As a result, we'd like to upstream the SPIR-V support to th…
-
I am using a modified version of Accel-Sim, which worked fine with GCC 9.4 and CUDA 11.6 (driver and toolkit). Recently, I upgraded the system to CUDA 12.2 (driver) and 11.6 (toolkit), and I was able…
-
Hi, Accel-Sim developers:
- I am new to GPU simulation. I am following the README.md instructions, and at the step ./util/job_launching/run_simulations.py -B rodinia_2.0-ft -C CONFIG -T…