I am getting the following error, which I am unsure how to debug:
python: /projects/triton/lib/Dialect/TritonGPU/IR/Dialect.cpp:85: llvm::SmallVector<unsigned int> mlir::triton::gpu::getThreadsPerWarp(const mlir::Attribute&): Assertion `0 && "getThreadsPerWarp not implemented"' failed.
fish: Job 1, 'python error.py' terminated by signal SIGABRT (Abort)
While trying to produce a minimal example (code below), I removed line 91 (`acc += tl.sum(p_jik[:, :, :, None] * v_ik_data[None, :, :, :], 2)`); with that line removed, the Python process hung for over an hour at 100% CPU usage before crashing.
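For context, the line in question is a broadcast multiply followed by a reduction over axis 2, i.e. a batched contraction. A NumPy sketch of the equivalent computation (the shapes here are hypothetical placeholders, not the actual kernel's block shapes):

```python
import numpy as np

# Hypothetical shapes for illustration only; the real kernel's
# block shapes are not shown in this issue.
A, B, C, D = 2, 3, 4, 5
p_jik = np.random.rand(A, B, C)
v_ik_data = np.random.rand(B, C, D)

# Broadcast multiply then sum over axis 2, mirroring
# tl.sum(p_jik[:, :, :, None] * v_ik_data[None, :, :, :], 2)
acc = (p_jik[:, :, :, None] * v_ik_data[None, :, :, :]).sum(axis=2)

# Equivalent batched contraction over the shared axis:
ref = np.einsum('abc,bcd->abd', p_jik, v_ik_data)
assert np.allclose(acc, ref)
```

In the Triton kernel this reduction is what produces `acc`; removing it changes the set of layouts the compiler must handle, which may be why the failure mode changes from an assertion to a hang.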
Triton is installed from commit 6413c7b9debf9e82b9f2df4dc1688a66427b8064.
Code to reproduce: