xinji1 closed this issue 1 month ago
You are welcome to give it a try. We are going to deprecate this repo very soon, since it has been merged into upstream Triton: https://github.com/openai/triton. Regarding `tl.load`, the semantics on the AMD and NVIDIA paths should be the same. Feel free to report it if it doesn't work in your case.
It looks like it doesn't work, same as in the issue. I've tried:
1. `b,c,d=tl.load(a + tl.arange(0,3))`
2. `a = tl.load(a + tl.arange(0,3))` followed by `b = a[0]`
The former cannot resolve `b,c,d` and fails with `b is not defined`; the latter reports an error like `'constexpr' object is not iterable`. I'll close this issue if it's the same for AMD XD.
Just learned from this issue. Given a tensor `a` with size `size(3,1)`, it looks like AMD's Triton can load it with something like: