Can I explicitly specify the "tl.load" to load data into shared memory?

triton-lang / triton

Development repository for the Triton language and compiler

https://triton-lang.org/

MIT License


Can I explicitly specify the "tl.load" to load data into shared memory? #4320

Open gujiewen opened 4 months ago

gujiewen commented 4 months ago

In GEMV, the vector is reused frequently. If the vector is small enough, I want to pin it in shared memory and share it among different warps. However, it seems that tl.load cannot do this. Are there any other tricks?

manman-ren commented 4 months ago

I don't think we support tl.load into shared memory. Shared memory is currently managed by compiler passes and is not exposed directly to the user. You can pass an eviction policy to tl.load (the eviction_policy argument, e.g. "evict_last") to try to make the data persist in cache.
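
A minimal sketch of the eviction-policy suggestion, assuming a row-major float matrix and a CUDA device. The names gemv_kernel and gemv and the block sizes are hypothetical and chosen only for illustration. The policy is a cache hint, not a guarantee, and it does not place the vector in shared memory:

```python
import torch
import triton
import triton.language as tl


@triton.jit
def gemv_kernel(
    a_ptr, x_ptr, y_ptr,
    M, N,
    stride_am,  # row stride of A; columns are assumed contiguous
    BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr,
):
    # Each program computes BLOCK_M rows of y = A @ x.
    pid = tl.program_id(axis=0)
    rows = pid * BLOCK_M + tl.arange(0, BLOCK_M)
    row_mask = rows < M
    acc = tl.zeros((BLOCK_M,), dtype=tl.float32)

    for start in range(0, N, BLOCK_N):
        cols = start + tl.arange(0, BLOCK_N)
        col_mask = cols < N
        # x is re-read by every program, so hint that it should stay cached.
        x = tl.load(x_ptr + cols, mask=col_mask, other=0.0,
                    eviction_policy="evict_last")
        # Each tile of A is read only once, so hint that it can be evicted early.
        a = tl.load(a_ptr + rows[:, None] * stride_am + cols[None, :],
                    mask=row_mask[:, None] & col_mask[None, :], other=0.0,
                    eviction_policy="evict_first")
        acc += tl.sum(a.to(tl.float32) * x.to(tl.float32)[None, :], axis=1)

    tl.store(y_ptr + rows, acc, mask=row_mask)


def gemv(a: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    # Hypothetical host wrapper; returns a float32 result vector.
    assert a.is_cuda and x.is_cuda and a.shape[1] == x.shape[0]
    a, x = a.contiguous(), x.contiguous()
    M, N = a.shape
    y = torch.empty(M, device=a.device, dtype=torch.float32)
    BLOCK_M, BLOCK_N = 32, 128  # illustrative values, not tuned
    grid = (triton.cdiv(M, BLOCK_M),)
    gemv_kernel[grid](a, x, y, M, N, a.stride(0),
                      BLOCK_M=BLOCK_M, BLOCK_N=BLOCK_N)
    return y
```

Calling gemv(a, x) on CUDA tensors should match a @ x up to floating-point rounding. Whether the hint actually helps depends on the hardware and on the vector size, so it is worth benchmarking against the same kernel without eviction_policy.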