openxla / xla

A machine learning compiler for GPUs, CPUs, and ML accelerators
Apache License 2.0

[XLA:GPU] Add intra-warp reduce of reduce test. #19840

Closed: copybara-service[bot] closed this 5 days ago

copybara-service[bot] commented 5 days ago

[XLA:GPU] Add intra-warp reduce of reduce test.

Add a reproducer from b/380277401 as a test to make sure it doesn't get broken again later.

Reduce op lowering needs special handling if the input parameter has a slice layout. The issue [0] was fixed in upstream Triton in June 2024 [1], but the fix was later lost and then reapplied in [2].

[0] https://github.com/triton-lang/triton/issues/4116
[1] https://github.com/triton-lang/triton/pull/4139
[2] https://github.com/triton-lang/triton/pull/5080
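For readers unfamiliar with the pattern, here is a rough NumPy sketch of what a "reduce of reduce" computes. It is not the reproducer from the bug and not the test added by this change; the function name `reduce_of_reduce` and the 32x4 shape are made up for illustration. The point is only that the second reduction consumes the output of the first, which is presumably where a slice-layout input to a reduce shows up once the computation is lowered through Triton.

```python
import numpy as np


def reduce_of_reduce(x):
    """Hypothetical illustration: sum over axis 1, then sum the result."""
    # First reduction: collapses axis 1 of the 2D input into a 1D tensor.
    # Assumption: in the Triton lowering, this intermediate is the value
    # that carries a slice layout derived from the 2D input's layout.
    inner = x.sum(axis=1)
    # Second reduction: takes the 1D intermediate as its input.
    return inner.sum(axis=0)


# Small values so the float32 sums are exact.
x = np.arange(32 * 4, dtype=np.float32).reshape(32, 4)
result = reduce_of_reduce(x)
```

On the host the two-step result is the same as a single full reduction, so a regression test of this kind only has to compare the compiled output against that reference value.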