NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

Prefer simpler vals in replace sizes #3344

Closed naoyam closed 2 weeks ago

naoyam commented 2 weeks ago

While working on #3309, I noticed that i0 is replaced with ceilDiv(i0, 1). That is not incorrect, but the generated code would look simpler if i0 were used instead of ceilDiv(i0, 1).

This PR only changes which iter domains are chosen as representatives when replacing extents. In addition to the existing priority rules, the iter domain with the simplest extent is now preferred as the representative ID of a given ID group. Simplicity is defined as the number of expressions that define the extent val: fewer expressions means simpler. For example, an iter domain with extent i0 should be chosen as the representative ID over iter domains with extent ceilDiv(i0, 1).
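To illustrate the simplicity criterion, here is a minimal Python sketch. It is not nvFuser code: `Val`, `IterDomain`, `num_defining_exprs`, and `pick_representative` are hypothetical stand-ins. It models only the "fewest defining expressions" rule and leaves out the existing priority rules.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Val:
    """Hypothetical scalar: a leaf (op is None) or the output of one expression."""
    name: str
    op: Optional[str] = None
    inputs: Tuple["Val", ...] = ()


@dataclass(frozen=True)
class IterDomain:
    """Hypothetical iter domain that carries only a name and an extent."""
    name: str
    extent: Val


def num_defining_exprs(val: Val) -> int:
    """Count the distinct expressions needed to compute `val` from leaf values."""
    seen = set()

    def visit(v: Val) -> int:
        # Leaves have no defining expression; shared subexpressions count once.
        if v.op is None or id(v) in seen:
            return 0
        seen.add(id(v))
        return 1 + sum(visit(i) for i in v.inputs)

    return visit(val)


def pick_representative(group):
    """Pick the iter domain whose extent has the fewest defining expressions.

    Assumption: ties keep the earlier entry, standing in for whatever the
    existing priority rules would otherwise decide.
    """
    return min(group, key=lambda d: num_defining_exprs(d.extent))


# The example from the description: i0 versus ceilDiv(i0, 1).
i0 = Val("i0")
one = Val("1")
ceil_div = Val("ceilDiv(i0, 1)", op="ceilDiv", inputs=(i0, one))

group = [IterDomain("id_a", ceil_div), IterDomain("id_b", i0)]
representative = pick_representative(group)
print(representative.extent.name)
```

In this sketch the extent `i0` has zero defining expressions and `ceilDiv(i0, 1)` has one, so `id_b` is selected even though it appears later in the group.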

There should be no logic change.

naoyam commented 2 weeks ago

!test --diff-bench

naoyam commented 2 weeks ago

!build

naoyam commented 2 weeks ago

!test