NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

In extent substitution preseg pass, use DisjointSets of extents constructed from DisjointSets of IterDomains #3379

Closed liqiangxl closed 1 week ago

liqiangxl commented 1 week ago

Fixes #3369.

What's in this PR?
(1) Modifies the extent substitution preseg pass to use DisjointSets of extents instead of DisjointSets of IterDomains.
(2) Adds a function that builds DisjointSets of extents from DisjointSets of IterDomains.
(3) Uses self-defined hash and equality functions when constructing the DisjointSets, so that const extents are treated as equal if they have the same value. This is not required; we could still treat them as different vals, since they have different addresses and var names.

Results
(1) The issue is fixed; added a Python test for the original issue.
(2) Added two additional C++ tests: one where the same symbolic val is used in two different Id sets, and one where the same const val is used in two different Id sets.
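
For readers unfamiliar with the idea, here is a minimal, self-contained sketch of how disjoint sets of extents might be derived from disjoint sets of IterDomains, with const extents compared by value. It is not the code in this PR: `Extent`, `IterDomain`, `ExtentHash`, `ExtentEqual`, `ExtentSets`, and `buildExtentSets` are hypothetical stand-ins, and the union-find below is a simplification that does not use nvFuser's own DisjointSets class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an extent val: a named symbol or a constant.
struct Extent {
  std::string name;                 // var name, e.g. "i0"
  std::optional<int64_t> constant;  // set only for const extents
};

// Hypothetical stand-in for an IterDomain; only its extent matters here.
struct IterDomain {
  const Extent* extent;
};

// Const extents hash by value; symbolic extents hash by address.
struct ExtentHash {
  std::size_t operator()(const Extent* e) const {
    return e->constant ? std::hash<int64_t>{}(*e->constant)
                       : std::hash<const Extent*>{}(e);
  }
};

// Two const extents are equal if their values match; otherwise compare addresses.
struct ExtentEqual {
  bool operator()(const Extent* a, const Extent* b) const {
    if (a->constant && b->constant) {
      return *a->constant == *b->constant;
    }
    return a == b;
  }
};

// Simple union-find keyed with the custom hash and equality above.
class ExtentSets {
 public:
  // Returns the representative of e's set, registering e if it is unseen.
  const Extent* find(const Extent* e) {
    auto it = parent_.find(e);
    if (it == parent_.end()) {
      parent_.emplace(e, e);
      return e;
    }
    // Follow parent links until reaching an entry that is its own parent.
    while (!ExtentEqual{}(it->first, it->second)) {
      it = parent_.find(it->second);
      assert(it != parent_.end());
    }
    return it->first;
  }

  // Merges the sets containing a and b.
  void unite(const Extent* a, const Extent* b) {
    const Extent* ra = find(a);
    const Extent* rb = find(b);
    if (!ExtentEqual{}(ra, rb)) {
      parent_[ra] = rb;
    }
  }

  bool sameSet(const Extent* a, const Extent* b) {
    return ExtentEqual{}(find(a), find(b));
  }

 private:
  std::unordered_map<const Extent*, const Extent*, ExtentHash, ExtentEqual>
      parent_;
};

// Every IterDomain set contributes one group of extents. An extent shared
// by two IterDomain sets causes their extent groups to merge.
ExtentSets buildExtentSets(
    const std::vector<std::vector<IterDomain>>& id_sets) {
  ExtentSets sets;
  for (const auto& id_set : id_sets) {
    if (id_set.empty()) {
      continue;
    }
    const Extent* first = id_set.front().extent;
    for (const IterDomain& id : id_set) {
      sets.unite(first, id.extent);
    }
  }
  return sets;
}
```

In this sketch, a symbolic extent that appears in two IterDomain sets joins their extent groups, and two distinct const extents with the same value land in the same group. These are the two situations the added C++ tests are described as covering.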

liqiangxl commented 1 week ago

!test --diff --diff-bench

liqiangxl commented 1 week ago

There are two types of code changes:
(1) nvfuser-ci/jit_codegen_diff_20_3/7 (failing after 21 minutes, https://nv/e2E/121510962), where the order of output tvs is switched.
(2) The other three code changes are related to var names in rng ops. ExactMappedExtentSubstitutionPass doesn't change anything in DistributedTransformerTest.MultiheadAttention/__bfloat. The lowered fusion is also not changed. After lowerToInlinePtx:

-        T57[0] = rng_uniformf(rng_result, rng_component1181);
+        T57[0] = rng_uniformf(rng_result, rng_component1176);

liqiangxl commented 1 week ago

!test

liqiangxl commented 1 week ago

!test

liqiangxl commented 1 week ago

!test