Open wujingyue opened 3 months ago
The rfactor ID, the product of the slicing, should have contiguity true/false instead of inheriting none from the root ID.
A second thought: ideally, a slice of an expanded broadcast should still be an expanded broadcast, with contiguity none and a smaller expanded extent. This way, the output is still an alias and therefore needn't be allocated. This will need some cooperation from https://github.com/NVIDIA/Fuser/blob/6c6f3a40e09e6f8bece80b9b79c543945846c71b/csrc/ops/alias.cpp#L787-L795.
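
To illustrate why such a slice can stay an alias, here is a minimal sketch using a toy 1-D strided view. `View1D` and its fields are hypothetical names made up for this example, not nvFuser types; the only assumption is that an expanded broadcast behaves like a view with stride 0 over a single stored element.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class View1D:
    """Hypothetical 1-D strided view: element i lives at storage[offset + i * stride]."""

    storage: List[float]
    offset: int
    extent: int
    stride: int

    def slice(self, start: int, stop: int) -> "View1D":
        # Slicing only adjusts the offset and the extent; the storage is shared.
        assert 0 <= start <= stop <= self.extent
        return View1D(
            self.storage,
            self.offset + start * self.stride,
            stop - start,
            self.stride,
        )

    def to_list(self) -> List[float]:
        return [self.storage[self.offset + i * self.stride] for i in range(self.extent)]


# An "expanded broadcast" modeled as one stored element viewed 8 times (stride 0).
storage = [3.0]
expanded = View1D(storage, offset=0, extent=8, stride=0)

# Slicing [2:6] keeps stride 0 and the same storage; only the extent shrinks.
sliced = expanded.slice(2, 6)
```

In this toy model the slice has the same shape of metadata as the input (stride 0, shared storage) with a smaller extent, so nothing new has to be allocated for it.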
Check out `wjy/slice` and `_bn && bin/nvfuser_tests --gtest_filter=AliasTest.SliceOfExpandedBroadcast`. The bug is somewhere in https://github.com/NVIDIA/Fuser/blob/7a6f19cce1cf0167700047ca7eb58f53d71bc731/csrc/alias_analysis.cpp#L305-L330.
The root ID is an expanded broadcast and therefore has contiguity `none`. The rfactor ID, the product of the slicing, should have contiguity `true`/`false` instead of inheriting `none` from the root ID.
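
A minimal sketch of the mismatch, using a toy model rather than nvFuser code. `ToyId` and `contiguity_is_consistent` are hypothetical names. The sketch assumes that contiguity is unset (`None` here) exactly for broadcast IDs, and that the rfactor ID produced by the slice is not a broadcast ID; the `True` used for the corrected case is an arbitrary placeholder, not a claim about which value is right.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyId:
    """Hypothetical stand-in for an iteration domain and its contiguity flag."""

    name: str
    is_broadcast: bool
    contiguity: Optional[bool]  # None models contiguity "none"


def contiguity_is_consistent(toy_id: ToyId) -> bool:
    # Assumed invariant: contiguity is unset if and only if the ID is a broadcast.
    return (toy_id.contiguity is None) == toy_id.is_broadcast


# Root ID: an expanded broadcast, so contiguity "none" is consistent.
root = ToyId("root", is_broadcast=True, contiguity=None)

# Behavior described in this issue: the rfactor ID inherits "none" from the root.
inherited = ToyId("rfactor", is_broadcast=False, contiguity=root.contiguity)

# Expected behavior: the rfactor ID carries an explicit true/false value.
explicit = ToyId("rfactor", is_broadcast=False, contiguity=True)
```

Under these assumptions, `contiguity_is_consistent` holds for `root` and `explicit` but fails for `inherited`, which is the inconsistency described above.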