`fd.add_output(out_tv, stride_order)` allows us to set a stride order for the output through the fusion definition.
The current setup errors out if `out_tv` has any reduction axis.
This PR:
- Accounts for the presence of reduction axes and keeps their positions in the allocation domain the same as in the logical domain.
- ~Sets contiguity to false if the stride_order is not trivial.~
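
As a rough illustration of the reduction-axis handling described above, here is a minimal pure-Python sketch. It is not the nvFuser implementation. The helper name `allocation_order` is hypothetical, and the sketch assumes that `stride_order` has one entry per non-reduction axis and that `stride_order[i]` is the stride rank of the i-th non-reduction axis, with 0 meaning the innermost (fastest-varying) axis.

```python
def allocation_order(is_reduction, stride_order):
    """Return logical-axis indices in allocation order (outermost first).

    Hypothetical helper for illustration only.
    is_reduction: one bool per logical axis, True for a reduction axis.
    stride_order: one entry per NON-reduction axis (assumption); entry i is
                  the stride rank of that axis, 0 being the innermost.
    """
    kept = [i for i, red in enumerate(is_reduction) if not red]
    rank = len(kept)
    if sorted(stride_order) != list(range(rank)):
        raise ValueError("stride_order must be a permutation of range(rank)")

    # Permute only the non-reduction axes: rank 0 goes to the last slot.
    permuted = [None] * rank
    for pos, axis in enumerate(kept):
        permuted[rank - 1 - stride_order[pos]] = axis

    # Reduction axes stay at their logical positions; the remaining slots
    # are filled with the permuted non-reduction axes in order.
    remaining = iter(permuted)
    return [i if red else next(remaining) for i, red in enumerate(is_reduction)]


# Logical axes [i0, r1, i2, i3], where r1 is a reduction axis.
print(allocation_order([False, True, False, False], [0, 2, 1]))
```

In this sketch the reduction axis `r1` stays at index 1, while the three iteration axes are reordered around it according to `stride_order`.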