This PR substantially refactors the compute-location rule, primitive, and analysis, including a redesign of the primitive. After the refactor, the schedule rule RandomComputeLocation can transform blocks with multiple consumers, which was not supported before.
With this PR, performance on NRM and SFM is on par with Ansor's.
However, a gap remains on C2D, which needs further investigation.
Most of the code changes are in the unit tests.
Thanks @junrushao1994 for the help throughout this PR.