tenstorrent / tt-mlir

Tenstorrent MLIR compiler
https://tenstorrent.github.io/tt-mlir/
Apache License 2.0
78 stars 13 forks source link

Forcing non-tile-aligned reshapes to ROW_MAJOR #827

Open LPanosTT opened 2 months ago

LPanosTT commented 2 months ago

If the desired shape in a reshape op has either of the last two dims not tile dim aligned, they must be in ROW_MAJOR. The compiler does not yet model memory config/layout. After #818 is merged, this workaround will be in place:

if (in.get_layout() == ::ttnn::TILE_LAYOUT &&
      (shape[shape.size() - 2] % 32 != 0 ||
       shape[shape.size() - 1] % 32 != 0)) {

    in = ::ttnn::to_layout(in, ::ttnn::ROW_MAJOR_LAYOUT, std::nullopt,
                           std::nullopt, in.device());
  }

In runtime/lib/ttnn/operations/data_movement/reshape.cpp

sdjordjevicTT commented 2 months ago

We have an open issue on the TT-Metal side to fix this: https://github.com/tenstorrent/tt-metal/issues/12866

LPanosTT commented 2 months ago

@sdjordjevicTT That seems like a different issue. In my case the reshape call will throw an error rather that giving garbage output

sdjordjevicTT commented 2 months ago

It is probably because we didn't specify the tile output shape properly(including tile padding). Once we do that, you will get the wrong results as well.

dgolubovicTT commented 1 month ago

@LPanosTT do we have some simple repro that I can check when consume new metal version, to test if it works?

LPanosTT commented 1 month ago

@LPanosTT do we have some simple repro that I can check when consume new metal version, to test if it works?

I supposed creating a 10x10 tensor then setting it to tile layout, then reshaping it to 1x100 would show whether or not it was fixed. You'll have to check that the data is correct of course.

sdjordjevicTT commented 1 month ago

@dgolubovicTT please sync with @mtopalovicTT and @jserbedzijaTT. Metal folks did an additional fix for this a day ago, hence we should verify this once we uplift the metal to the latest.