Open hanhanW opened 7 months ago
also cc @qedawkins who has been looking at data-tiling for GPU cases.
So at flow level, we will see

```mlir
%lhs = iree_linalg_ext.set_encoding %orig_lhs : tensor<?x?xf32> -> tensor<?x?xf32, #iree_linalg_ext.encoding<role = LHS, element_types = [f32, f32, f32], user_indexing_maps = [#map, #map1, #map2]>>
%rhs = iree_linalg_ext.set_encoding %orig_rhs : tensor<?x?xf32> -> tensor<?x?xf32, #iree_linalg_ext.encoding<role = RHS, element_types = [f32, f32, f32], user_indexing_maps = [#map, #map1, #map2]>>
%matmul = linalg.matmul ins(%lhs, %rhs : ...) outs(%init : tensor<?x?xf32>)
%elem = linalg.generic ... ins(%matmul ...) outs(...)
```
So will the result tensor of the `linalg.matmul` not have an encoding attribute? It seems like this ties the unset encoding too closely to the matmul. Especially if we start thinking about propagation of encodings, this will restrict what we can do with the unset encoding. I think unset encoding needs to be its own operation to avoid a semantic discontinuity between flow and codegen, since unset encoding will ultimately materialize into its own operation.
We won't have `unset_encoding` ops because the output operands don't have encodings.
@MaheshRavishankar @bjacob and I had a discussion today about not having `unset_encoding` ops at the Flow level. This makes the fusion logic simpler; it also makes mmt4d fusion easier. The proposal is to set encodings only on LHS and RHS, but not on RESULT. In this context, we still create unpack ops during MaterializeEncoding: the unpack ops are created together with the mmt4d op when materializing the encodings.
We can form `%matmul` and `%elem` into the same dispatch; the fused dispatch becomes the codegen input. It then gets materialized into mmt4d + unpack ops.
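For intuition, the materialization can be sketched numerically: the LHS/RHS encodings become packs into 4-D tiled layouts, the matmul becomes an mmt4d-style tiled contraction, and a single unpack recovers the row-major result. A minimal NumPy sketch; the `pack`/`unpack`/`mmt4d` helpers and the 2x2 tile sizes here are illustrative, not IREE's actual implementation:

```python
import numpy as np

def pack(x, t0, t1):
    # Pad each dim up to a tile multiple, then reshape MxN into
    # (M/t0, N/t1, t0, t1) so each inner 2-D slice is one tile.
    m, n = x.shape
    M, N = -(-m // t0) * t0, -(-n // t1) * t1
    p = np.zeros((M, N), x.dtype)
    p[:m, :n] = x
    return p.reshape(M // t0, t0, N // t1, t1).transpose(0, 2, 1, 3)

def unpack(x, m, n):
    # Inverse of pack: interleave the tiles back and drop the padding.
    M0, N0, t0, t1 = x.shape
    return x.transpose(0, 2, 1, 3).reshape(M0 * t0, N0 * t1)[:m, :n]

def mmt4d(lhs, rhs):
    # lhs tiles: (M0, K0, m, k); rhs tiles: (N0, K0, n, k), i.e. the RHS
    # is packed from its transpose. Contract over both K dimensions.
    return np.einsum("MKmk,NKnk->MNmn", lhs, rhs)

a = np.arange(15, dtype=np.float64).reshape(3, 5)
b = np.arange(20, dtype=np.float64).reshape(5, 4)
out = unpack(mmt4d(pack(a, 2, 2), pack(b.T, 2, 2)), 3, 4)
assert np.allclose(out, a @ b)  # tiled pipeline matches the plain matmul
```

The zero padding added by `pack` contributes nothing to the contraction, which is why the tile sizes can be chosen freely per target.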
[optional] At this point, there is an opportunity to push the `%unpack` down across the `%elem` op, so it gets closer to the `flow.dispatch.tensor.store`. Then we apply the same transformations we have today. The mmt4d op can still be lowered to ukernels after distribution, and the rest of the ops are codegen-ed.
If we don't push the `%unpack` op down, the same thing happens: the mmt4d op can be lowered to ukernels, and the rest of the ops are codegen-ed. It just depends on how codegen wants to handle `unpack + elem`.

(cc @Max191 @pashu123)