iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/

Missing f32->bf16 demotion support for the targets of data-tiling ops #17484

Open hanhanW opened 5 months ago

hanhanW commented 5 months ago

The pass only looks at a few named ops, but the target operations can also appear as linalg.generic ops or as other named ops. We should generalize the pass.

https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/GlobalOptimization/DemoteContractionInputsToBF16.cpp
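For context, one way to generalize the selection side is to match on the linalg interfaces instead of a fixed list of named ops. Below is a minimal sketch, not the pass's actual code; the helper name is illustrative, and it assumes `linalg::isaContractionOpInterface` is an acceptable definition of "contraction" here (it covers both named ops like linalg.matmul and linalg.generic ops whose indexing maps and body form a contraction):

```cpp
#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "llvm/ADT/STLExtras.h"

using namespace mlir;

// Hypothetical predicate: true for any linalg contraction (named or generic)
// whose inputs are all f32 shaped values, i.e. a candidate for bf16 demotion.
static bool isDemotionCandidate(linalg::LinalgOp op) {
  // Covers linalg.matmul-style named ops as well as linalg.generic ops with
  // contraction semantics.
  if (!linalg::isaContractionOpInterface(op))
    return false;
  // Only consider ops whose inputs are f32; the demotion target is bf16.
  return llvm::all_of(op.getDpsInputs(), [](Value v) {
    auto type = dyn_cast<ShapedType>(v.getType());
    return type && type.getElementType().isF32();
  });
}
```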

benvanik commented 5 months ago

It may do different things (it also changes the public ABI), but maybe adding an f32->bf16 conversion to https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/InputConversion/Common/ConvertPrimitiveType.cpp#L308 could help? Or, if it works but does more than you want, you could at least take some of the code from it that handles ops more generically.
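For reference, the generic approach that comment points at is usually built around a TypeConverter rather than per-op patterns. A hedged sketch, with illustrative names and not the actual ConvertPrimitiveType.cpp code, of what an f32->bf16 mapping could look like:

```cpp
#include <optional>

#include "mlir/IR/BuiltinTypes.h"
#include "mlir/Transforms/DialectConversion.h"

using namespace mlir;

// Hypothetical helper: registers an f32 -> bf16 mapping on a TypeConverter so
// the generic conversion machinery can rewrite types op-agnostically.
static void populateF32ToBF16TypeConversion(TypeConverter &converter) {
  // Fallback: leave all other types untouched.
  converter.addConversion([](Type type) { return type; });
  // Registered last so it is tried first: rewrite f32 scalars and shaped
  // types with an f32 element type to their bf16 equivalents.
  converter.addConversion([](Type type) -> std::optional<Type> {
    MLIRContext *ctx = type.getContext();
    if (type.isF32())
      return BFloat16Type::get(ctx);
    if (auto shaped = dyn_cast<ShapedType>(type))
      if (shaped.getElementType().isF32())
        return shaped.clone(BFloat16Type::get(ctx));
    return std::nullopt;
  });
}
```

Because such a converter also rewrites function signatures, it changes the public ABI, which is the trade-off benvanik notes above.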

hanhanW commented 5 months ago

Oh, this is mostly a quick experimental flag. Sometimes we want to conditionally select certain ops (e.g., contraction ops) and demote their input operands from fp32 to bf16. It helps unblock work when bf16 models are not ready: with the flag, we can start development and estimation using fp32 models.
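To illustrate what "demote the input operands" means in IR terms, here is a hypothetical sketch for a single linalg.matmul: truncate the f32 inputs to bf16 and rebuild the op so it still accumulates into the original f32 init. The helper name and the ranked-tensor assumption are illustrative; the actual pass handles more op kinds.

```cpp
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "mlir/IR/Builders.h"
#include "llvm/ADT/SmallVector.h"

using namespace mlir;

// Hypothetical helper: assumes the matmul operates on ranked f32 tensors.
static void demoteMatmulInputsToBF16(OpBuilder &builder,
                                     linalg::MatmulOp matmul) {
  builder.setInsertionPoint(matmul);
  Location loc = matmul.getLoc();
  Type bf16 = builder.getBF16Type();

  // Truncate each f32 input tensor to bf16. arith.truncf is elementwise
  // mappable, so it accepts tensor operands directly.
  SmallVector<Value> newInputs;
  for (Value input : matmul.getDpsInputs()) {
    auto type = cast<RankedTensorType>(input.getType());
    newInputs.push_back(
        builder.create<arith::TruncFOp>(loc, type.clone(bf16), input));
  }

  // Rebuild the matmul with bf16 inputs. The init/result stays f32, so the
  // accumulation remains in f32.
  Value init = matmul.getDpsInits().front();
  auto newOp = builder.create<linalg::MatmulOp>(
      loc, TypeRange{init.getType()}, newInputs, ValueRange{init});
  matmul->replaceAllUsesWith(newOp->getResults());
  matmul->erase();
}
```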