Include default tuning specs with the compiler

kuhar commented 2 days ago

We want to be able to ship a library of default tuning specs with IREE, so that users can get good performance out of the box on known key operations. Tuning specs are applied after dispatch formation, around the time we run the MaterializeUserConfigs pass. Currently, there's no default spec provided, but users can supply their own transform dialect libraries.

I propose the following solution, which is similar to how we plan to handle ukernels. There are a lot of things that still need to be worked out, but this is roughly the breakdown of key requirements/properties:

- We will provide default tuning specs for key architectures like gfx942. Each architecture will have its own tuning spec file.
- During IREE build, these tuning specs will be given to iree-opt, verified, and saved as MLIR bytecode files. These bytecode files will be embedded into the final IREE compiler binary. At runtime, the compiler will be able to access them as memory buffers.
- Just before MaterializeUserConfigs, we will load the compatible tuning spec, if any, and the user-provided transform libraries.
- We will put both transform dialect libraries in the same module, so that there's only one library to handle.
- We will then materialize the transform library in an opaque resource similar to #hal.executable.object. We don't want to embed the library as a nested module to prevent accidental visitation with walk/pattern rewrite drivers, as the transform libraries can contain ops very similar to the kernel code (e.g., linalg.generic).
- We will run the transform dialect interpreter, like we do today, in MaterializeUserConfigs.
- New tests will make sure that the default tuning specs are up-to-date and in a working state. In case of minor syntactic changes, it will be the responsibility of the patch author to update the specs. In case of substantial upstream MLIR dialect changes, the author of the MLIR change will be responsible for the upgrade. If that person is not an IREE contributor, the responsibility would then fall on the author of the tuning spec.
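
To make the load-and-merge steps concrete, here is a minimal Python sketch of the control flow only. It is not IREE code: `EMBEDDED_DEFAULT_SPECS`, `TuningSpec`, `load_default_spec`, and `build_combined_library` are hypothetical names, the payloads are placeholder bytes rather than real MLIR bytecode, and placing user specs ahead of the default one is an assumption that the proposal does not settle.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the specs embedded in the compiler binary:
# target architecture -> serialized tuning spec (placeholder bytes here).
EMBEDDED_DEFAULT_SPECS: dict[str, bytes] = {
    "gfx942": b"<placeholder for the gfx942 default spec>",
}


@dataclass(frozen=True)
class TuningSpec:
    origin: str  # "user" or "default"
    payload: bytes


def load_default_spec(target_arch: str) -> Optional[TuningSpec]:
    """Return the embedded default spec for the target, or None if there is none."""
    payload = EMBEDDED_DEFAULT_SPECS.get(target_arch)
    if payload is None:
        return None
    return TuningSpec("default", payload)


def build_combined_library(
    target_arch: str, user_spec: Optional[bytes]
) -> list[TuningSpec]:
    """Collect every applicable spec into one library.

    Assumption: the user spec goes first so that it is consulted before
    the default spec for the architecture.
    """
    library: list[TuningSpec] = []
    if user_spec is not None:
        library.append(TuningSpec("user", user_spec))
    default = load_default_spec(target_arch)
    if default is not None:
        library.append(default)
    return library
```

With this shape, a target without a bundled spec still works: the library simply contains the user spec alone, or is empty if the user supplied nothing.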

kuhar commented 2 days ago

cc: @MaheshRavishankar @benvanik @bjacob @stellaraccident

MaheshRavishankar commented 2 days ago

Thanks Jakub. This is a great summary and description. A few comments below.

> During IREE build, these tuning specs will be given to iree-opt, verified, and saved as MLIR bytecode files. These bytecode files will be embedded into the final IREE compiler binary. At runtime, the compiler will be able to access them as memory buffers.

I don't think they should be included as part of the final IREE compiler binary. I think we should be able to put them in a location that iree-compile can access? We also need to provide a way for users to override, or append to, the tuning spec that gets picked up.

> Just before MaterializeUserConfigs, we will load the compatible tuning spec, if any, and the user-provided transform libraries.
>
> We will put both transform dialect libraries in the same module, so that there's only one library to handle.

Nice. Thanks for incorporating this. We should build this as a separate utility/tool that can be tested independently (and maybe invoked from within the compiler to be able to append to existing tuning).

Rest of the stuff looks good to me.

cc @erman-gurses @bangtianliu @nithinsubbiah

ScottTodd commented 2 days ago

> > During IREE build, these tuning specs will be given to iree-opt, verified, and saved as MLIR bytecode files. These bytecode files will be embedded into the final IREE compiler binary. At runtime, the compiler will be able to access them as memory buffers.
>
> I don't think they should be included as part of the final IREE compiler binary. I think we should be able to put them in a location that iree-compile can access? We also need to provide a way for users to override, or append to, the tuning spec that gets picked up.

I think we can figure these details out partway through the implementation work, but starting with brainstorming and design work now is a good idea. Once we have the basic mechanism in place to use tuning specs, how those specs are provided will matter more.

We can survey other similar projects to see what they do and what users will expect. Bundling the files as part of the compiler distribution (either embedded directly in the libIREECompiler.so or in separate files that we package together) will certainly be nice for a self-contained / hermetic compiler. I could see users wanting to maintain their own spec libraries to be shared between multiple developers, in which case we'd want a way to load those libraries from a local file path or even a remote URL. That could get complex with priority loading / ordering if there are multiple matching specs, generic specs that can act as a fallback if no more specific spec matches, etc.
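
One possible resolution policy for the multiple-source case is sketched below in Python, purely to make the ordering question concrete. Everything in it is hypothetical: the `resolve_spec` function, the `"generic"` key, and the spec names are illustrative, and the rule that an exact architecture match in any source beats a generic fallback is just one of several reasonable choices.

```python
from typing import Optional


def resolve_spec(target_arch: str, sources: list[dict[str, str]]) -> Optional[str]:
    """Pick one spec name for target_arch.

    `sources` is ordered from highest to lowest priority (for example:
    user library, team-shared library, bundled defaults). Each source maps
    an architecture name, or the hypothetical key "generic", to a spec name.
    """
    # First pass: the highest-priority source with an exact match wins.
    for source in sources:
        if target_arch in source:
            return source[target_arch]
    # Second pass: fall back to a generic spec, again in priority order.
    for source in sources:
        if "generic" in source:
            return source["generic"]
    return None


user = {"gfx942": "user_gfx942"}
shared = {"gfx90a": "team_gfx90a", "generic": "team_generic"}
bundled = {"gfx942": "default_gfx942", "gfx90a": "default_gfx90a"}
sources = [user, shared, bundled]

print(resolve_spec("gfx942", sources))   # user_gfx942 (user overrides bundled)
print(resolve_spec("gfx90a", sources))   # team_gfx90a (shared overrides bundled)
print(resolve_spec("gfx1100", sources))  # team_generic (no exact match anywhere)
```

An alternative policy would let a higher-priority source's generic spec win over a lower-priority exact match; which behavior users expect is part of what the survey of similar projects would need to answer.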

kuhar commented 2 days ago

> > During IREE build, these tuning specs will be given to iree-opt, verified, and saved as MLIR bytecode files. These bytecode files will be embedded into the final IREE compiler binary. At runtime, the compiler will be able to access them as memory buffers.
>
> I don't think they should be included as part of the final IREE compiler binary. I think we should be able to put them in a location that iree-compile can access?

IIUC, ukernels are also stored in the compiler binary and this part mirrors that solution. This makes distribution easier because you don't have to carry an additional directory with tuning specs and worry about installing it somewhere relative to the compiler.

> We also need to provide a way for users to override, or append to, the tuning spec that gets picked up.

My plan is to append all specs (both the default arch-specific one and the user-provided one) into one transform library module, and have that embedded in an opaque attribute. I attempted to describe this in the bullet points towards the middle.

iree-org / iree

Include default tuning specs with the compiler #19214