MLIR-based code generation pipeline

philipportner commented 1 year ago

This PR aims to bring MLIR-based code generation for certain operations to DAPHNE.

As an alternative way to implement DAPHNE operators this PR provides a code generation pipeline which progressively lowers the DaphneIR available after parsing the DaphneDSL script to operations in either the same dialect or operations from other dialects. With that, DAPHNE can optionally replace certain kernels by generating code directly, and also perform a hybrid compilation approach where we mix kernel calls with code generation in order to exploit advantages of both, precompiled kernel libraries and code generation.

This includes the following features:

Documentation (Codegen.md)
Code generation passes src/compiler/lowering/
Kernels to facilitate interoperability between the memref-dialect and the DenseMatrix runtime object
CLI arguments to enable code generation (--codegen and --codegen --hybrid) and explain-passes (--explain code_gen)
End to end tests that run daphne with a DaphneDSL script as input in test/compiler/e2e/
llvm-lit based testing of the IR produced from passes / pass pipelines in test/compiler/lit/ using intermediate IR as input (.mlir files).
The daphne-opt tool, DAPHNEs own version of the mlir-opt tool with its source files in daphne-opt/ and a new build target ./build.sh --target daphne-opt.

daphne-opt is included in this PR since the lit tests require the daphne-opt target. With daphne-opt one can test passes in isolation by providing the input IR for a pass to daphne-opt and the correct flag to run the pass (or a pass pipeline) on the IR. The daphne-opt tool inherits all the functionality of the mlir-opt tool.

daphne-opt --lower-ew --debug-only=dialect-conversion ew.mlir performs the LowerEwOpPass on the input file ew.mlir while providing dialect conversion debug information.

The list of affected operations (only when daphne is executed with the --codegen flag):

MatMulOp
MapOp
Ew(Sqrt | Abs | Add | Sub | Mul | Div | Pow | Mod)Op
AllAggSumOp
ConvertDenseMatrixToMemRef
ConvertMemRefToDenseMatrix

daphne --codegen --hybrid can be used to disable code generation for certain operators in order to run a hybrid compilation pipeline (currently only the MatMulLoweringPass is affected by this.)

corepointer commented 1 year ago

Hi @philipportner, I've been following this PR a bit, as the topic is included in my line of work ;-) But I haven't tried it out yet :angel: I saw the note about the container update - will do. We need to fix the name clash as I named my codegen also codegen :see_no_evil: What about "MLIR codegen" and "CUDA codegen" ? Altough the code I produce could be used as normal C++ kernels as well with some modification :-P But as a working title it'll be sufficient. Best regards, Mark

philipportner commented 1 year ago

Hi @corepointer , thanks for looking into the container update!

Yes, we should fix that. I'll change mine to --mlir-codegen.

BR, Philipp

philipportner commented 1 year ago

@corepointer thanks for adding lit support!

I see you removed the symlinks I added to the main.yml file, I suppose this was the wrong way to add these and there is another way this should be done?

- name: "Create symlinks"
      run: |
        ln -s bin/daphne /usr/bin/daphne
        ln -s bin/daphne-opt /usr/bin/daphne-opt
        ln -s thirdpart/build/llvm-project/bin/FileCheck /usr/bin/FileCheck

Preferably, the *.mlir files used by lit can contain the RUN: directive without hardcoding the path to the binaries, e.g., // RUN: daphne-opt --lower-ew %s | FileCheck %s. Making them absolute path that work on the CI system probably means one has to change something to make them work locally. Relative paths add the burden of making sure that they are always correct during local development and work on other systems. Simply being able to write the executable name in the RUN: directive would be the best solution for developers.

corepointer commented 1 year ago

Hi! It all worked just fine locally by setting the PATH variable correctly - no symlinking needed. I tried to port that change from test.sh to main.yml because we're invoking run_tests manually. I have to look at it again why it failed in the github actions.

philipportner commented 1 year ago

Hi! It all worked just fine locally by setting the PATH variable correctly - no symlinking needed. I tried to port that change from test.sh to main.yml because we're invoking run_tests manually. I have to look at it again why it failed in the github actions.

Hey @corepointer , let me have a look. I think this is actually something that should be done in the lit-config. I didn't realize that you can do that there. We have all the executables required in bin/ and thirdparty/build/ so that should work without any changes to the main.yml.

corepointer commented 1 year ago

The executables from LLVM should already provided by the llvm-10* Ubuntu packages - that's how I tested it with the docker image locally and that worked. I added some debug output to see what runLIT() is actually doing.

philipportner commented 1 year ago

The executables from LLVM should already provided by the llvm-10* Ubuntu packages - that's how I tested it with the docker image locally and that worked. I added some debug output to see what runLIT() is actually doing.

Okay, that may work for FileCheck but the lit-based tests also need to run the new daphne-opt compiler driver. I added the paths to the lit.cfg and that should now work even if the llvm-* tools are not part of $PATH, though, seems like the lit python package is still missing. I don't think lit is part of the llvm-10* installation (see https://discourse.llvm.org/t/llvm-lit-not-built/3546/5?u=philipportner). As I anyway run this run-lit.py script through python3 in the test I think the lit PyPi package needs to be installed.

corepointer commented 1 year ago

The llvm-10-tools package was missing from the github-actions docker image. I thought that was included in the llvm meta package but it was not. The lit tests work now. The paths in lit.cfg are not needed (as it worked before without that). The lit stuff is pointed to by PYTHONPATH and the daphne-opt binary is available via $PWD/bin in PATH.

You might want to squash these 151 commits into a few topic commits (or rather sub-topics). Otherwise it will most likely end up in one huge "MLIR codegen" commit during the merge to master - not strictly forbidden but maybe it helps in error tracking some day.

pdamme commented 1 year ago

By the way: I also applied a few smaller changes myself. Feel free to have a look.

pdamme commented 1 year ago

So I think we are very close to merging this PR in. Again, those optional points can be addressed later after the merge; we should just not forget about them ;) .

philipportner commented 1 year ago

Thank you for the review @pdamme! I'll implement the requested changes tomorrow. I think the optional changes also make a lot of sense so I'd rather fix them now too :)

pdamme commented 1 year ago

Thanks for the feedback, @corepointer, and for incorporating it, @philipportner. I'll merge this PR now.

daphne-eu / daphne

MLIR-based code generation pipeline #633