certik opened this issue 1 year ago
Another idea is to bypass Triton entirely and instead add a PyTorch backend here: https://github.com/pytorch/pytorch/blob/3c46e859aa7a5721322d67b2b9af4c2d47460a18/torch/_inductor/codegen/, similar to the existing cpp backend. It would generate Python (LPython) code that LPython can then compile.
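To make the idea concrete, here is a minimal sketch of what such a backend could emit: a typed-Python loop kernel in the style LPython accepts. This is not Inductor's real codegen API; the `emit_kernel` helper and the kernel signature are hypothetical, and string annotations are used here only so the emitted source also runs under plain CPython (real LPython code would `from lpython import i32, f64` instead).

```python
# Hypothetical emitter (illustrative, not Inductor's actual API): given an
# elementwise op, produce LPython-style Python source for a fused kernel.
def emit_kernel(op: str) -> str:
    # String annotations carry LPython's type info (i32, f64) while keeping
    # the emitted source importable as ordinary Python for testing.
    return (
        "def kernel(n: 'i32', x0: 'f64[:]', x1: 'f64[:]', out: 'f64[:]') -> None:\n"
        "    i: 'i32'\n"
        "    for i in range(n):\n"
        f"        out[i] = x0[i] {op} x1[i]\n"
    )

src = emit_kernel("+")
ns = {}
exec(src, ns)  # in plain CPython the same source runs unmodified
x0, x1, out = [1.0, 2.0], [3.0, 4.0], [0.0, 0.0]
ns["kernel"](2, x0, x1, out)
# out is now [4.0, 6.0]
```

The appeal of this route is that the backend stays a pure source-to-source step: Inductor's fusion decisions are preserved, and LPython does all the lowering.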
The high-level motivation
Some real-world PyTorch benchmarks that we would like to run are at: https://github.com/pytorch/benchmark/tree/64409d5704b6136c6cb28071ff8eba61751b1b02/torchbenchmark/models. Those models compile via Triton, so if we add a Triton frontend to ASR, we will be able to compile them via LCompilers.
Details
Here are example files with real-world kernels in Triton:
The goal should be to add a frontend that can translate them to ASR, most likely by tapping into Triton's intermediate representation. Triton appears to use its own MLIR dialect as the IR: https://github.com/openai/triton/blob/bf3171f5c735ea216fb624107c807e4e026c5638/python/src/triton.cc, so taking that representation and transforming it to ASR might work (https://github.com/lcompilers/lpython/issues/2341). Triton can print the intermediate representation it sees, so we should use that.