huggingface / candle

Minimalist ML framework for Rust
Apache License 2.0
15.03k stars, 875 forks

How to introduce openai triton in candle? #1463

Open bofen97 opened 8 months ago

bofen97 commented 8 months ago

Handwritten CUDA operators are very complicated. How can we use OpenAI Triton in candle to simplify this process? :)

LaurentMazare commented 8 months ago

Triton can be used to generate PTX code, and candle already uses PTX for its current kernels (except that they are generated with the nvcc compiler rather than Triton); see for example this file. It should be pretty straightforward to hook Triton-generated PTX in at the same place. What would be good is for someone to write some tutorial material to help users do this.
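One practical detail when hooking in externally generated PTX: a loader has to look up each kernel by its entry-point name, and the name Triton emits may not match the Python function name. Below is a minimal sketch, in plain Rust with no candle or CUDA dependency, that lists the `.entry` names found in a PTX string. The helper `ptx_entry_names` and the `SAMPLE_PTX` text are hypothetical illustrations, not part of candle or Triton, and the scan assumes the kernel name sits on the same line as the `.entry` directive.

```rust
/// Hand-written PTX fragment used only for illustration (an assumption,
/// not real Triton output).
const SAMPLE_PTX: &str = r#"
.version 8.0
.target sm_80
.address_size 64

// .entry commented_out(
.visible .entry add_kernel(
    .param .u64 add_kernel_param_0
)
{
    ret;
}
"#;

/// Collect the names of all kernel entry points declared in a PTX source string.
/// PTX declares a kernel with the `.entry` directive, e.g. `.visible .entry add_kernel(`.
fn ptx_entry_names(ptx: &str) -> Vec<String> {
    let mut names = Vec::new();
    for line in ptx.lines() {
        // Drop `//` line comments so commented-out declarations are skipped.
        let code = line.split("//").next().unwrap_or("");
        let mut tokens = code.split_whitespace();
        while let Some(tok) = tokens.next() {
            if tok == ".entry" {
                if let Some(next) = tokens.next() {
                    // The name runs up to the opening parenthesis of the parameter list.
                    let name = next.split('(').next().unwrap_or("");
                    if !name.is_empty() {
                        names.push(name.to_string());
                    }
                }
            }
        }
    }
    names
}
```

Calling `ptx_entry_names(SAMPLE_PTX)` yields the single name `add_kernel`, which is the string a PTX loader would then be asked to resolve.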

bofen97 commented 8 months ago

Yes, I will try to do it. At present, the documentation of candle needs to be improved.

jeromeku commented 8 months ago

@LaurentMazare @bofen97 Happy to work on this as well. Been working on codegen for Triton kernels that support frameworks other than PyTorch. See here and here.

jeromeku commented 8 months ago

@LaurentMazare @bofen97

Created a minimal example of loading a Triton kernel in Rust.

There is still a lot to do to make interfacing with Triton-generated kernels more ergonomic, per the notes in the aforementioned repo. Let me know if this is something of interest; happy to explain or add more detailed examples.

bofen97 commented 8 months ago

minimal example

Thanks, I will study your code.