Question about rKernels

microsoft / nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.

MIT License

Hi,

Thanks for open-sourcing this wonderful project. The OSDI 2020 paper mentions that Rammer can have multiple implementations of the same operator on NVIDIA GPUs and offers matrix multiply as an example. However, I notice that Rammer directly invokes the cuBLAS vendor library to execute matrix multiplies. Does Rammer have any alternative matrix multiply implementations that can be selected at runtime? Could you also give some other examples of different rKernels in Rammer that belong to the same rOperator?

Question about rKernels #442