microsoft / antares

Antares: an automatic engine for multi-platform kernel generation and optimization. Supporting CPU, CUDA, ROCm, DirectX12, GraphCore, SYCL for CPU/GPU, OpenCL for AMD/NVIDIA, Android CPU/GPU backends.

Benchmarks #360

Open sebastienwood opened 2 years ago

sebastienwood commented 2 years ago

Hi! I was wondering whether any benchmarks are available that compare the performance of Antares, e.g. using the scripts from https://github.com/microsoft/antares/tree/v0.3.x/frameworks/pytorch/examples, across a set of backends that a user may not have at hand.

In particular, I'd be interested to know how Antares convolution fares against cuDNN.

Thanks!

ghostplant commented 2 years ago

We wrote some early benchmark scripts for TensorFlow here, but they have not been maintained since v0.3.x and may no longer work with the latest version. We haven't done the same for PyTorch yet. However, by the end of this year we'll release v0.4, which uses a different framework to show benchmarks against torch in a fair and clean way.

SamuelMarks commented 1 year ago

Any news here? - Keen to see some benchmarks! :)

ghostplant commented 1 year ago

Hi. In benchmarks, Antares should be faster than TVM/Ansor but may not ALWAYS be faster than cuBLAS. Tensor cores are not currently supported.

May I know what you need the benchmarks for, so I can give a more concrete explanation? For example:

Which platform do you want to use it on, e.g. DirectX / Windows ROCm?
Which operators do you want to use, e.g. LSTM layer / Depth-wise Conv / MoE dispatch / Matmul relu?
What is the purpose of the benchmark numbers, e.g. papers / use in production?