tracel-ai / burn

Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.
https://burn.dev
Apache License 2.0
8.66k stars 429 forks

Benchmarks WGPU: Add benchmarks for reduce operations #584

Open mmalczak opened 1 year ago

mmalczak commented 1 year ago

Feature description

Add benchmarks for reduce operations:

Reduce one dimension:

Reduce full tensor to a scalar:

An open issue already tracks improving the performance of the reduce kernels (https://github.com/burn-rs/burn/issues/536). Before starting that work, we need proper benchmarks.
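To make the two cases concrete, here is a minimal sketch of what such a benchmark could measure. It does not use Burn or WGPU at all: the tensor is a plain row-major `Vec<f32>`, and `sum_dim1`, `sum_all`, `mean_all` and `bench` are hypothetical helper names written for this illustration using only the Rust standard library. It only shows the shape of the measurement (warm-up runs, then the median of several timed runs) for a one-dimension reduce and a full reduce to a scalar.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Reduce one dimension: sum along dim 1 of a row-major `rows x cols`
/// matrix, producing one value per row.
fn sum_dim1(data: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(data.len(), rows * cols);
    data.chunks_exact(cols)
        .map(|row| row.iter().sum::<f32>())
        .collect()
}

/// Reduce the full tensor to a scalar: sum of every element.
fn sum_all(data: &[f32]) -> f32 {
    data.iter().sum()
}

/// Reduce the full tensor to a scalar: mean of every element.
fn mean_all(data: &[f32]) -> f32 {
    sum_all(data) / data.len() as f32
}

/// Run `f` `warmup` times untimed, then return the median duration of
/// `samples` timed runs. `black_box` keeps the compiler from optimising
/// the measured work away.
fn bench<T>(warmup: usize, samples: usize, mut f: impl FnMut() -> T) -> Duration {
    assert!(samples > 0, "need at least one timed sample");
    for _ in 0..warmup {
        black_box(f());
    }
    let mut times: Vec<Duration> = (0..samples)
        .map(|_| {
            let start = Instant::now();
            black_box(f());
            start.elapsed()
        })
        .collect();
    times.sort();
    times[times.len() / 2]
}

fn main() {
    // Arbitrary shape chosen for the sketch, not taken from the issue.
    let (rows, cols) = (1024, 1024);
    let data = vec![1.0_f32; rows * cols];

    let t_dim = bench(3, 10, || sum_dim1(black_box(&data), rows, cols));
    let t_sum = bench(3, 10, || sum_all(black_box(&data)));
    let t_mean = bench(3, 10, || mean_all(black_box(&data)));

    println!("sum over dim 1 (median): {:?}", t_dim);
    println!("sum to scalar  (median): {:?}", t_sum);
    println!("mean to scalar (median): {:?}", t_mean);
}
```

On a GPU backend the timed closure would also need to wait for the device to finish (for example by reading the result back) before the timer stops, since kernel launches are typically asynchronous; otherwise the numbers reflect only submission time.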

vini-fda commented 9 months ago

I'm interested in this, and I've already done some early experimentation to measure the performance of burn externally (i.e. using it as an external crate) and compare it with my own implementation. I have a few questions about these internal benchmarks before contributing:

nathanielsimard commented 9 months ago

There are some benchmarks in burn-wgpu/benches/reduction.rs, but we could put them in backend-comparison instead. We are missing benchmarks for global reduction such as mean and sum.