praeclarum / webgpu-torch

Tensor computation with WebGPU acceleration
MIT License
583 stars 17 forks

Benchmark Results #4

Closed twodawg closed 1 year ago

twodawg commented 1 year ago

M1 16GB Edge:

Benchmark Time (ms) Intel(R) Xeon(R) W-2150B CPU @ 3.00GHz NVIDIA GeForce RTX 3090 Error
unary 1d(1, 'exp') 3.19 0.0019 0.0171  
unary 1d(1, 'relu') 3.21 0.0020 0.0173  
unary 1d(1, 'sigmoid') 2.82 0.0020 0.0170  
unary 1d(1, 'tanh') 3.27 0.0022 0.0169  
unary 1d(3, 'exp') 3.28 0.0018 0.0171  
unary 1d(3, 'relu') 3.24 0.0019 0.0178  
unary 1d(3, 'sigmoid') 3.25 0.0018 0.0169  
unary 1d(3, 'tanh') 3.29 0.0023 0.0167  
unary 1d(729, 'exp') 3.33 0.0122 0.0235  
unary 1d(729, 'relu') 3.53 0.0121 0.0236  
unary 1d(729, 'sigmoid') 3.43 0.0121 0.0230  
unary 1d(729, 'tanh') 3.35 0.0138 0.0230  
unary 1d(2187, 'exp') 3.42 0.0338 0.0349  
unary 1d(2187, 'relu') 3.31 0.0333 0.0356  
unary 1d(2187, 'sigmoid') 3.48 0.0380 0.0346  
unary 1d(2187, 'tanh') 3.49 0.0403 0.0354  
unary 1d(59049, 'exp') 4.71 0.9219 0.5118  
unary 1d(59049, 'relu') 4.25 0.8622 0.5112  
unary 1d(59049, 'sigmoid') 4.25 0.9294 0.5123  
unary 1d(59049, 'tanh') 4.24 1.0564 0.5108  
unary 1d(177147, 'exp') 6.35 2.6684 1.5115  
unary 1d(177147, 'relu') 6.39 2.5438 1.4988  
unary 1d(177147, 'sigmoid') 6.38 2.7275 1.4965  
unary 1d(177147, 'tanh') 6.36 3.1460 1.4985
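
For anyone comparing their own run against the two fixed columns: the Xeon and RTX 3090 values are identical in every table in this thread, so they look like reference timings in ms (that reading is an assumption, not something the thread confirms). A rough TypeScript sketch for turning one row into a ratio could look like the following; `parseRow`, `ratioToGpuReference`, and the field names are hypothetical helpers, not part of webgpu-torch.

```typescript
// Hypothetical shape for one whitespace-separated benchmark row.
interface BenchmarkRow {
  name: string;       // e.g. "unary 1d(177147, 'exp')"
  measuredMs: number; // "Time (ms)" column, measured by the poster
  refCpuMs: number;   // Xeon W-2150B column (assumed: reference time in ms)
  refGpuMs: number;   // RTX 3090 column (assumed: reference time in ms)
}

function parseRow(line: string): BenchmarkRow {
  // The last three fields are numbers; everything before them is the name.
  const parts = line.trim().split(/\s+/);
  const n = parts.length;
  if (n < 4) {
    throw new Error("Cannot parse benchmark row: " + line);
  }
  const measuredMs = Number(parts[n - 3]);
  const refCpuMs = Number(parts[n - 2]);
  const refGpuMs = Number(parts[n - 1]);
  if (Number.isNaN(measuredMs) || Number.isNaN(refCpuMs) || Number.isNaN(refGpuMs)) {
    throw new Error("Non-numeric timing in row: " + line);
  }
  return { name: parts.slice(0, n - 3).join(" "), measuredMs, refCpuMs, refGpuMs };
}

// How many times slower the measured run is than the RTX 3090 column.
function ratioToGpuReference(row: BenchmarkRow): number {
  return row.measuredMs / row.refGpuMs;
}

const row = parseRow("unary 1d(177147, 'exp') 6.35 2.6684 1.5115");
console.log(row.name, ratioToGpuReference(row).toFixed(2));
```

For the tiny inputs the ratio is dominated by the roughly 3 ms floor that every poster sees, so the larger sizes are the more informative rows to compare.
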
xenova commented 1 year ago

RTX 2080 8GB, Windows, Chrome

Benchmark Time (ms) Intel(R) Xeon(R) W-2150B CPU @ 3.00GHz NVIDIA GeForce RTX 3090 Error
unary 1d(1, 'exp') 3.41 0.0019 0.0171
unary 1d(1, 'relu') 3.40 0.0020 0.0173
unary 1d(1, 'sigmoid') 3.43 0.0020 0.0170
unary 1d(1, 'tanh') 3.42 0.0022 0.0169
unary 1d(3, 'exp') 3.41 0.0018 0.0171
unary 1d(3, 'relu') 3.42 0.0019 0.0178
unary 1d(3, 'sigmoid') 3.47 0.0018 0.0169
unary 1d(3, 'tanh') 3.33 0.0023 0.0167
unary 1d(729, 'exp') 3.41 0.0122 0.0235
unary 1d(729, 'relu') 3.40 0.0121 0.0236
unary 1d(729, 'sigmoid') 3.55 0.0121 0.0230
unary 1d(729, 'tanh') 3.41 0.0138 0.0230
unary 1d(2187, 'exp') 3.47 0.0338 0.0349
unary 1d(2187, 'relu') 3.45 0.0333 0.0356
unary 1d(2187, 'sigmoid') 3.42 0.0380 0.0346
unary 1d(2187, 'tanh') 3.65 0.0403 0.0354
unary 1d(59049, 'exp') 5.41 0.9219 0.5118
unary 1d(59049, 'relu') 5.56 0.8622 0.5112
unary 1d(59049, 'sigmoid') 5.65 0.9294 0.5123
unary 1d(59049, 'tanh') 5.70 1.0564 0.5108
unary 1d(177147, 'exp') 8.82 2.6684 1.5115
unary 1d(177147, 'relu') 8.38 2.5438 1.4988
unary 1d(177147, 'sigmoid') 8.84 2.7275 1.4965
unary 1d(177147, 'tanh') 8.95 3.1460 1.4985
mcNets commented 1 year ago

Chrome - GTX 1060 - i7 10th gen - 32 GB

Benchmark Time (ms) Intel(R) Xeon(R) W-2150B CPU @ 3.00GHz NVIDIA GeForce RTX 3090 Error
unary 1d(1, 'exp') 3.67 0.0019 0.0171  
unary 1d(1, 'relu') 3.77 0.0020 0.0173  
unary 1d(1, 'sigmoid') 3.70 0.0020 0.0170  
unary 1d(1, 'tanh') 3.54 0.0022 0.0169  
unary 1d(3, 'exp') 3.68 0.0018 0.0171  
unary 1d(3, 'relu') 3.62 0.0019 0.0178  
unary 1d(3, 'sigmoid') 3.61 0.0018 0.0169  
unary 1d(3, 'tanh') 3.65 0.0023 0.0167  
unary 1d(729, 'exp') 3.66 0.0122 0.0235  
unary 1d(729, 'relu') 3.71 0.0121 0.0236  
unary 1d(729, 'sigmoid') 3.77 0.0121 0.0230  
unary 1d(729, 'tanh') 3.90 0.0138 0.0230  
unary 1d(2187, 'exp') 3.73 0.0338 0.0349  
unary 1d(2187, 'relu') 3.65 0.0333 0.0356  
unary 1d(2187, 'sigmoid') 3.65 0.0380 0.0346  
unary 1d(2187, 'tanh') 3.65 0.0403 0.0354  
unary 1d(59049, 'exp') 6.71 0.9219 0.5118  
unary 1d(59049, 'relu') 7.05 0.8622 0.5112  
unary 1d(59049, 'sigmoid') 6.88 0.9294 0.5123  
unary 1d(59049, 'tanh') 6.84 1.0564 0.5108  
unary 1d(177147, 'exp') 15.39 2.6684 1.5115  
unary 1d(177147, 'relu') 15.17 2.5438 1.4988  
unary 1d(177147, 'sigmoid') 15.02 2.7275 1.4965  
unary 1d(177147, 'tanh') 15.40 3.1460 1.4985

Edge

Benchmark Time (ms) Intel(R) Xeon(R) W-2150B CPU @ 3.00GHz NVIDIA GeForce RTX 3090 Error
unary 1d(1, 'exp') 4.23 0.0019 0.0171  
unary 1d(1, 'relu') 4.11 0.0020 0.0173  
unary 1d(1, 'sigmoid') 4.07 0.0020 0.0170  
unary 1d(1, 'tanh') 3.92 0.0022 0.0169  
unary 1d(3, 'exp') 4.14 0.0018 0.0171  
unary 1d(3, 'relu') 4.05 0.0019 0.0178  
unary 1d(3, 'sigmoid') 4.00 0.0018 0.0169  
unary 1d(3, 'tanh') 4.08 0.0023 0.0167  
unary 1d(729, 'exp') 4.15 0.0122 0.0235  
unary 1d(729, 'relu') 4.01 0.0121 0.0236  
unary 1d(729, 'sigmoid') 4.06 0.0121 0.0230  
unary 1d(729, 'tanh') 4.03 0.0138 0.0230  
unary 1d(2187, 'exp') 4.15 0.0338 0.0349  
unary 1d(2187, 'relu') 4.23 0.0333 0.0356  
unary 1d(2187, 'sigmoid') 4.19 0.0380 0.0346  
unary 1d(2187, 'tanh') 4.30 0.0403 0.0354  
unary 1d(59049, 'exp') 7.96 0.9219 0.5118  
unary 1d(59049, 'relu') 7.44 0.8622 0.5112  
unary 1d(59049, 'sigmoid') 7.63 0.9294 0.5123  
unary 1d(59049, 'tanh') 7.24 1.0564 0.5108  
unary 1d(177147, 'exp') 15.55 2.6684 1.5115  
unary 1d(177147, 'relu') 15.75 2.5438 1.4988  
unary 1d(177147, 'sigmoid') 15.89 2.7275 1.4965  
unary 1d(177147, 'tanh') 16.28 3.1460 1.4985
twodawg commented 1 year ago

Edge Windows 10 Intel i7 12700K, RTX 4090:

Benchmark Time (ms) Intel(R) Xeon(R) W-2150B CPU @ 3.00GHz NVIDIA GeForce RTX 3090 Error
unary 1d(1, 'exp') 3.21 0.0019 0.0171  
unary 1d(1, 'relu') 3.16 0.0020 0.0173  
unary 1d(1, 'sigmoid') 3.68 0.0020 0.0170  
unary 1d(1, 'tanh') 3.23 0.0022 0.0169  
unary 1d(3, 'exp') 3.14 0.0018 0.0171  
unary 1d(3, 'relu') 3.18 0.0019 0.0178  
unary 1d(3, 'sigmoid') 3.24 0.0018 0.0169  
unary 1d(3, 'tanh') 2.86 0.0023 0.0167  
unary 1d(729, 'exp') 3.09 0.0122 0.0235  
unary 1d(729, 'relu') 3.27 0.0121 0.0236  
unary 1d(729, 'sigmoid') 3.39 0.0121 0.0230  
unary 1d(729, 'tanh') 3.20 0.0138 0.0230  
unary 1d(2187, 'exp') 3.06 0.0338 0.0349  
unary 1d(2187, 'relu') 3.66 0.0333 0.0356  
unary 1d(2187, 'sigmoid') 3.19 0.0380 0.0346  
unary 1d(2187, 'tanh') 2.91 0.0403 0.0354  
unary 1d(59049, 'exp') 4.26 0.9219 0.5118  
unary 1d(59049, 'relu') 4.12 0.8622 0.5112  
unary 1d(59049, 'sigmoid') 4.81 0.9294 0.5123  
unary 1d(59049, 'tanh') 4.17 1.0564 0.5108  
unary 1d(177147, 'exp') 6.27 2.6684 1.5115  
unary 1d(177147, 'relu') 6.45 2.5438 1.4988  
unary 1d(177147, 'sigmoid') 6.39 2.7275 1.4965  
unary 1d(177147, 'tanh') 6.83 3.1460 1.4985
praeclarum commented 1 year ago

Thank you everyone!

Since switching the library to lazy evaluation, these numbers are now a bit old. I'm going to close this thread, but feel free to post new numbers!
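
As a general illustration of why lazy evaluation can make older timings hard to compare: with a lazy design, calling an op only records the work, and nothing runs until the data is read. The sketch below shows the idea with a hypothetical `LazyValue` class. It is a stand-in written for this example and is not the webgpu-torch API.

```typescript
// Hypothetical stand-in for a lazily evaluated tensor (not the webgpu-torch API).
class LazyValue {
  private readonly compute: () => number[];
  private cached: number[] | null = null;

  constructor(compute: () => number[]) {
    this.compute = compute;
  }

  // Recording an op returns a new lazy node without running anything.
  map(fn: (x: number) => number): LazyValue {
    return new LazyValue(() => this.toArray().map((x) => fn(x)));
  }

  // Reading the data forces the pending chain to run, once.
  toArray(): number[] {
    if (this.cached === null) {
      this.cached = this.compute();
    }
    return this.cached;
  }
}

// Count how many elements have actually been computed.
let elementsComputed = 0;
const countingExp = (x: number): number => {
  elementsComputed += 1;
  return Math.exp(x);
};

const input = new LazyValue(() => [0, 1, 2]);
const pending = input.map(countingExp);
const computedAfterBuild = elementsComputed; // 0: the op was only recorded

const values = pending.toArray();
const computedAfterRead = elementsComputed; // 3: reading forced the work

console.log(computedAfterBuild, computedAfterRead, values.length);
```

A timer that stops right after `map` returns would measure only the bookkeeping, so a benchmark of a lazy library has to read the result back (here via `toArray`) before stopping the clock.
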