naitoh closed this 6 years ago
Wow! I do not understand anything about the code, but the performance results look great.
I have made additional improvements in c61b5e7.
$ ruby broadcast_fp32_0.rb
                      user     system      total        real
x.inplace + y     6.472457   0.008103   6.480560 (  6.488018)
x.inplace + 1.0   3.067979   0.002201   3.070180 (  3.071858)
x.inplace + z     3.690922   0.005546   3.696468 (  3.697989)
x.inplace - y     6.192707   0.004674   6.197381 (  6.199437)
x.inplace - 1.0   3.379623   0.003397   3.383020 (  3.385782)
x.inplace - z     3.495210   0.004582   3.499792 (  3.502072)
x.inplace * y     6.091861   0.013382   6.105243 (  6.115937)
x.inplace * 1.0   3.299177   0.002403   3.301580 (  3.331252)
x.inplace * z     3.556242   0.004560   3.560802 (  3.564021)
x.inplace / y     6.605212   0.004621   6.609833 (  6.613972)
x.inplace / 1.0   5.443426   0.005829   5.449255 (  5.451560)
x.inplace / z     5.416341   0.009099   5.425440 (  5.430727)
Thank you so much.
@naitoh awesome
Thank you for the merge.
I wrote a patch to improve broadcast performance. (See #93 for details.)
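For readers unfamiliar with the term, here is a minimal pure-Ruby sketch of what broadcasting means for an element-wise addition. It is only a conceptual model on nested Arrays, not how numo-narray implements it. The helper name `broadcast_add` and all operand shapes are hypothetical: the shapes of `x`, `y`, and `z` used in the benchmark are not shown in this thread, so the three cases below (a length-1 axis, a scalar, a same-shape operand) are assumptions chosen for illustration.

```ruby
# Conceptual model of broadcasting on nested Ruby Arrays.
# Hypothetical helper; this is NOT numo-narray's implementation.

# Adds `other` to the 2-D array `x`. `other` may be a scalar or a 2-D array
# whose axes each have either length 1 or the same length as in `x`.
def broadcast_add(x, other)
  rows = x.length
  cols = x.first.length

  # Scalar case: the same value is added to every element.
  return x.map { |row| row.map { |v| v + other } } if other.is_a?(Numeric)

  o_rows = other.length
  o_cols = other.first.length
  unless [1, rows].include?(o_rows) && [1, cols].include?(o_cols)
    raise ArgumentError,
          "shape [#{o_rows}, #{o_cols}] cannot broadcast to [#{rows}, #{cols}]"
  end

  Array.new(rows) do |i|
    Array.new(cols) do |j|
      # An axis of length 1 always reads index 0, so its value is repeated.
      x[i][j] + other[o_rows == 1 ? 0 : i][o_cols == 1 ? 0 : j]
    end
  end
end

x    = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]] # shape [2, 3]
col  = [[10.0], [20.0]]                   # shape [2, 1], repeated across columns
same = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]] # shape [2, 3], no repetition needed

with_column = broadcast_add(x, col)
with_scalar = broadcast_add(x, 1.0)
with_same   = broadcast_add(x, same)
```

The index selection inside the inner loop is the part a broadcasting implementation has to make cheap, since it runs once per element.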
Benchmarked code
numo-narray (0.9.1.2)
numo-narray (this pull request)
Environment