christopherzimmerman closed 4 years ago
@wontruefree if you look at what used to be the arrayops folder, I had several macros that each did basically the whole job, plus a lot of repeated code to broadcast tensors against each other, broadcast scalars, and upcast scalars. I finally got around to implementing all of that in one central location.
Also, you'll notice a lot of the benchmarks are much faster, by almost 5x in some cases for reductions and accumulations.
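For readers unfamiliar with the term, here is a rough sketch of the kind of shape logic a central broadcasting routine has to handle. It is written in Python for illustration only and is not the code from this PR. The function name `broadcast_shape` is hypothetical, and it assumes the common NumPy-style rule: shapes are aligned from the trailing dimension, and a dimension of size 1 stretches to match the other operand. A scalar is modelled as the empty shape `()`.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the shape that shapes `a` and `b` broadcast to.

    Hypothetical helper, assuming NumPy-style broadcasting rules.
    Raises ValueError when the shapes are incompatible.
    """
    result = []
    # Walk both shapes from the trailing dimension; the shorter
    # shape is padded with 1s on the left via fillvalue.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"cannot broadcast {a} against {b}")
    return tuple(reversed(result))


# Tensor against tensor: size-1 dimensions stretch to match.
print(broadcast_shape((3, 1), (1, 4)))
# Tensor against scalar: the scalar has the empty shape ().
print(broadcast_shape((2, 3), ()))
```

Keeping this rule in one function means tensor-tensor and tensor-scalar operations can share the same path instead of each operation carrying its own copy.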
Awesome, I was just curious.
What was this rewrite for?