I ran into a use case where I needed a weighted sum in the `logsumexp` function while porting some Python code.
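For reference, a weighted log-sum-exp computes log(sum_i w_i * exp(x_i)), which is comparable to what the `b` argument of SciPy's `logsumexp` provides. The sketch below is only an illustration in plain Python, not the code in this PR: the name `weighted_logsumexp` and its signature are hypothetical, and it assumes finite inputs and non-negative weights with a positive total.

```python
import math

def weighted_logsumexp(xs, ws):
    """Return log(sum(w * exp(x))) for paired values xs and weights ws.

    Hypothetical helper for illustration only. Assumes xs are finite and
    ws are non-negative with at least one positive weight.
    """
    if len(xs) != len(ws):
        raise ValueError("xs and ws must have the same length")
    # Subtract the maximum before exponentiating so exp() cannot overflow.
    m = max(xs)
    total = sum(w * math.exp(x - m) for x, w in zip(xs, ws))
    # Add the maximum back in log space.
    return m + math.log(total)
```

With all weights equal to 1 this reduces to the ordinary log-sum-exp, and the shift by the maximum keeps large inputs such as 1000.0 from overflowing.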
There are no performance regressions in the new version. I also removed the TODO about special-casing the value `u`, because without that special-casing the performance is much worse (30 ns -> 1 us) for cases where `dims=:`.
Here are the benchmarks I ran to make sure the simplest case is still super fast and allocation-free: