JuliaStats / StatsFuns.jl

Mathematical functions related to statistics.
Other
235 stars 40 forks source link

logsumexp for streaming data #91

Closed cossio closed 4 years ago

cossio commented 4 years ago

Consider adding a version of logsumexp for streaming data. I found this here:

http://www.nowozin.net/sebastian/blog/streaming-log-sum-exp-computation.html

which includes the following Julia code:

function logsumexp_stream(X)
    alpha = -Inf
    r = 0.0
    for x = X
        if x <= alpha
            r += exp(x - alpha)
        else
            r *= exp(alpha - x)
            r += 1.0
            alpha = x
        end
    end
    log(r) + alpha
end

It has the advantage that a single pass over the data is required.

Pinging @tpapp in case this should go into https://github.com/tpapp/LogExpFunctions.jl.

I can make a PR myself if people agree into adding this.

nalimilan commented 4 years ago

Interesting. Have you benchmarked the two approaches?

cossio commented 4 years ago

@nalimilan The original blog post has some benchmarks:

n = 10_000_000
X = 500.0*randn(n)

julia> @btime logsumexp_batch(X)
92.695 ms (3 allocations: 76.29 MiB)
2600.631695542992

julia> @btime log(sum(exp.(X)))
136.538 ms (6 allocations: 76.29 MiB)
Inf

julia> @btime logsumexp_stream(X)
29.611 ms (1 allocation: 16 bytes)
2600.631695542992