rawrgrr / Levenshtein.jl

Levenshtein distance between two strings in julia
MIT License
14 stars 15 forks source link

Speed up algorithm by fixing type stability #5

Closed IainNZ closed 9 years ago

IainNZ commented 9 years ago

Before

IAINMAC:test idunning$ jl runtests.jl
elapsed time: 0.235095972 seconds (36611688 bytes allocated, 16.01% gc time)

After

IAINMAC:test idunning$ jl runtests.jl
elapsed time: 0.013036577 seconds (9155016 bytes allocated)

For

srand(1234)
N = 1000
s = randString(ifloor(rand() * N))
t = randString(ifloor(rand() * N))
ins, del, sub = rand() * 1000, rand() * 1000, rand() * 1000
@time levenshtein(s, t, del, ins, sub)
rawrgrr commented 9 years ago

Woah! Super cool!

milktrader commented 9 years ago

@IainNZ is the speed up from parameterizing types or is it mostly from pre-allocating? And what do you mean by type stability?

IainNZ commented 9 years ago

Well not exactly type stability. Before the cost vectors were stored as Vector{Real}, which is an abstract type, so all the values were boxed. By using the parametric types, we can figure out a tighter type. So e.g. if you passed in a mix of Float64s and Ints it'll use promote_type to identify that we can use a Vector of Float64s.

milktrader commented 9 years ago

Ah, I see. Thanks.