andyferris / Dictionaries.jl

An alternative interface for dictionaries in Julia, for improved productivity and performance
Other
278 stars 28 forks source link

Add `setwith!` #117

Closed theogf closed 5 months ago

theogf commented 1 year ago

This PR proposes a new feature which is an equivalent of mergewith! with a Dictionary containing only one element. This is much faster than building a Dictionary every time. I did not use token as the get approach was faster. Here are some benchmarks:

julia> using BenchmarkTools, Dictionaries
julia> function tokensetwith!(f, d::AbstractDictionary{I}, i, value) where {I}
    i2 = safe_convert(I, i)
    hastoken, token = gettoken!(d, i2)
    hastoken ? settokenvalue!(d, token, f(gettokenvalue(d, token), value)) : settokenvalue!(d, token, value) 
    return d
end
julia> @btime mergewith!(+, d, Dictionary(Ref(2), Ref(2.0))) setup=(d=Dictionary(1:5, rand(5)))
  131.679 ns (6 allocations: 448 bytes)
julia> @btime setwith!(+, d, 2, 2.0) setup=(d=Dictionary(1:5, rand(5)))
  7.993 ns (0 allocations: 0 bytes)
julia> @btime tokensetwith!(+, d, 2, 2.0) setup=(d=Dictionary(1:5, rand(5)))
  8.785 ns (0 allocations: 0 bytes)

Additionaly I started to use safe_convert instead of converting and equality checking to avoid code duplication,

theogf commented 1 year ago

Ping @andyferris :)

andyferris commented 5 months ago

One more thing I'm thinking of here, is what to do in the case you want to insert a new mutable object in the "default" case, like an empty vector that you will push! to later? Or what if you want a random number, or want to grab the value from an external database?

For get! we have a form where the first argument is a zero-argument function that gets called in order to construct the value. I'm wondering if that is the pattern we should use here - f has two methods, one with ~zero~ one argument and another with ~one argument~ two arguments. (Or we take two functions. Or we keep the existing setwith! method and add a second one. Or we add a method to setwith! that accepts a Pair.) What do you think?

andyferris commented 5 months ago

So I guess I wonder if something like this would work ok?

function setwith!(f, d::AbstractDictionary{I}, i, value) where {I}
    i2 = safe_convert(I, i)
    (had_token, token) = gettoken!(d, i2)
    settoken!(d, token, had_token ? f(gettokenvalue(d, token), value) : f(value))
    return d
end

(Also, note the usage of gettoken! and gettokenvalue and settoken! here, which is faster and gets around needing a nothing sentinel)

andyferris commented 5 months ago

Hmm, I see now that the original mergewith! only supports simple reductions (not general folds). I also note that the "mutation" case is already served well by mutating the result of get!.

So probably the remaining work is to use gettoken! etc here.

codecov[bot] commented 5 months ago

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (0598d92) 80.12% compared to head (7f7c65b) 80.40%.

Files Patch % Lines
src/insertion.jl 89.47% 2 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #117 +/- ## ========================================== + Coverage 80.12% 80.40% +0.28% ========================================== Files 20 20 Lines 2360 2358 -2 ========================================== + Hits 1891 1896 +5 + Misses 469 462 -7 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

andyferris commented 5 months ago

And just to check the benchmark at the top:

julia> using BenchmarkTools, Dictionaries

julia> function tokensetwith!(f, d::AbstractDictionary{I}, i, value) where {I}
           i2 = Dictionaries.safe_convert(I, i)
           hastoken, token = gettoken!(d, i2)
           @inbounds(hastoken ? settokenvalue!(d, token, f(gettokenvalue(d, token), value)) : settokenvalue!(d, token, value))
           return d
       end
tokensetwith! (generic function with 1 method)

julia> @btime mergewith!(+, d, Dictionary(Ref(2), Ref(2.0))) setup=(d=Dictionary(1:5, rand(5)))
  122.996 ns (6 allocations: 448 bytes)
5-element Dictionary{Int64, Float64}
 1 │ 0.6762046077805732
 2 │ 1796.0759530049804
 3 │ 0.15434878980223343
 4 │ 0.27622327088960563
 5 │ 0.6259347665546298

julia> @btime setwith!(+, d, 2, 2.0) setup=(d=Dictionary(1:5, rand(5)))
  8.033 ns (0 allocations: 0 bytes)
5-element Dictionary{Int64, Float64}
 1 │ 0.6825175785534227
 2 │ 1998.4092353771443
 3 │ 0.9883831681856018
 4 │ 0.8871562355778311
 5 │ 0.2561037264701246

julia> @btime tokensetwith!(+, d, 2, 2.0) setup=(d=Dictionary(1:5, rand(5)))
  8.039 ns (0 allocations: 0 bytes)
5-element Dictionary{Int64, Float64}
 1 │ 0.6663394005518313
 2 │ 1998.6949422352689
 3 │ 0.48414147097987303
 4 │ 0.32928579879421094
 5 │ 0.26274353692197994
andyferris commented 5 months ago

Thanks again @theogf - only took me a year...