JuliaCollections / DataStructures.jl

Julia implementation of Data structures
https://juliacollections.github.io/DataStructures.jl/latest/
MIT License
679 stars 242 forks source link

DefaultDict accessor fails when the default is not the value type #901

Open tpapp opened 3 months ago

tpapp commented 3 months ago

MWE:

pkg> st DataStructures
  [864edb3b] DataStructures v0.18.18

julia> using DataStructures

julia> dd = DefaultDict(1, :a => (2,))
DefaultDict{Symbol, Tuple{Int64}, Int64} with 1 entry:
  :a => (2,)

julia> dd[:b]
ERROR: MethodError: Cannot `convert` an object of type Int64 to an object of type Tuple{Int64}

Closest candidates are:
  convert(::Type{T}, ::T) where T<:Tuple
   @ Base essentials.jl:451
  convert(::Type{T}, ::T) where T
   @ Base Base.jl:84
  convert(::Type{T}, ::CartesianIndex) where T<:Tuple
   @ Base multidimensional.jl:136
  ...

Stacktrace:
 [1] get!
   @ Base ./dict.jl:481 [inlined]
 [2] get!
   @ Base ./abstractdict.jl:552 [inlined]
 [3] getindex
   @ DataStructures ~/.julia/packages/DataStructures/jFDPC/src/default_dict.jl:63 [inlined]
 [4] getindex(a::DefaultDict{Symbol, Tuple{Int64}, Int64}, args::Symbol)
   @ DataStructures ~/.julia/packages/DataStructures/jFDPC/src/delegate.jl:21
 [5] top-level scope
   @ REPL[111]:1
tpapp commented 3 months ago

FWIW, I don't understand why it calls get!, and not get.

oxinabox commented 3 months ago

FWIW, I don't understand why it calls get!, and not get.

By calling get! and not get it means one can do things like:

dict_of_lists = DefaultDict(Vector)
for (key, item) in (:a=>1, :b=>1, :a=>10)
    push!(dict_of_lists[key], item)
end

Which is very useful -- in my experience this kinda pattern of a collection of collections is the most common use for a DefaultDict


I don't undestand there being heavy demand for if the default isn't the same type. It would mean getindex is type unstable, which is problematic both from a code speed POV, but also from like working out what operations are legal on it.

tpapp commented 3 months ago

What I missed is that whenever access falls back to the default, it is stored in the parent dict, as in

julia> using DataStructures

julia> dd = DefaultDict(1, :a => 2)
DefaultDict{Symbol, Int64, Int64} with 1 entry:
  :a => 2

julia> dd[:b]
1

julia> dd
DefaultDict{Symbol, Int64, Int64} with 2 entries:
  :a => 2
  :b => 1

I find this design somewhat surprising, as it just replicates get!.

From the docs it was my expectation that getindex just falls back to the default without changing the parent.

oxinabox commented 3 months ago

Yes, it replicates get! but with the ability to specify what to do at declaration time, rather than at access time. I personally use get! on a regular Dict far more often than wanting to use a different type. The alternative would be to replicate get, but as shown in my example above this cuts out a key use case. We should improve its documentation