Closed mkschulze closed 3 years ago
The idea of distict indices is pretty ingrained into the AbstractDictionary
interface here, so this would require some thought.
I'm interested in what case people use a multidict for? I do frequently use a dictionary of vectors (or other collection), for example like this:
dict = Dictionary{I, Vector{T}}()
for x in input
index = f(x)
value = g(x)
push!(get!(Vector{T}, dict, index), value)
end
(This is essentially the SplitApplyCombine.group(f, g, dict)
method).
What do you use a multidict for? How are they created? How are they consumed? (I'm honestly asking because I've never used one in anger).
What I have in mind is something like Werkzeugs MultiDict in Python: https://werkzeug.palletsprojects.com/en/1.0.x/datastructures/#werkzeug.datastructures.MultiDict
>>> d = MultiDict([('a', 'b'), ('a', 'c')])
>>> d
MultiDict([('a', 'b'), ('a', 'c')])
>>> d['a']
'b'
>>> d.getlist('a')
['b', 'c']
>>> 'a' in d
True
Maybe it could be done with value collection:
+ get!(dict, key, empty_collection)
There is also a MultiDict in DataStructures: https://github.com/JuliaCollections/DataStructures.jl/blob/master/src/multi_dict.jl
From https://pypi.org/project/multidict/
HTTP Headers and URL query string require specific data structure: multidict. It behaves mostly like a regular dict but it may have several values for the same key and preserves insertion ordering.
Right. This structure is basically a vector of Pair
s, since the keys aren't unique. I suspect Vector{Pair{String, String}}
would be a reasonable choice for headers and URL query strings, given their typical size limits.
In other cases these kinds of vectors get long and you want to find all the instances of a certain key. You could just run the vector of Pair
s through through group(first, last, vector)
.
In other cases I've had to do this kind of lookup efficiently before (via a hash map, not a scan), I'm usually thinking of tabular data, with a (non-unique) key column and a value column. In a SQL database you'd add a secondary index. In Julia I use HashIndex
from AcceleratedArrays.
Haha, simultaneous posting.
Maybe we just need a shorthand for push!(get!(Vector{T}, dict, index), value)
?
I'm a beginner, so my technical creativity is limited yet. But I try to rebuild a small CMS like project I know from Python into Julia. This is the code I try to translate:
https://gist.github.com/mkschulze/9c623d4e885ace6f71452dcbff8b8ab0
Maybe we just need a shorthand for
push!(get!(Vector{T}, dict, index), value)
?
Like so, maybe:
function pushgroup!(d::AbstractDictionary, key, value)
T = eltype(d)
if T = Any
T = Vector{Any}
end
push!(get!(T, d, key), value)
return d
end
This could potentially even be exported by SplitApplyCombine
...
This is the code I try to translate:
OK interesting. In that case, the code is just flattening the multidict into a normal dict, ignoring all but the first value.
For this, you could use dictionary
. Like this:
julia> dictionary([1 => 1, 1 => 2, 2 => 3])
2-element Dictionary{Int64,Int64}
1 │ 1
2 │ 3
The dictionary
function should work on any iterable of pairs (or two-tuples, two-vectors, etc).
Is that helpful?
I think it is yes! Thank you.
Just leaving this request here to raise some awareness of multidics and for a possible consideration. Thx