JuliaData / YAML.jl

Parse yer YAMLs

Duplicate keys in mapping #140

Open sstroemer opened 4 months ago

sstroemer commented 4 months ago

According to the YAML spec, the keys of a mapping must be unique: "The content of a mapping node is an unordered set of key/value node pairs, with the restriction that each of the keys is unique."

YAML.jl does not enforce this, so a duplicate key silently overwrites the earlier entry:

using YAML

YAML.load("x: 3")         # Dict{Any, Any} with 1 entry: "x" => 3
YAML.load("x: 3\nx: 4")   # Dict{Any, Any} with 1 entry: "x" => 4     (wrong: the first value is silently dropped)

A stopgap, for anyone who runs into this, is to pass a dicttype that rejects duplicate keys:

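# Thin wrapper around Dict whose setindex! refuses to overwrite an existing key.
# Base.@kwdef also works on Julia versions where @kwdef is not yet exported.
Base.@kwdef struct UniqueKeyDict{T1, T2} <: AbstractDict{T1, T2}
    dict::Dict{T1, T2} = Dict{T1, T2}()
end

function Base.setindex!(ukd::UniqueKeyDict, value, key)
    # Fail loudly on a duplicate key instead of replacing the stored value.
    haskey(ukd.dict, key) && error("Key $key already exists in dictionary.")
    ukd.dict[key] = value
end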

YAML.load("x: 3")                                               # Dict{Any, Any} with 1 entry: "x" => 3
YAML.load("x: 3\nx: 4")                                         # Dict{Any, Any} with 1 entry: "x" => 4
YAML.load("x: 3"; dicttype=UniqueKeyDict{Any, Any}).dict        # Dict{Any, Any} with 1 entry: "x" => 3
YAML.load("x: 3\nx: 4"; dicttype=UniqueKeyDict{Any, Any}).dict  # throws an error

The wrapper's internal dict field holds a plain Dict that can be used for further processing. The same workaround applies to OrderedDict and similar types.
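If unwrapping the dict field everywhere gets tedious, the wrapper can also forward the read side of the AbstractDict interface, so the object is usable (and printable) directly. This is only a minimal sketch: StrictDict is a hypothetical name chosen here to avoid clashing with the UniqueKeyDict definition above, and it assumes that forwarding length, iterate, and get is enough for the lookups you need.

```julia
# Hypothetical variant of the workaround: same duplicate-key check,
# plus the read-side methods forwarded to the wrapped Dict.
Base.@kwdef struct StrictDict{K, V} <: AbstractDict{K, V}
    dict::Dict{K, V} = Dict{K, V}()
end

function Base.setindex!(sd::StrictDict, value, key)
    # Reject a key that is already present instead of overwriting it.
    haskey(sd.dict, key) && error("Key $key already exists in dictionary.")
    sd.dict[key] = value
    return sd
end

# Minimal read interface; Base derives getindex, keys, values, and show from these.
Base.length(sd::StrictDict) = length(sd.dict)
Base.iterate(sd::StrictDict, state...) = iterate(sd.dict, state...)
Base.get(sd::StrictDict, key, default) = get(sd.dict, key, default)

cfg = StrictDict{Any, Any}()
cfg["x"] = 3
println(cfg["x"])      # lookup goes through the forwarded get
println(length(cfg))
```

With these methods in place, the result of YAML.load(...; dicttype=StrictDict{Any, Any}) should be usable without reaching into the dict field, though I have only sketched this and not checked it against every code path in YAML.jl.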

kescobo commented 4 months ago

Using insert! from Dictionaries.jl is how I would do it (though to be clear, I'm not advocating adding that dependency).