UcasRichard / snakeyaml

Automatically exported from code.google.com/p/snakeyaml
Apache License 2.0
0 stars 0 forks source link

merge should favor last key in map #139

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
scala> import org.yaml.snakeyaml.Yaml
import org.yaml.snakeyaml.Yaml

scala> val yaml = """
     | _common: &common
     |   key:
     |     value: 1
     |   key:
     |     value: 2
     | 
     | production:
     |   <<: *common
     | """

scala> (new Yaml).load(yaml)
res0: java.lang.Object = {_common={key={value=2}}, production={key={value=1}}}

I expect:
scala> (new Yaml).load(yaml)
res0: java.lang.Object = {_common={key={value=2}}, production={key={value=2}}}

That is: production.key should be 2 not 1 - the second value overrides the 
first.

Using SnakeYAML 1.9, Java 1.6.0_29, Scala 2.8.

FWIW, here's what Ruby's YAML does (1.8.7, 1.9.2):
>> require 'yaml'

>> yaml = "
_common: &common
  key:
    value: 1
  key:
    value: 2

production:
  <<: *common
"

>> YAML.load(yaml)
=> {"_common"=>{"key"=>{"value"=>2}}, "production"=>{"key"=>{"value"=>2}}}

Original issue reported on code.google.com by davidahe...@gmail.com on 23 Jan 2012 at 1:38

GoogleCodeExporter commented 9 years ago
the spec is not very clear: http://yaml.org/type/merge.html
Read: "unless the key already exists in it."

A few tests were added to identify the problem: 
http://code.google.com/p/snakeyaml/source/browse/src/test/java/org/yaml/snakeyam
l/issues/issue139/MergeValueTest.java

In general, I think the topic is not clearly specified. Can you please point us 
to the place in the YAML 1.1 specification where is says that if the keys in 
the mapping are not unique the latter key must override the former one?
I am afraid this is a common understanding (also implemented in SnakeYAML), but 
it is not specified.

Original comment by py4fun@gmail.com on 23 Jan 2012 at 10:15

GoogleCodeExporter commented 9 years ago
As long as I agree with the $subj and it seems close to  "common sene", spec is 
clear enough, I think. Especially about mapping (http://yaml.org/type/map.html):

  Associative container, where each *key is unique in the association* and mapped to exactly one value. 
  YAML places no restrictions on the type of keys; in particular, they are not restricted to being scalars.
  Example bindings include Perl’s hash, Python’s dictionary, and Java’s Hashtable.

Maybe MappingNode implementation should follow the same thing - duplicate keys 
are filtered when MappingNode is filled (now it is a List<NodeTuple>)

Original comment by alexande...@gmail.com on 23 Jan 2012 at 10:52

GoogleCodeExporter commented 9 years ago
I have asked more info in the YAML mailing list:
http://sourceforge.net/mailarchive/forum.php?thread_name=CAEO54bEsria59a%3D24N28
HnDaS2tfG6twRMe2WPPBWL3_0W9G0g%40mail.gmail.com&forum_name=yaml-core

Feel free to add your comments.

Original comment by py4fun@gmail.com on 24 Jan 2012 at 1:39

GoogleCodeExporter commented 9 years ago
As far as the spec intent goes, the described behavior is wrong on several 
levels. First, it should report the duplicate key as an error (I know, it isn't 
the only implementation to do that). Second, having decided not to do that, it 
should give the key the same value in both mappings. I can say that with some 
authority as the spec author :-)

As far of whether the letter of the spec also requires this, then (as quoted 
above) the spec in  http://yaml.org/type/merge.html says, about the merge key:

"If the value associated with the key is a single mapping node, each
of its key/value pairs is inserted into the current mapping,
unless the key already exists in it. "

Since it is talking about a mapping *node* (and not about YAML *text*), then 
(as quoted above) the following applies to it: "Associative container, where 
each key is unique in the association and mapped to exactly one value". That 
is, the mapping node being merged has, *by definition*, only one value 
associated with each key.

According to the code fragment that says:

res0: java.lang.Object = {_common={key={value=2}}, production={key={value=1}}}

Then the mapping node being merged (which is the value associated with the 
"_common" key) ended up associating "{value=2}" with the key. Therefore, the 
merge key has no choice but to merge the same value into the "production" 
mapping.

Put another way, the merge key is explicitly defined in terms of the "node" - 
actual data-that-was-already-parsed-and-loaded-and-is-now-being-merged rather 
than working in terms of the "text" - 
reparsing-the-original-text-of-the-merged-entry-and-providing-different-treatmen
t-for-duplicate-keys.

Hope this helps,

    Oren Ben-Kiki

Original comment by o...@ben-kiki.org on 24 Jan 2012 at 2:48

GoogleCodeExporter commented 9 years ago
Fixed. (Thanks to Alex for the suggested solution.)
Try the latest snapshot: 
https://oss.sonatype.org/content/groups/public/org/yaml/snakeyaml/1.10-SNAPSHOT/

Please let us know if it works for you. The fix will be delivered in version 
1.10 (planned for February 2012)

Original comment by py4fun@gmail.com on 24 Jan 2012 at 5:46

GoogleCodeExporter commented 9 years ago

Original comment by py4fun@gmail.com on 26 Jan 2012 at 11:07