mikefarah / yq

yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor
https://mikefarah.gitbook.io/yq/
MIT License

!!merge overwrites existing keys #2110

Open majewsky opened 1 month ago

majewsky commented 1 month ago

Describe the bug

Upon switching from python-yq to go-yq, I discovered that some existing YAML files are interpreted differently. Specifically, go-yq handles the << merge key differently from what the spec prescribes. The spec says:

If the value associated with the key is a single mapping node, each of its key/value pairs is inserted into the current mapping, unless the key already exists in it.
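The rule quoted above can be sketched in a few lines of Python. This is only an illustration of the spec wording under stated assumptions, not code from yq or python-yq: the function name `merge_per_spec` and the list-of-pairs representation are hypothetical, and only the single-mapping form of the merge value is handled. The sample data mirrors the "ellipse" and "egg" objects from this report.

```python
MERGE_KEY = "<<"

def merge_per_spec(entries):
    """Apply the quoted merge rule to one mapping.

    `entries` is a list of (key, value) pairs in document order
    (a hypothetical representation chosen for this sketch). The value
    of the merge key is assumed to be a single mapping (a dict).
    """
    # Keys written explicitly in the mapping always win.
    result = {k: v for k, v in entries if k != MERGE_KEY}
    for key, value in entries:
        if key == MERGE_KEY:
            for merged_key, merged_value in value.items():
                # Insert only "unless the key already exists".
                result.setdefault(merged_key, merged_value)
    return result

circle = {"name": "circle", "shape": "round"}

# Merge key after the explicit key (the "ellipse" case).
ellipse = merge_per_spec([("name", "ellipse"), (MERGE_KEY, circle)])
# Merge key before the explicit key (the "egg" case).
egg = merge_per_spec([(MERGE_KEY, circle), ("name", "egg")])

print(ellipse["name"], egg["name"])  # ellipse egg
```

Under this reading, the position of the merge key inside the mapping makes no difference to the resulting values.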

As shown in the example below (specifically, in the object called "ellipse"), go-yq appears to overwrite existing keys during merging. The object called "egg" demonstrates a workaround, in which the merge key is placed before all other keys to avoid this bug.
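One way to picture why the position of the merge key matters here is a model that applies entries strictly in document order and lets the merge overwrite whatever is already present. This is a hypothetical model that happens to match the values reported in this issue; it is not taken from yq's source code, the name `merge_in_document_order` is made up for illustration, and key order in the output is not modeled.

```python
MERGE_KEY = "<<"

def merge_in_document_order(entries):
    """Hypothetical model: process (key, value) pairs top to bottom.

    A merge entry overwrites keys that were set earlier, and later
    explicit keys overwrite merged ones. This is an assumption made
    for illustration, not yq's actual implementation.
    """
    result = {}
    for key, value in entries:
        if key == MERGE_KEY:
            result.update(value)  # overwrites earlier explicit keys
        else:
            result[key] = value   # overwrites earlier merged keys
    return result

circle = {"name": "circle", "shape": "round"}

# "ellipse": the merge comes last, so it clobbers the explicit name.
ellipse = merge_in_document_order([("name", "ellipse"), (MERGE_KEY, circle)])
# "egg": the merge comes first, so the explicit name survives.
egg = merge_in_document_order([(MERGE_KEY, circle), ("name", "egg")])

print(ellipse["name"], egg["name"])  # circle egg
```

In this model the workaround works because any key written after the merge key replaces the merged value.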

Version of yq: 4.44.2
Operating system: Arch Linux
Installed via: System package (pacman)

Input Yaml

objects:
  - &circle
    name: circle
    shape: round
  - name: ellipse
    !!merge <<: *circle
  - !!merge <<: *circle
    name: egg

Command

yq -o json input.yaml

Actual behavior

{
  "objects": [
    {
      "name": "circle",
      "shape": "round"
    },
    {
      "name": "circle",
      "shape": "round"
    },
    {
      "shape": "round",
      "name": "egg"
    }
  ]
}

Expected behavior

{
  "objects": [
    {
      "name": "circle",
      "shape": "round"
    },
    {
      "name": "ellipse",
      "shape": "round"
    },
    {
      "shape": "round",
      "name": "egg"
    }
  ]
}

mikefarah commented 1 month ago

Ooh interesting 🤔 I don't know how I missed that. It's a pity they didn't have an example like yours, where the merge key is after the key values - I didn't think of it 😮‍💨

I'm a little worried that if I change this behavior now it would break a bunch of data pipelines for current users - I don't think this bug can be safely fixed.

I think that this is probably one of the reasons the merge key has been removed from the 1.2 spec :/ without reading the spec super clearly it's easy to have different expectations on how it would work :(

wenhoujx commented 1 day ago

Just ran into this issue and spent hours debugging it. The spec's behavior is a little weird, and I appreciate the concern that fixing it might break many data pipelines, but it would be good to have a flag that enables behavior matching the YAML spec. Without a yq that behaves like all the other YAML tools, I can't use or trust yq in a work environment where we use anchors extensively.