geofffranks / spruce

A BOSH template merge tool
MIT License

The result of merging depends on the order of data in a list #357

Open kazh000 opened 2 years ago

kazh000 commented 2 years ago

When I merge the following two YAML files, I get the expected result:

---
a:
- c: 1
  z:
  - x: 20
    "y": 25
- c: 2
  z:
  - x: 2
- [3, 4]
b: 30
---
a:
- c: 1
  z:
  - x: 20
    "y": 30
- c: 2
  z:
  - x: 3

The result:

a:
- c: 1
  z:
  - x: 20
    "y": 30
- c: 2
  z:
  - x: 3
- - 3
  - 4
b: 30

But when I change the order of the elements in the first file, I get an unexpected result:

---
a:
- c: 1
  z:
  - x: 20
    "y": 25
- [3, 4]
- c: 2
  z:
  - x: 2
b: 30
---
a:
- c: 1
  z:
  - x: 20
    "y": 30
- c: 2
  z:
  - x: 3

The result:

a:
- c: 1
  z:
  - x: 20
    "y": 30
- c: 2
  z:
  - x: 3
- c: 2
  z:
  - x: 2
b: 30

sorenisanerd commented 2 years ago

With no explicit operator, spruce does an (( inline )) merge of arrays. You can read about it and its perils when dealing with arrays of maps here:

https://github.com/geofffranks/spruce/blob/master/doc/array-merging.md#operators-that-work-the-same-for-either-type-of-array

Briefly, a (an array) gets merged by merging each of its elements with the element at the same index in the second document. This works fine for the first example, because the first element of a in the first doc is a map that merges well with the first element of a in the second doc. The same goes for the second element. Only one document has a third element in the a array, so that merge is straightforward.

In your second example, spruce actually attempts to merge [3, 4] with:

- c: 2
  z:
  - x: 3

which, frankly, I'm not sure what it should do with.
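
The index-wise merge described above can be modelled in a few lines of Python. This is only an illustrative sketch, not spruce's code (spruce is written in Go). The function name `inline_merge` is hypothetical, and the rule that the second document wins when the two values have different types is an assumption chosen because it reproduces the output reported in this issue.

```python
def inline_merge(base, overlay):
    """Toy model of an index-wise ("inline") merge; NOT spruce's implementation."""
    if isinstance(base, dict) and isinstance(overlay, dict):
        # Maps: merge key by key, recursing where both sides have the key.
        merged = dict(base)
        for key, value in overlay.items():
            merged[key] = inline_merge(base[key], value) if key in base else value
        return merged
    if isinstance(base, list) and isinstance(overlay, list):
        # Arrays: pair elements up by position, not by content.
        merged = []
        for i in range(max(len(base), len(overlay))):
            if i < len(base) and i < len(overlay):
                merged.append(inline_merge(base[i], overlay[i]))
            elif i < len(base):
                merged.append(base[i])
            else:
                merged.append(overlay[i])
        return merged
    # Scalars or mismatched types (e.g. [3, 4] vs. a map):
    # ASSUMPTION: the value from the second document wins.
    return overlay


# The second (reordered) example from this issue.
first = {
    "a": [{"c": 1, "z": [{"x": 20, "y": 25}]}, [3, 4], {"c": 2, "z": [{"x": 2}]}],
    "b": 30,
}
second = {"a": [{"c": 1, "z": [{"x": 20, "y": 30}]}, {"c": 2, "z": [{"x": 3}]}]}
result = inline_merge(first, second)
```

Under that assumption, `[3, 4]` at index 1 is overwritten by the `c: 2` map from the second document, and the original `c: 2` map at index 2 has no counterpart and is kept unchanged. That is how two `c: 2` entries end up in the result.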

dikderoy commented 1 year ago

On that topic: the default strategy for merging arrays is very subjective.

spruce provides a --fallback-append option, but it would be better to allow choosing among more default strategies instead of only "merge, and if that fails, inline".

I would've preferred to have

--array-merge-strategy=[merge-inline|inline|prepend|append|replace]

as a flag that I could pass to specify default behavior explicitly (if no array-merge operator is present).

For context: I use spruce to transform Helm configs. Array merging in spruce differs from how Helm handles multiple -f flags (Helm effectively uses replace). If I could specify the default, the behavior would be clearer for my users, who are unaware that spruce exists and only write YAML documents and delta changes to them. Instead, I have to tell them to always specify the ((replace)) operator, which also makes spruce non-transparent in the pipeline: those files cannot be used without first processing them with spruce.
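
To make the difference between the proposed strategies concrete, here is a rough Python sketch of what each one would do to a pair of arrays. The `--array-merge-strategy` flag is only a proposal in this thread and does not exist in spruce. The function `merge_arrays` is hypothetical, `inline` is simplified to overwrite by index without recursing into elements, and `merge-inline` (merging by key) is left out.

```python
def merge_arrays(base, overlay, strategy="inline"):
    """Hypothetical illustration of the proposed default strategies."""
    if strategy == "replace":
        # The second document's array wins outright
        # (described in this thread as Helm's effective behavior).
        return list(overlay)
    if strategy == "append":
        return list(base) + list(overlay)
    if strategy == "prepend":
        return list(overlay) + list(base)
    if strategy == "inline":
        # Simplified: overlay elements overwrite base elements by index,
        # and any extra base elements are kept.
        return list(overlay) + list(base[len(overlay):])
    raise ValueError(f"unknown strategy: {strategy}")


base, overlay = [1, 2, 3], [9, 8]
results = {s: merge_arrays(base, overlay, s)
           for s in ("replace", "append", "prepend", "inline")}
# replace -> [9, 8]
# append  -> [1, 2, 3, 9, 8]
# prepend -> [9, 8, 1, 2, 3]
# inline  -> [9, 8, 3]
```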