josephburnett / jd

JSON diff and patch
MIT License

Diff arrays #10

Closed: antonmedv closed this issue 5 years ago

antonmedv commented 5 years ago

What about deleting from the beginning of an array?

[1,2,3,4,5,6,7,8,9]
[2,3,4,5,6,7,8,9]

Why does the diff look as if every element was edited?

drjonnicholson commented 5 years ago

Surely that's right as a diff, since the order has changed (e.g. 1 was removed and 2 was added in its place)

If the order doesn't matter you could try the -set option?
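
To illustrate why an index-by-index comparison reports every position as changed, here is a minimal Python sketch. It is not jd's actual implementation; `positional_diff` and `MISSING` are hypothetical names used only for this example.

```python
# Hypothetical sketch, not jd's implementation: compare two lists index by index.
MISSING = object()  # sentinel meaning "no element at this index"


def positional_diff(a, b):
    """Return (index, old, new) for every index where the two lists differ."""
    changes = []
    for i in range(max(len(a), len(b))):
        old = a[i] if i < len(a) else MISSING
        new = b[i] if i < len(b) else MISSING
        if old != new:
            changes.append((i, old, new))
    return changes


a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [2, 3, 4, 5, 6, 7, 8, 9]
changes = positional_diff(a, b)
print(len(changes))  # number of indexes that differ
```

Removing the first element shifts every later element down by one, so each index now holds a different value and the last index has no counterpart at all.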

antonmedv commented 5 years ago

What is the difference between a set and a multiset?

drjonnicholson commented 5 years ago

So in mathematics, each item in a set (-set) can occur only once, whereas a bag/multiset (-mset) allows the same item to appear multiple times.

So your example of [1,2,3,4,5,6,7,8,9], when treated as a set is equivalent to [9,8,7,6,5,4,3,2,1]. So the difference between [1,2,3,4,5,6,7,8,9] and [2,3,4,5,6,7,8,9] when treated as sets is 1 being removed.

Quick examples: let's say I have a.json containing [1,1,2], and b.json containing [2,1]. Here is the output of a few different options:

jd 'a.json' 'b.json'
@ [0]
- 1
+ 2
@ [2]
- 2

jd -set 'a.json' 'b.json'
# No diff, because when treated as sets [1,1,2] is equivalent to [1,2], which is equivalent to [2,1]

jd -mset 'a.json' 'b.json'
@ [{}]
- 1

Hope that helps
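
The set and multiset results above can be mimicked in a few lines of Python. This is only a sketch of the mathematical idea, assuming hashable elements such as numbers; `set_diff` and `multiset_diff` are hypothetical helpers, not part of jd.

```python
from collections import Counter


def set_diff(a, b):
    """Treat both lists as sets: order and duplicates are ignored."""
    removed = sorted(set(a) - set(b))
    added = sorted(set(b) - set(a))
    return removed, added


def multiset_diff(a, b):
    """Treat both lists as multisets: order is ignored, counts still matter."""
    count_a, count_b = Counter(a), Counter(b)
    removed = sorted((count_a - count_b).elements())
    added = sorted((count_b - count_a).elements())
    return removed, added


a = [1, 1, 2]
b = [2, 1]
print(set_diff(a, b))       # nothing removed or added
print(multiset_diff(a, b))  # one copy of 1 removed
```

As sets the two lists are equal, while as multisets they differ by a single 1, which matches the -set and -mset output shown above.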

josephburnett commented 5 years ago

Thanks @drjonnicholson for the explanation.

jd could be smarter about producing a higher-level diff when all the elements are shifted. For example, instead of:

@ [4]
- 5
@ [3]
- 4
+ 5
@ [2]
- 3
+ 4
@ [1]
- 2
+ 3
@ [0]
- 1
+ 2

we could produce:

@ []
- [1,2,3,4,5]
+ [2,3,4,5]
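
One way such a fallback could work is sketched below in Python. This is purely illustrative and not how jd is implemented: the `list_diff` function, the `(path, removed, added)` hunk shape, and the 0.5 threshold are all assumptions made for this example.

```python
# Hypothetical sketch: fall back to a single whole-list hunk when most
# positions differ. Hunks are (path, removed_values, added_values).
def list_diff(a, b, threshold=0.5):
    hunks = []
    longest = max(len(a), len(b))
    for i in range(longest):
        removed = [a[i]] if i < len(a) else []
        added = [b[i]] if i < len(b) else []
        if removed != added:
            hunks.append(([i], removed, added))
    # If more than `threshold` of the positions changed, one hunk at the
    # root path is easier to read than many element-level hunks.
    if len(hunks) > threshold * longest:
        return [([], [a], [b])]
    return hunks


shifted = list_diff([1, 2, 3, 4, 5], [2, 3, 4, 5])
single = list_diff([1, 2, 3], [1, 2, 4])
print(shifted)
print(single)
```

With the shifted lists all five positions differ, so the sketch returns one hunk replacing the whole list; when only the last element changes, it keeps the single element-level hunk.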