avian2 / jsonmerge

Merge a series of JSON documents.
MIT License
214 stars 25 forks source link

Feature request : unique merge strategy for lists #24

Closed jggc closed 8 years ago

jggc commented 8 years ago

I have a use case similar to this one :

base = {
    "foo" : [1,2,3,999]
}

head = {
    "foo" : [2,4,5,999]
}

And I'd like a result like this :

{
    "foo" : [1,2,3,4,5,999]
}

But instead with the default strategy I get

{
    "foo": [2, 4, 5, 999]
}

And with append I get

{
    'foo': [1, 2, 3, 999, 2, 4, 5, 999]
}

Which is the expected behavior in both cases, but doesn't fit my needs.

That kind of merge can be done in python like that :

> list1 = [1,2,3,999]
> list2 = [2,4,5,999]
> listMerge = list(set(list1 + list2))
> listMerge
[1, 2, 3, 4, 5, 999]

I guess the conversion to a set and then bat to a list is expensive but for my use case where I have under 10 elements in the merged list there is no visible performance cost.

If you think it's a good idea I will gladly implement it and do a pull request.

avian2 commented 8 years ago

Hi

I think your use case is already covered in a more general way by the arrayMergeById strategy. You can set the idRef option to point to the array item itself (instead of some object property lower down) and it will work fine with just an array of integers.

For example, this code:

import jsonmerge
import pprint

base = {
    "foo" : [1,2,3,999]
}

head = {
    "foo" : [2,4,5,999]
}

schema = {
    "properties": {
        "foo": {
            "mergeStrategy": "arrayMergeById",
            "mergeOptions": {"idRef": "/"},
        }
    }
}

merger = jsonmerge.Merger(schema)
base = merger.merge(base, head)

pprint.pprint(base)

Prints out:

{'foo': [1, 2, 3, 999, 4, 5]}

Is this what you were looking for?

jggc commented 8 years ago

Well, for my use case yes it totally works! I didn't think of using / as an idRef.

But maybe for some other people the sorting would be useful? And looking at the implementation of arrayMergeById I think that performance is better with my solution for this specific use case but I'm not really sure.

But these considerations are out of my scope, I let you judge what you want to do with this feature request, I can still implement it if you think it's worth it. If not, just close this issue.

Thanks for your help.

avian2 commented 8 years ago

I agree it's not obvious that / works. I'll add a note to README to point that out.

Current README doesn't define how array items are ordered when using arrayMergeById. With current code, new items are always appended in the end. Maybe this should be documented as well.

Yes, performance of arrayMergeById probably isn't great. But making a specific version of it optimized for integers and strings feels like a premature optimization. As you say, you don't have a problem with it. I don't think it's worth adding extra code that needs to be maintained.

jggc commented 8 years ago

Yeah I agree it's probably premature to add this functionnality. Closing this.