wmde / Diff

䷂ Library for diffing, patching and representing differences between objects
BSD 3-Clause "New" or "Revised" License
200 stars 15 forks source link

Recursive MapDiffer #98

Open BoShurik opened 6 years ago

BoShurik commented 6 years ago

Hello! I have following example:

$value1 = [
    'name' => 'Name',
    'value' => 30000,
    'clients' => [
        [
            'name' => 'Client name 1',
            'documents' => [
                [
                    'name' => 'Document name 1',
                    'attributes' => [
                        [
                            'name' => 'Attribute 1',
                        ],
                    ],
                ],
            ],
        ],
    ],
];

$value2 = $value1;
$value2['name'] = 'New name';
$value2['clients'][0]['name'] = 'Client name 1!';
$value2['clients'][0]['documents'][0]['name'] = 'Document name 1!';
$value2['clients'][0]['documents'][0]['attributes'][0]['name'] = 'Attribute 1!';

To make it work I either should know depth

$differ = new \Diff\Differ\MapDiffer(
    true,
    new \Diff\Differ\MapDiffer(
        true,
        new \Diff\Differ\MapDiffer(/** ... */)
    )
);

or use reflection to instantiate $differ

$differ = new \Diff\Differ\MapDiffer(true);
$reflectionClass = new ReflectionClass(\Diff\Differ\MapDiffer::class);
$reflectionProperty = $reflectionClass->getProperty('listDiffer');
$reflectionProperty->setAccessible(true);
$reflectionProperty->setValue($differ, $differ);

Looks like we need to add a setter for $listDiffer or change constructor to

public function __construct( bool $recursively = false, Differ $listDiffer = null, ValueComparer $comparer = null ) {
    $this->recursively = $recursively;
    $this->listDiffer = $listDiffer ?? $this;
    $this->valueComparer = $comparer ?? new StrictComparer();
}
BoShurik commented 6 years ago

I can make a PR, but I need to know what is appropriate way to solve the problem

JeroenDeDauw commented 6 years ago

Thanks for reporting this issue. What version of Diff are you using?

It seems like you are right and that at the moment you can indeed not recursively diff a structure with both lists and maps at different levels.

The first workaround you describe looks like it is going against the intent of the code. You are passing in a MapDiffer where a $listDiffer is expected. Not entirely sure since this is a long time since I wrote that, and the difference between handling of maps and lists does not seem well designed.

Modifying the constructor as you suggested might work for your use case but does alter the default behavior for other usecases, so is not viable. Adding the setter could work, though this would then likely go against the original intent and introduce mutability of used services, which is also not nice design wise. I'll have a closer look at this soonish and see if I can come up with a third option. If you have any ideas, please share them.

JeroenDeDauw commented 6 years ago

Now had a closer look. You are trying to do something that Diff was not really designed to do.

Diff compares "maps" (arrays with keys) by key, meaning that elements in maps can change. In case of "lists" (arrays with only numeric keys), only the values are compared, meaning you can only have additions and removals. In your case you have "lists" that you want to have treated as maps.

One thing that you can do is turn your "lists" into "maps". You could cast all the integer keys into strings. Then when you use MapDiffer, you will get the behavior you expect out of the box.

JeroenDeDauw commented 6 years ago

Not obvious how to make MapDiffer do what you want without changing your input or making bigger changes. One thing that could be done that does not change the default behavior or introduce mutability is adding a static construction method like this one:

public static function newRecursiveMapDiffer( ValueComparer $comparer = null ): self {
    $differ = new self( true, null, $comparer );
    $differ->listDiffer = $differ;
    return $differ;
}

The problem with that (and with the other 2 approaches) is that this line ends up being wrong:

image

Since you now have an associative diff, and telling the Diff class it is not. Apparently there is no sanity check for that at the moment that throws an exception, though if things are refactored in the future, this could definitely break.

Is it the case you really do not know the structure of the data you are diffing and know that doing an associative diff for lists is always correct? (That is another different in usecase: Diff was designed for cases where the structure is known.)

thiemowmde commented 6 years ago

you can only have additions and removals.

@BoShurik, you might want to have a look at #65, which I believe tackles parts of the exact problem you run into.

BoShurik commented 6 years ago

What version of Diff are you using?

The latest version (3.1.0)

One thing that you can do is turn your "lists" into "maps". You could cast all the integer keys into strings. Then when you use MapDiffer, you will get the behavior you expect out of the box.

Yes, it works, but for data like

$value1 = [
    'name' => 'Name',
    'value' => 30000,
    'clients' => [
        'item0' => [
            'name' => 'Client name 1',
            'documents' => [
                'item0' => [
                    'name' => 'Document name 1',
                    'attributes' => [
                        'item0' => [
                            'name' => 'Attribute 1',
                        ],
                    ],
                ],
            ],
            'tags' => ['tag1', 'tag2', 'tag3'],
        ],
    ],
];

As '0' converted to 0

The problem with that (and with the other 2 approaches) is that this line ends up being wrong

return new Diff( $this->listDiffer->doDiff( $old, $new ), $this->listDiffer instanceof MapDiffer);

This may help, but it looks very hacky

JeroenDeDauw commented 6 years ago

Indeed, that is hacky. When I was looking at that I was disappointed to find that there is no way to tell if you are dealing with a "map differ" or not. The instanceof check is not only hacky, it will also fail when another Differ is used that is a "map differ". I opened an issue about this and related problems as https://github.com/wmde/Diff/issues/100

As '0' converted to 0

Huh? Where this this conversion happen? Not in this library I hope? I think simply casting the array keys to strings should work. This should be treated as a map:

[
    '0' => 'foo',
    '1' => 'bar',
    '2' => 'baz',
]

All that is needed is a SINGLE non-integer key. So this will also do the trick:

[
    'foo',
    'bar',
    'baz',
    'hack' => true
]
BoShurik commented 6 years ago

Huh? Where this this conversion happen? Not in this library I hope? I think simply casting the array keys to strings should work.

It's a php issue

Strings containing valid decimal integers, unless the number is preceded by a + sign, will be cast to the integer type. E.g. the key "8" will actually be stored under 8. On the other hand "08" will not be cast, as it isn't a valid decimal integer.

JeroenDeDauw commented 6 years ago

omg php >_> Just verified that with https://3v4l.org/RvE3K