DoneDeal0 / superdiff

Superdiff compares two arrays or objects and returns a full diff of their differences in a readable format.
https://www.npmjs.com/package/@donedeal0/superdiff
688 stars 6 forks source link

Marking a property as "unique id" so modified items in array are considered `updated` and not `deleted+added` #14

Closed sneko closed 9 months ago

sneko commented 9 months ago

Hi @DoneDeal0 ,

I'm trying to compare arrays of objects but I would like the status updated to be triggered when some objects are modified. But for this to be, the library should define a property "stable" in the "before array" and into the "after array".

The current state of the library:

    const before = [
      { id: 1, myProp: 1 },
      { id: 2, myProp: 2 },
      { id: 3, myProp: 3 },
    ];
    const after = [
      { id: 2, myProp: 222 },
      { id: 3, myProp: 3 },
      { id: 4, myProp: 4 },
    ];

    const diffResult = getListDiff(before, after);

would produce:

    + Object {
    +   "diff": Array [
    +     Object {
    +       "indexDiff": null,
    +       "newIndex": null,
    +       "prevIndex": 0,
    +       "status": "deleted",
    +       "value": Object {
    +         "id": 1,
    +         "myProp": 1,
    +       },
    +     },
    +     Object {
    +       "indexDiff": null,
    +       "newIndex": null,
    +       "prevIndex": 1,
    +       "status": "deleted",
    +       "value": Object {
    +         "id": 2,
    +         "myProp": 2,
    +       },
    +     },
    +     Object {
    +       "indexDiff": null,
    +       "newIndex": 0,
    +       "prevIndex": null,
    +       "status": "added",
    +       "value": Object {
    +         "id": 2,
    +         "myProp": 222,
    +       },
    +     },
    +     Object {
    +       "indexDiff": -1,
    +       "newIndex": 1,
    +       "prevIndex": 2,
    +       "status": "moved",
    +       "value": Object {
    +         "id": 3,
    +         "myProp": 3,
    +       },
    +     },
    +     Object {
    +       "indexDiff": null,
    +       "newIndex": 2,
    +       "prevIndex": null,
    +       "status": "added",
    +       "value": Object {
    +         "id": 4,
    +         "myProp": 4,
    +       },
    +     },
    +   ],
    +   "status": "updated",
    +   "type": "list",
    + }

Whereas I expect the object with id: 2 to be updated, no deleted + added. Did I miss something and it's already possible?

A workaround I could try is to make a post-processor that looks at results, for similar id having both status removed and added, keep just one of the two and setting the status to updated. Which would work, and would avoid modifying your library.

Note aside: for other methods you implemented a ignoreArrayOrder. I think it could make sense to add it to getListDiff() to get equal instead of moved (example for id: 3 here), but same here, the developer can just take into account those 2 statuses.

Thank you,

sneko commented 9 months ago

To take in account my 2 needs I made a wrapper, this is not perfect because it just patches items and does not do more logic about newIndex/prevIndex.

In case someone is interested:

// comparaison.ts

import { getListDiff as libraryGetListDiff } from '@donedeal0/superdiff';

// This is a custom implementation to fix specific needs (ref: https://github.com/DoneDeal0/superdiff/issues/14)
export const getListDiff: typeof libraryGetListDiff = (...args) => {
  const results = libraryGetListDiff(...args);

  let deletedDiffItems = results.diff.filter((diffItem) => {
    return diffItem.status === 'deleted';
  });

  // Simulate `ignoreArrayOrder` as for some other of the library methods
  // Also infer `updated` status for items
  let tmpDiffItems: typeof deletedDiffItems = [];
  for (const diffItem of results.diff) {
    if (diffItem.status === 'moved') {
      diffItem.status = 'equal';
    } else if (diffItem.status === 'added') {
      // If also deleted we change it to `updated` while removing it from the final `deleted` list
      const correspondingDeletedItemIndex = deletedDiffItems.findIndex((deletedDiffItem) => {
        return !!deletedDiffItem.value.id && deletedDiffItem.value.id === diffItem.value.id;
      });

      if (correspondingDeletedItemIndex !== -1) {
        diffItem.status = 'updated';

        deletedDiffItems.splice(correspondingDeletedItemIndex, 1);
      }
    } else if (diffItem.status === 'deleted') {
      // We add remaining deleted items at the end
      continue;
    }

    tmpDiffItems.push(diffItem);
  }

  results.diff = [...tmpDiffItems, ...deletedDiffItems];

  return results;
};

and the associated tests:

// comparaison.spec.ts

/**
 * @jest-environment node
 */
import { getListDiff as libraryGetListDiff } from '@donedeal0/superdiff';

import { getListDiff } from './comparaison';

describe('getListDiff()', () => {
  it('should recognize same objects based on id in the list', async () => {
    const before = [
      { id: 1, myProp: 1 },
      { id: 2, myProp: 2 },
      { id: 3, myProp: 3 },
    ];
    const after = [
      { id: 2, myProp: 222 },
      { id: 3, myProp: 3 },
      { id: 4, myProp: 4 },
    ];

    const libraryDiffResult = libraryGetListDiff(before, after);

    // The library will by default not recognized `updated` objects due to not considering stable IDs
    expect(libraryDiffResult).toStrictEqual({
      diff: [
        {
          indexDiff: null,
          newIndex: null,
          prevIndex: 0,
          status: 'deleted',
          value: {
            id: 1,
            myProp: 1,
          },
        },
        {
          indexDiff: null,
          newIndex: null,
          prevIndex: 1,
          status: 'deleted',
          value: {
            id: 2,
            myProp: 2,
          },
        },
        {
          indexDiff: null,
          newIndex: 0,
          prevIndex: null,
          status: 'added',
          value: {
            id: 2,
            myProp: 222,
          },
        },
        {
          indexDiff: -1,
          newIndex: 1,
          prevIndex: 2,
          status: 'moved',
          value: {
            id: 3,
            myProp: 3,
          },
        },
        {
          indexDiff: null,
          newIndex: 2,
          prevIndex: null,
          status: 'added',
          value: {
            id: 4,
            myProp: 4,
          },
        },
      ],
      status: 'updated',
      type: 'list',
    });

    const diffResult = getListDiff(before, after);

    // Our own wrapper around the library function should fix those
    expect(diffResult).toStrictEqual({
      diff: [
        {
          indexDiff: null,
          newIndex: 0,
          prevIndex: null,
          status: 'updated',
          value: {
            id: 2,
            myProp: 222,
          },
        },
        {
          indexDiff: -1,
          newIndex: 1,
          prevIndex: 2,
          status: 'equal',
          value: {
            id: 3,
            myProp: 3,
          },
        },
        {
          indexDiff: null,
          newIndex: 2,
          prevIndex: null,
          status: 'added',
          value: {
            id: 4,
            myProp: 4,
          },
        },
        {
          indexDiff: null,
          newIndex: null,
          prevIndex: 0,
          status: 'deleted',
          value: {
            id: 1,
            myProp: 1,
          },
        },
      ],
      status: 'updated',
      type: 'list',
    });
  });
});
DoneDeal0 commented 9 months ago

Salut Thomas,

Thank you for using the lib! I understand your request. The behavior of getListDiff is technically normal because the objects { id: 2, myProp: 2 } and { id: 2, myProp: 222 } are different. I could definitely add a referenceProperty option to consider an object as updated instead of deleted if its reference property - here the id - hasn't changed.

I don't have much time to do this right now, but I'll look into it as soon as possible. This is a good suggestion.

Thank you very much!

DoneDeal0 commented 9 months ago

The fix is out! Let me know if it works for you!

If your company uses the project for commercial purposes, feel free to support it.

Have a great weekend!

sneko commented 9 months ago

Thanks @DoneDeal0 for your reactivity. I will try it out next week.

It's not for commercial use but I appreciate your work, +1 for a coffee ☕

DoneDeal0 commented 9 months ago

Thank you very much!

sneko commented 9 months ago

@DoneDeal0 for your information, there is a mismatch between the release version of https://github.com/DoneDeal0/superdiff/releases/tag/v1.1.1 and https://www.npmjs.com/package/@donedeal0/superdiff?activeTab=versions.

sneko commented 9 months ago

@DoneDeal0 just tried the release, with my test above your new version is telling the object with id: 2 is moved whereas it should tell it's updated (or it's too subjective, but in my case I prefer to know an object is modified over it has a new position in the list).

DoneDeal0 commented 9 months ago

Please check this commit . It adds an option considerMoveAsUpdate to return an updated status instead of moved if you so desire.

You're right, there was a mismatch between versions on npm. v1.1.1 is now out. Sorry for the inconvenience.

sneko commented 9 months ago

@DoneDeal0 sorry if I have not been clear explaining the initial case. To give more context, when managing a datastore it's common to manage multiple items in a single endpoint input.

I'm in the case where I get a CSV file regularly, and I need to synchronize it into my database. Depending on the previous version into the datastore and the new version you have to set, you will delete some in the database, update some (thanks to referenceProperty you implemented), and create some. The rest is just "idle/equal" and has no reason to result in an action (I'm fine to consider in my case equal/moved the same because the order does not matter in most cases.

The considerMoveAsUpdate will just transform all moved into updated whereas in the workaround code I made https://github.com/DoneDeal0/superdiff/issues/14#issuecomment-1898426766 the updatedwas deduced by an item having the same referenceProperty but with modifying data on the rest of the object.

I'm fine with staying with my workaround, I just wanted to clarify (and so, to me the current considerMoveAsUpdate could even not exist).

It's maybe too specific to the case of databases and I can hear it has no reason to be managed in this library, sorry for extending the thread 😄

EDIT: "too" specific because since an item can be both moved in position AND/OR updated in state at the same time, or even more equal in state AND/OR moved in position. It depends on how people prefer to manage those... should those pair be "mixed" with one having prevalence on the other (e.g. updated overwrites moved if both true), or should an item an status array to keep track of both... It's definitely a thought to have depending on the context.