angelsolaorbaiceta / fe-fwk-book

Learn how frontend frameworks work by building your own. Code for the book "Build a frontend framework from scratch", published by Manning.
http://mng.bz/aM2o
MIT License
121 stars 27 forks source link

fix (runtime): `arraysDiff()` with duplicates #263

Closed angelsolaorbaiceta closed 3 months ago

angelsolaorbaiceta commented 3 months ago

arraysDiff() handles duplicate entries

Closes #260 .

As noted by @terrence-ou , the current implementation of the arraysDiff() function doesn't deal with duplicate values. For example:

const oldArray = []
const newArray = [1, 1]

arraysDiff(oldArray, newArray)

Would incorrectly return:

{
  added: [1],
  removed: [],
}

When the correct solution would have been that two 1s were added:

{
  added: [1, 1],
  removed: [],
}

This error didn't surface earlier because, the only use of the arraysDiff() function is to find the CSS classes that were added/removed from an element. The browser deduplicates the CSS classes added to an element, and thus, this error has no visible effect in the inner working of the framework. It's nevertheless a very interesting problem.

The Problem

The problem is in how the added and removed items are searched for:

That includes() usage stops at the first match, thus missing duplicates.

 Solution

Solving this problem requires to count how many times a given item appears in each array. A count map is created for each array for this purpose:

export function arraysDiff(oldArray, newArray) {
  const oldsCount = makeCountMap(oldArray)
  const newsCount = makeCountMap(newArray)

   ...
}

The makeCountMap() function is straightforward to implement:

export function makeCountMap(array) {
  const map = new Map()

  for (const item of array) {
    map.set(item, (map.get(item) || 0) + 1)
  }

  return map
}

Then, we diff the keys of both maps:

const diff = mapsDiff(oldsCount, newsCount)

[!NOTE] Note that we need to use maps and not objects, because objects don't allow types other than string and Symbol to be the keys. Maps don't impose a type to be used as keys.

The mapsDiff() function implementation is equivalent to the objectsDiff() one, but this time, the keys can be anything and not just strings or Symbols:

export function mapsDiff(oldMap, newMap) {
  const oldKeys = Array.from(oldMap.keys())
  const newKeys = Array.from(newMap.keys())

  return {
    added: newKeys.filter((key) => !oldMap.has(key)),
    removed: oldKeys.filter((key) => !newMap.has(key)),
    updated: newKeys.filter(
      (key) => oldMap.has(key) && oldMap.get(key) !== newMap.get(key)
    ),
  }
}

Then, in the arraysDiff() function, we have not only to see what items were added or removed, but count how many times they were added or removed:

export function arraysDiff(oldArray, newArray) {
  const oldsCount = makeCountMap(oldArray)
  const newsCount = makeCountMap(newArray)
  const diff = mapsDiff(oldsCount, newsCount)

  // Added items repeated as many times as they appear in the new array
  const added = diff.added.flatMap((key) =>
    Array(newsCount.get(key)).fill(key)
  )

  // Removed items repeated as many times as they appeared in the old array
  const removed = diff.removed.flatMap((key) =>
    Array(oldsCount.get(key)).fill(key)
  )

  ...
}

And lastly, for the keys that were updated (items that appear in both arrays, but in a different number) we want to calculate the difference between how many times it appears in the new versus the old array. Based on the result, we include it as addition or removal:

  // Updated items have to check the difference in counts
  for (const key of diff.updated) {
    const oldCount = oldsCount.get(key)
    const newCount = newsCount.get(key)
    const diff = newCount - oldCount

    if (diff > 0) {
      added.push(...Array(diff).fill(key))
    } else {
      removed.push(...Array(-diff).fill(key))
    }
  }

And hence the complete implementation is:

export function arraysDiff(oldArray, newArray) {
  const oldsCount = makeCountMap(oldArray)
  const newsCount = makeCountMap(newArray)
  const diff = mapsDiff(oldsCount, newsCount)

  // Added items repeated as many times as they appear in the new array
  const added = diff.added.flatMap((key) =>
    Array(newsCount.get(key)).fill(key)
  )

  // Removed items repeated as many times as they appeared in the old array
  const removed = diff.removed.flatMap((key) =>
    Array(oldsCount.get(key)).fill(key)
  )

  // Updated items have to check the difference in counts
  for (const key of diff.updated) {
    const oldCount = oldsCount.get(key)
    const newCount = newsCount.get(key)
    const delta = newCount - oldCount

    if (delta > 0) {
      added.push(...Array(delta).fill(key))
    } else {
      removed.push(...Array(-delta).fill(key))
    }
  }

  return {
    added,
    removed,
  }
}

Thanks @terrence-ou for reporting this error! 🤝

terrence-ou commented 3 months ago

Hi @angelsolaorbaiceta

Thanks for the detailed explanation! Your code is an excellent example of practicing SOLID principles. I always learn a lot from your implementations and explanations.

This update perfectly solves the duplication problems. Since the sequence of items in the output arrays is not essential, I updated the testing method to only consider the count of items being added or removed:

  it("array items being removed", () => {
    const oldArr = [1, 1, 2, 2, 1, 4, 3, 4, 5, 1, 1, 5, 7, 8, 9];
    const newArr = [1, 2, 3, 4];
    const diff = arraysDiff(oldArr, newArr);
    expect(makeCountMap(diff.added)).toEqual(new Map([]));
    expect(makeCountMap(diff.removed)).toEqual(
      new Map([
        [1, 4],
        [2, 1],
        [4, 1],
        [5, 2],
        [7, 1],
        [8, 1],
        [9, 1],
      ]),
    );
  });

I have one additional thought regarding the for loop in the arraysDiff(oldArray, newArray) function:

for (const key of diff.updated) {
  const oldCount = oldsCount.get(key)
  const newCount = newsCount.get(key)
  const diff = newCount - oldCount

  if (diff > 0) {
    added.push(...Array(diff).fill(key))
  } else {
    removed.push(...Array(-diff).fill(key))
  }
}

The variable name diff is similar to the object diff we’re looping over. While it won’t break the program or tests, it could potentially cause some confusion. Do you think we could use a name like diffCount to avoid such confusion?

angelsolaorbaiceta commented 3 months ago

Hi @angelsolaorbaiceta

Thanks for the detailed explanation! Your code is an excellent example of practicing SOLID principles. I always learn a lot from your implementations and explanations.

This update perfectly solves the duplication problems. Since the sequence of items in the output arrays is not essential, I updated the testing method to only consider the count of items being added or removed:

  it("array items being removed", () => {
    const oldArr = [1, 1, 2, 2, 1, 4, 3, 4, 5, 1, 1, 5, 7, 8, 9];
    const newArr = [1, 2, 3, 4];
    const diff = arraysDiff(oldArr, newArr);
    expect(makeCountMap(diff.added)).toEqual(new Map([]));
    expect(makeCountMap(diff.removed)).toEqual(
      new Map([
        [1, 4],
        [2, 1],
        [4, 1],
        [5, 2],
        [7, 1],
        [8, 1],
        [9, 1],
      ]),
    );
  });

I have one additional thought regarding the for loop in the arraysDiff(oldArray, newArray) function:

for (const key of diff.updated) {
  const oldCount = oldsCount.get(key)
  const newCount = newsCount.get(key)
  const diff = newCount - oldCount

  if (diff > 0) {
    added.push(...Array(diff).fill(key))
  } else {
    removed.push(...Array(-diff).fill(key))
  }
}

The variable name diff is similar to the object diff we’re looping over. While it won’t break the program or tests, it could potentially cause some confusion. Do you think we could use a name like diffCount to avoid such confusion?

You are very right! I'll rename that variable to make it clearer. Thanks for the feedback.