Open lionel-rowe opened 1 week ago
PRs that improve the performance of assertEquals()
for this scenario are welcome.
Two possible directions to explore:
const aEntries = [...a.entries()].sort();
const bEntries = [...b.entries()].sort();
for (const [idx, [aKey, aValue]] of aEntries.entries()) {
const [bKey, bValue] = bEntries[idx]!;
if (!compare(aKey, bKey) || !compare(aValue, bValue)) {
return false;
}
}
return true;
This passes for all current tests but I think it'll fail for more complex examples as sort()
just stringifies its args and doesn't modify order if the strings are equal (so [{ b: 2 }, { a: 1 }].sort()
is [{ b: 2 }, { a: 1 }]
because `"[object Object]" === "[object Object]").
const keySet = new Set<unknown>();
const valSet = new Set<unknown>();
for (const input of [a, b]) {
for (const [key, value] of input.entries()) {
keySet.add(key);
valSet.add(value);
}
}
return keySet.size === a.size && valSet.size === a.size;
This fails some of the current tests as sets are only unique by reference, not value.
Basically, the problem in both cases is not having a unique-by-value, reference-equal, serialized form.
I made this comparison with Lodash isEqual(). The difference is not as big I expected. Lodash code looks hacky.
import "https://raw.githubusercontent.com/lodash/lodash/4.17.10-npm/lodash.js";
import { assertEquals } from "jsr:@std/assert";
const arr1 = Array.from({ length: 10_000 }, (_, i) => i)
const arr2 = [...arr1]
const set1 = new Set(arr1)
const set2 = new Set(arr2)
let result: boolean
console.time('compare set std')
assertEquals(set1, set2);
console.timeEnd('compare set std')
console.time('compare set lodash')
result = _.isEqual(set1, set2)
console.timeEnd('compare set lodash')
console.log(result)
❯ deno run -A set.ts
compare set std: 616ms
compare set lodash: 317ms
true
A partial fix would be to have a fast path for cases where all keys of both are primitives, which would certainly have worked for the use case where I discovered this limitation, and maybe for most use cases where you'd be likely to want to compare large Set
s/Map
s using assertEquals
? Typically where the keys aren't primitive, you don't want to compare keys by value (because equality is determined by reference for purposes of setting/getting/etc).
Talking only about Sets, if all the values in either Set of the comparison has only primitives (you don't even need to check both Sets, just one of them), then you can compare them instantly with:
set1.symmetricDifference(set2).size == 0
I think this optimization is worth it because a Set usually has only primitives or only objects. I believe it's not common that a Set is populated with both primitives and objects. But I could be wrong.
Talking only about Sets, if all the values in either Set of the comparison has only primitives (you don't even need to check both Sets, just one of them), then you can compare them instantly with:
set1.symmetricDifference(set2).size == 0
I think this optimization is worth it because a Set usually has only primitives or only objects. I believe it's not common that a Set is populated with both primitives and objects. But I could be wrong.
Yeah I misspoke, only one set/map needs checking for exclusively primitive keys (because if the other has some non-primitive keys, at least one of the primitive keys must be missing due to having already passed same-size check). Probably worth adding symmetricDifference
as an extra-fast-path though.
Describe the bug
assertEquals
is very slow when comparing largeSet
andMap
objects due to using anO(n^2)
comparison algorithm:https://github.com/denoland/std/blob/7e2616031a2dcd9c405086192e53e39233370952/assert/equal.ts#L87-L99
It looks like there's the same issue in the
expect
version too.Steps to Reproduce
Results:
Results get exponentially worse as the size increases.
Expected behavior
assertEquals
ofSet
and array (orMap
and obj) to take approximately equal time.Environment