tc39 / proposal-collection-normalization


Set of elements with custom equality #18

Closed bergus closed 3 years ago

bergus commented 4 years ago

I think currently this proposal is missing out on a very common use case that should be in its scope: storing elements in a collection that is keyed by a custom uniqueness property. Basically a database relation with a primary key.

The discussion in #5, #6 and #7 (which led to the "When are the normalization steps applied?" FAQ entry) made it clear that the coercion functions normalise the arguments of all collection methods and change the values that are actually stored. As @domenic put it:

I realize that this probably conflicts with some other uses cases in this proposal, of allowing a set to still "have" a complicated object even if it's "keyed by" a simple primitive.

While I definitely see the benefits of having normalisation (mapping, and even filtering by throwing exceptions) built into a Map/Set instance, rather than wrapping every interaction where the collection is created or mutated, a way to keep the values intact but key them by a custom equality is still missing. The array unique proposal has the same motivation, and the community has been asking for this for years.

Here is how I'd imagine this working:

const persons = new Set(null, {
  coerceValue(x) { if (x instanceof Person) return x; else throw new TypeError("expected a Person"); },
  coerceKey(x) { if (x instanceof Person) return x.email; else if (typeof x == "string") return x; },
});

persons.add(new Person("jd@example.com", "Jane Doe"));
persons.add(new Person("smith@example.edu", "R. Smith"));
persons.add(new Person("jd@example.com", "John Doe"));

console.log(persons.has(new Person("smith@example.edu", undefined))); // true
console.log(persons.has("jd@example.com")); // true
console.log(persons.has(5)); // false

console.log(...persons) // {Jane Doe <jd@example.com>}, {R. Smith <smith@example.edu>}

Notice that this is still a Set, not a personsByEmail Map. It has an add method and no get or set methods. Its .keys() iterator would still return the elements, like .values() and [Symbol.iterator](). The key that the elements are compared (and can be hashed) by is internal only: it is stored in the [[SetData]] but not exposed in entry tuples.

This could of course be implemented using a Map, but the Map interface is not ergonomic in this case, where we don't want a lookup data structure but a unique-constrained collection. Inserting values would require .set(x, x), and the map (or its coercion hooks) could not guard against someone doing .set(x, y). Wrapping or subclassing it has the same drawbacks as outlined in the readme.
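For comparison, the semantics described above can be roughly approximated in userland today. The following is only a sketch under assumptions: KeyedSet, keyOf and the minimal Person class are hypothetical names made up for illustration, not anything the proposal defines. It keeps the key internal, stores the original elements untouched, and lets the first element with a given key win.

```javascript
// Hypothetical userland approximation. KeyedSet, keyOf and Person are
// illustrative names only; none of them is part of the proposal.
class Person {
  constructor(email, name) {
    this.email = email;
    this.name = name;
  }
}

class KeyedSet {
  #items = new Map(); // internal key -> stored element (the key is never exposed)
  #keyOf;

  constructor(keyOf, iterable = []) {
    this.#keyOf = keyOf;
    for (const item of iterable) this.add(item);
  }

  add(item) {
    const key = this.#keyOf(item);
    if (key === undefined) throw new TypeError("element has no key");
    // The first element with a given key wins; later duplicates are ignored.
    if (!this.#items.has(key)) this.#items.set(key, item);
    return this;
  }

  has(item) {
    return this.#items.has(this.#keyOf(item));
  }

  delete(item) {
    return this.#items.delete(this.#keyOf(item));
  }

  get size() {
    return this.#items.size;
  }

  // Like a Set, keys(), values() and the default iterator all yield elements.
  values() {
    return this.#items.values();
  }

  keys() {
    return this.values();
  }

  [Symbol.iterator]() {
    return this.values();
  }
}

const persons = new KeyedSet((x) => {
  if (x instanceof Person) return x.email;
  if (typeof x === "string") return x;
  return undefined; // anything else never matches a stored key
});

persons.add(new Person("jd@example.com", "Jane Doe"));
persons.add(new Person("smith@example.edu", "R. Smith"));
persons.add(new Person("jd@example.com", "John Doe")); // same key, ignored

const names = [...persons].map((p) => p.name);
```

This works, but it is exactly the kind of wrapper the readme argues against: it is not a real Set, so instanceof checks and code that expects the built-in Set interface will not accept it.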

Regarding the name of the hook, coerceKey seems fine since these Sets actually have internal keys and it's consistent with the Map hooks, but I'm also open to bikeshedding something new (getKey, keyBy, toKey, uniqueBy).

bmeck commented 3 years ago

This seems like a good but different feature. Giving each collection its own notion of identity for comparison purposes is still needed even when values are normalized, e.g. a map that always coerces to strings may also want to do case-insensitive comparison.
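To make the distinction concrete, here is a small sketch of a map that normalizes keys to strings (which changes what is stored) and separately compares them case-insensitively (which changes only identity). CaseInsensitiveMap is a hypothetical name used for illustration, and keeping the first stored spelling of a key is an assumption, not something specified anywhere in this thread.

```javascript
// Hypothetical illustration; CaseInsensitiveMap is not part of any proposal.
class CaseInsensitiveMap {
  #entries = new Map(); // comparison key -> [stored key, value]

  set(key, value) {
    const stored = String(key); // normalization: this is what gets stored
    const cmp = stored.toLowerCase(); // comparison identity: never exposed
    const existing = this.#entries.get(cmp);
    // Assumption: keep the first stored spelling of the key, update the value.
    this.#entries.set(cmp, [existing ? existing[0] : stored, value]);
    return this;
  }

  get(key) {
    const entry = this.#entries.get(String(key).toLowerCase());
    return entry ? entry[1] : undefined;
  }

  has(key) {
    return this.#entries.has(String(key).toLowerCase());
  }

  *entries() {
    // Yields [stored key, value] pairs, not the lowercased comparison keys.
    yield* this.#entries.values();
  }
}

const headers = new CaseInsensitiveMap();
headers.set("Content-Type", "text/html");
headers.set("content-type", "application/json"); // same identity, value replaced
headers.set(42, "answer"); // normalized to the string "42"
```

Here normalization and comparison are two separate steps: String(key) decides what iteration exposes, while toLowerCase() only decides which entries count as the same.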