tc39 / proposal-upsert

ECMAScript Proposal, specs, and reference implementation for Map.prototype.upsert
https://tc39.es/proposal-upsert/
MIT License
204 stars 14 forks source link

Update Proposal from Feedback #21

Closed bmeck closed 1 month ago

bmeck commented 4 years ago

Following a meeting to try and understand the underlying concerns from Mozilla I think we should move the API to focus on insertion but allow for co-location of updating. Following the DOM API design for .addEventListener can take an options bag to define the handler I think the following API would be sufficient:

type MapEmplaceOptions<V> = {
  update?(existing: V): V
  insert(): V
};
type Map<Key, Value> {
  /**
   * Call signature with a options bag handler
   */
  emplace(key: Key, handler: MapEmplaceOptions<Value>): Value;
}

emplace might be a naming bike shed hazard but it fairly well matches C++ and I cannot think of a simple name that would improve upon readability.

Such an implementation ~=

class Map<Key, Value> {
  // ...
  /**
   * @param {Key} name
   * @param {{update?(existing: Value): Value, insert(): Value} | {():Value}} handler
   */
  emplace(name, handler) {
    if (this.#elements.has(name)) {
      if ('update' in handler) {
        const existing = /** @type {Value} */(this.#elements.get(name));
        const ret = handler.update(existing);
        this.#elements.set(name, ret);
        return ret;
      }
      return this.#elements.get(name);
    } else {
      // intention is to error rather than return `undefined`
      const ret = handler.insert();
      this.#elements.set(name, ret);
      return ret;
    }
  }
}

Usage would be like:

const counters = new Map();
function tick(id) {
  return counters.emplace(id, {
    insert: () => 0,
    update: n => n + 1
  });
}

Which would have improved readability vs an insert-only approach to this use case:

const counters = new Map();
function tick(id) {
  // use -1 due to not having a separated state
  const n = counters.insert(id, () => -1) + 1;
  counters.set(id, n);
  return n;
}

It would also have improved readability vs an update and error if missing approach:

const counters = new Map();
function tick(id) {
  // variable has to be hoisted due to error as signal needing new scope
  let n;
  try {
     n = counters.update(id, n => n + 1);
  } catch (e) {
     // this is fragile detection (cross Realm/might have to be string based/forgeable), only want to catch missing key not errors from handler
     if (e instanceof MissingKeyError) {
       n = 0;
       counters.set(id, n);
     } else {
       // loses stack traces in engines
       throw e;
     }
  }
  return n;
}
friendlyanon commented 4 years ago

If you would like to model the behaviour of addEventListener, then please do take care that the methods aren't called without a receiver:

window.addEventListener("custom-event", {
  prefix: "hello",

  handleEvent(event) {
    console.log("%s %s", this.prefix, event.detail);
  },
});

window.dispatchEvent(new CustomEvent("custom-event", { detail: "world" }));
// The above line will cause "hello world" to be printed

The implementation shown as an example would not work in the same way.
As an example, please see the example below that one would expect to work:

class GroupByEmplacer {
  #value;

  setValue(value) {
    this.#value = value;
    return this;
  }

  update(group) {
    group.push(this.#value);
    return group;
  }

  insert() {
    return [this.#value];
  }
}

function groupBy(iterable, key) {
  const keyMapper = typeof key === "function" ? key : x => x[key];
  const emplacer = new GroupByEmplacer();
  const map = new Map();

  for (const value of iterable) {
    const mappedKey = keyMapper(value);

    map.emplace(mappedKey, emplacer.setValue(value));
  }

  return map;
}

While this can be trivially restructured to not depend on the methods being called with a receiver, I see no utility in calling them without receivers.

bmeck commented 4 years ago

Seems fine here, just was personal style to drop receivers since when you pass a function in it gains implicit access to the object via this which can do things like mutate the options bag if it isn't created for every call site. I don't see any practical need to guard the options bag unlike things we saw in PrivateName discussions. Amended pseudoimpl

ljharb commented 4 years ago

It seems very strange to me that handler is required, and must either be a function or an object with 1 or 2 magic function properties. Why is that a simpler API than (key, insert[, update)?

(Separately, I find "emplace" to be a hugely confusing name as a non-C++ dev, but that bikeshed likely isn't worth having until the semantics and API are largely settled)

bmeck commented 4 years ago

@ljharb a decent amount of the feedback was of 2 categories:

  1. there is not precedent of a 2 callback API so it is seen as odd
  2. the position determining the ordering makes it hard to determine which callback does what. Easier to name things.

Per an insert operation being required, this is to deal with existing errors you see when checking JS code:

let map; // Map<string, string>
let x; // string
if (map.has(k)) {
  x = map.get(k);
} else {
  x = {};
  map.set(k, x);
}

Will actually give warnings/errors from tools since has/get not being atomic means that in theory x could be undefined. My recommendation is to use get if you don't need to ensure the item is in the map or you can craft a different composition of methods if you need some other workflow.

bmeck commented 4 years ago

@ljharb to be clear it is a blocker that we not use functions as 2 parameters.

ljharb commented 4 years ago

In that case, why not require one object with all the functions, rather than accepting the bare function form? That way the semantics of "one function" are similarly clear.

bmeck commented 4 years ago

@ljharb seems fine to me but convenient to have a shorter hand, idk about feedback from others if that would be problematic to require it to always be an object.

ljharb commented 4 years ago

If convenience is the goal then I'd expect two functions :-) if readability is more important (and if the 2 function form is somehow deemed unreadable), then I'd expect convenience to take a backseat.

bmeck commented 4 years ago

@ljharb I've updated the design to not allow a single function and now must be an options bag.

ljharb commented 4 years ago

My remaining feedback:

  1. the callback should be eagerly looked up from the options object, just like resolve is eagerly looked up from the constructor in all the Promise combinators (which was motivated by implementor optimization desires)
  2. the callback should be invoked with an undefined receiver or the global, just like Promise callbacks are (also, like all array callbacks are, when no thisArg is provided).
bmeck commented 4 years ago

@ljharb given implementor feedback precedent for Proxy and https://github.com/tc39/proposal-collection-normalization/issues/15#issuecomment-515337415 I don't think we should mandate either of those designs, and I won't be changing the behavior prior to plenary. It seems fine to discuss in committee and if no objections occur we can change them then.

gibson042 commented 4 years ago

Review feedback:

  1. I would echo the need for an explicit decision regarding eager vs. lazy lookup of update and insert.
  2. What should emplace do when the handler argument is a non-null non-undefined primitive? There's precedent at least for boxing into Object (e.g., Object.assign) and for throwing (e.g., Object.defineProperty and Proxy), so I think either approach is defensible. There's also the sub-question of whether to allow or reject null (for the record, the ECMA 262 functions seem to treat null like undefined but Intl functions throw on null via ToObject but box other non-undefined primitives).
  3. Should the handler functions receive a pre- or post-normalization key? For example, does (new Map()).emplace(-0, { insert: key => Object.is(key, -0) }) return true or false?
ljharb commented 4 years ago
  1. The precedent set by the Promise combinators (with Promise.resolve) as well as most other algorithms suggests to me that "eager" is the correct approach.
  2. I would expect a non-undefined argument to use ToObject, or RequireObjectCoercible - ie, null should throw. There's a precedent in ES for an optional argument of type X to allow undefined or X, but otherwise throw (for example, all the places an optional callback is accepted will throw on any non-undefined non-function). In this case, it's meant to be an object, so any non-undefined non-object should throw.
bmeck commented 4 years ago

🚫 Blocking🚫

We seem to have a variety of places in spec that do both eager or lazy grabbing of arguments. I'd note that in this case the execution of these paths for insert and update are mutually exclusive. The only thing at a glance that has the same kind of mutual exclusivity for a method is how iterators conditionally call iter.throw() or iter.next() when interacting with yield. It would be good to decide on this pattern and document it before moving forward with this proposal.

It seems the spec currently has a variety of scenarios but very few with mutual exclusive invocation of callbacks. I'd note that if this is eager it may invoke a MOP proxy handler that is not used. We have heard some criticism that any property lookup is potentially to expensive, so I'd prefer lazy given that they are mutually exclusive and only 1 is used for any given usage of .emplace(). It seems odd to do a property lookup if we know it won't be used.


Should the handler functions receive a pre- or post-normalization key? For example, does (new Map()).emplace(-0, { insert: key => Object.is(key, -0) }) return true or false?

This is a good and thorough question!


What should emplace do when the handler argument is a non-null non-undefined primitive? There's precedent at least for boxing into Object (e.g., Object.assign) and for throwing (e.g., Object.defineProperty and Proxy), so I think either approach is defensible. There's also the sub-question of whether to allow or reject null (for the record, the ECMA 262 functions seem to treat null like undefined but Intl functions throw on null via ToObject but box other non-undefined primitives).

I feel that throwing is likely the right choice. I don't believe we will add a .update or .insert method to any primitive so we end up with a few points:

I'd prefer we had some kind of write up explaining this somewhere for future reference if we can agree on the points above being reasons that any given proposal may want to avoid ToObject or conversely in the case of Object.assign to actively call ToObject for the given proposal's use case.

ljharb commented 4 years ago

If a property lookup - which should be one of the cheapest operations we have - is too expensive, then I'd say that throws the entire idea of a "handler" object into question.

this means it must come from an object.

we do, in fact, have to take into account someone installing a insert or update onto a boxable primitive's [[Prototype]], so if we want to ensure it's an object, we need to check Type() Object here.