tc39 / proposal-upsert

ECMAScript Proposal, specs, and reference implementation for Map.prototype.upsert
https://tc39.es/proposal-upsert/
MIT License
202 stars 14 forks source link

Suggestion: lock the map during access to prevent recursive modification #40

Open dead-claudia opened 3 years ago

dead-claudia commented 3 years ago

Essentially, what I'm proposing is that emplace should set a flag to block recursive updates to the same map or weak map, whether it be through set, delete, or emplace. This enables implementations to not have to do nearly the bookkeeping when managing updates.

It also exposes potential bugs to the user - recursive updates are rarely intended, and if absolutely desired, you can just do something like this to still work around the lock and avoid it:

let deleteMarker = Symbol("deleteMarker")
// Push `[key, deleteMarker]` to trigger delete or `[key, value]` to trigger set
let queued = [[initialKey, initialValue]]

while (queued.length) {
  let [key, value] = queued.shift()
  if (value === deleteMarker) {
    map.delete(key)
  } else {
    map.emplace(key, {
      insert: () => {
        // do insert things
        return newValue
      },
      update: v => {
        // do update things
        return newValue
      },
    })
  }
}

*This is only technically the case if the hash table is represented as an array of inline key/value pairs (as in something like std::vector<HashTable>) rather than an array of pointers to entries (as in something like std::vector<std::shared_ptr<HashTable>>). For efficiency, nearly every hash table implementation in practice does the former if they want to aim for any serious performance, and of course, every engine I'm aware of also does it that way, and so yeah, it's widely enough applicable for me to say "always" here as that's in practice the case because of how everyone implements it.

dead-claudia commented 3 years ago

I just found this bit about lower-spec VMs, and this just reinforces this suggestion - this really feels like an algorithmic oversight.

acutmore commented 1 week ago

The latest spec still seems to have this issue. Would be great to decide on a solution.

Right now there is no check after the callback is invoked

https://github.com/tc39/proposal-upsert/blob/8668664a3d2e298d183d3d57fb6a548869ec3dbf/spec.emu#L45

That mapData hasn't been updated before we get to

https://github.com/tc39/proposal-upsert/blob/8668664a3d2e298d183d3d57fb6a548869ec3dbf/spec.emu#L47

Meaning there could end up with two records with matching keys.

I personally like the idea of throwing a re-entrancy error in map.set. Though as that is a method that does not throw today I can see how this may be unpopular

dead-claudia commented 1 week ago

Alternatively, a spec note could be used to recommend a special boolean "has reserved entry" flag to make the added check cheap:

This could be further optimized by keeping the old value (as just a raw integer reference value - it can't be followed by GC) and comparing that for equality instead. Then, the "clear" woukd be setting it to a sentinel, the "set" would be setting it to the key, and the test would be comparing for equality. And this pattern could then be extended to has/get/update by storing the value of the most recently looked-up entry. (IIRC this is not far off of how WebKit optimized has+get pairs, but this is much more general.)