nim-lang / RFCs

A repository for your Nim proposals.
135 stars 26 forks source link

versioned containers (Tables etc) to enable value caching and cache invalidation #386

Open timotheecour opened 3 years ago

timotheecour commented 3 years ago

proposal

add a read-only mutations: int field to containers like Tables, StringTableRef, to indicate the number of mutations that occurred in the container, and an accessor proc mutations*(self: Container): int to access it; the field simply gets incremented on each mutation (eg: Table.[]=)

example use case 1:

allows user to tell whether a key help by pointer (eg via gePtr (https://github.com/nim-lang/Nim/pull/18253) or mitems.addr) is still valid at some later point. eg:

var t = someTable()
let val = mgetOrPut(t, "foo", val).addr
let mutations = t.mutations
# call some code, that potentially could mutate/rehash `t` based on some runtime conditions
if t.mutations == mutations:
  # no mutations have occurred, val is still valid
else:
  fetch val again

example use case 2:

(similar to case 1 but a bit different) This allows caching a value (could be by value, not necessarily by its address) in cases where we want to avoid table lookups (which, despite being O(1) in practice, are still much slower than a direct pointer dereference) while the table hasn't been mutated.

https://github.com/nim-lang/Nim/pull/18244 is using this strategy (EDIT: it'll use something else in the end, but for unrelated reasons, the idea remains valid) where I'm avoiding a table lookup in each VM execution step, and instead use a cached value that gets invalidated upon mutations. This pattern can be reused elsewhere for performance.

example use case 3:

solve https://github.com/nim-lang/Nim/issues/18261 in a robust way, preventing mutating a table during iteration, instead of the following weaker test:

assert(len(t) == L, "the length of the table changed while iterating over it")

links

mratsim commented 3 years ago

Caching is a complex problem and often finte-tuned to the domain at hand. And if caching is needed custom data structures were possibly used first.

Furthermore in the case of tables, users would likely need a key-value store or database anyway for persisting data. They come with their own cache and perf-tuning.

If so, I don't think this is needed in the standard library unless it's needed for Nim itself.