Detecting transaction retries

keijokapp commented 1 month ago

I'm in a little bit of an XY problem situation. There's a known limitation in FDB client API, which is unlikely going to be resolved in the foreseeable future. So I need to work around that. If there's a [better] way to solve it, I'd be interested.

Setting version stamped entries makes those entries unreadable. But it is sometimes necessary to access those entries after they have been set. The only way to "access" them is to also store them in some other storage. (I guess I could also use transaction_bypass_unreadable but that seems too hacky to be useful.) Whatever that external storage would be (eg. WeakMap, magic property on some object), it has to be tied to a specific transaction run. When transaction is retried, that storage should reset.

The library doesn't provide a way for abstractions to detect that the transaction has been restarted and it should dispose the external state created by the previous run. Nor does it provide a way to identify the underlying transaction without using internal _tn or _ctx properties.

My proposal is to use Transaction#_ctx for this. It is already used for a very similar reason internally (providing a central state across multiple operations). A new TxnCtx should be created for each transaction run instead of resetting its properties. That object can be used as a key in the WeakMap or it could be manipulated directly.

There are 3 ways to go with it:

Expose Transaction#_ctx as is (eg. Transaction#context).
Expose Transaction#_ctx and refactor it a bit for more generic use cases.
Create an entirely new property.

I would go with option (2). TxnCtx#nextCode could be turned into a generic counter, so that abstractions don't need to implement their own counter to serialize their operations. TxnCtx#toBake could be turned into an array of arbitrary callbacks to be run after a successful commit. Both of these features would be useful based on my experience.

atn34 commented 1 month ago

Can you reset the map at the beginning of the transaction?

keijokapp commented 1 month ago

Yes, but the issue is with abstractions that do not have a way to know about the beginning of the transaction. In my case, the abstraction and the code managing the transaction lifecycle are in different code bases, possibly developed by different developers. Note that the abstraction could be used multiple times during the run so it can't just reinitialize whenever it's invoked.

As an abstraction developer, I can't expect the user land code (that manages the transaction lifecycle) to initialize these opaque data structures properly and pass them, possibly through a stack of abstractions, to my abstraction. Also, from the user's perspective, it would be unreasonable to expect them to reliably initialize and keep track of these data structures for each abstraction that may or may not use them.

Here's an example. I'm developing a FoundationDB adapter for PouchDB as a side project. I have a following simplified use case:

db.doTn(async tn => {
  const pdb1 = new PouchDB({
    name: 'foo',
    adapter: 'foundationdb',
    tn
  });

  // sets a version stamped entry
  await pdb1.put({ _id: 'bar' })

  const pdb2 = new PouchDB({
    name: 'foo',
    adapter: 'foundationdb',
    tn
  });

  // reads a version stamped entry
  await pdb2.get('bar');
});

Note that the abstraction is invoked multiple times. While it's possible to reuse pdb1 here, in reality, the PouchDB calls could be buried anywhere.

While it's technically possible to create an opaque data structure (eg. context) before the first PouchDB call and pass that to the constructors, it's an abstraction nightmare. The abstraction has to trust the user land that context is initialized properly (eg. not cached across retries) and only one context is created for each run. Besides, there's already a context object right there on the transaction object serving largely the same purpose, except that it's only for internal use and isn't fully reset on retries.

To illustrate the point further, imagine if, instead of this library keeping track of version stamp codes internally, the bindings would require the user to initialize the TxnCtx themeselves and pass it to each setVersionstamped* call.

Also, directory layer already uses the WeakMap approach (using internal Transaction#_tn property) for queuing so clearly there's a value in abstracting that stuff away.

keijokapp commented 4 weeks ago

@josephg Have you been able to think about it? I see it being quite critical because it prevents an entire large class of use cases. Are there any issues with mu proposals?

I guess one consideration would be the asymmetry between the "root" transaction and scoped transactions. The property would be reset only for the root transaction and not the scoped transactions. But I think this asymmetry already exists. I think all scoped transactions should be considered invalid anyway.

EDIT: submitted comment accidentally so changing it now.

keijokapp commented 4 weeks ago

I'd like to also address another (related) issue which may also affect the directory layer.

What happens if the transaction gets retried while there are still jobs in the queue? Those jobs would be executed as part of the new run which may cause problems. This may not be a significant issue with the directory layer but it is for other abstractions. I'm not aware of any workarounds.

To avoid that, it would be best if each run gets a fresh transaction object and calls on the old object (and scoped transaction objects) fail.

josephg / node-foundationdb

Detecting transaction retries #81