slightlyoff / async-local-storage

Apache License 2.0
153 stars 13 forks source link

Support for atomic operations #1

Open sicking opened 11 years ago

sicking commented 11 years ago

The current API doesn't enable doing things like "add 1 to the value of the a property". Consider the following code

var storage = ...;
storage.get("a").then(function(v) {
  storage.set("a", v+1);
});

If the user has the same page open in two tabs, the above code risks that both tabs reads the same value from "a", and then whichever tab runs the set operation first will have it's value overwritten by the other tab.

Two possible solutions for this is sketched up in my comment here: https://hacks.mozilla.org/2012/03/there-is-no-simple-solution-for-local-storage/comment-page-1/#comment-1411394

Unfortunately neither solution fits really well with how Futures work since there is additional state carried in addition to the returned value. I.e. a callback behaves differently if it is registered before the Future is resolved or if it is registered after the Future is resolved. This obviously would be a bit of a violation of the Future API.

mounirlamouri commented 11 years ago

For the inial issue, I guess we could have something like:

storage.lock().then(function() {
  return storage.get('a');
}).then(function(v) {
  return storage.set('a', v+1);
}).then(function() {
  storage.unlock();
});

I'm not sure how we could solve that directly with Future.

sicking commented 11 years ago

I'm not a big fan of having an explicit .unlock() function. It makes it all to easy to forget to unlock which will cause the storage area to be unusable. In your example code above, if the .get() or .set() function fails with an IO error, the storage area will never be be unlocked.

What both WebSQL and IndexedDB do is that they automatically close the lock when no more requests are being placed. So if the last callback for a .get/.set call fires, without any new .get/.set calls being scheduled during that callback, the lock is automatically closed and the next lock can start executing.

slightlyoff commented 11 years ago

I don't like .lock()/.unlock() either. The current impl in the polyfill might do the right thing in this case on Chrome/IE as I'm reusing transactions where possible...although it looks like this won't happen in FF due to the bug I'm working around here: https://github.com/slightlyoff/async-local-storage/blob/master/src/async-local-storage.js#L190 (short story: FF fires oncomplete et. al. at a point after the txn is actually closed whereas Chrome doeesn't).

IDB's lack of explicit transaction lifetime support is deeply vexing. I can imagine the LS API doing something like:

storage.transact(function(txnStorage) {
  // We close the txn based on the done() of the returned Future or timeout
  return txnStorage.get("a").then(function(v) {
    txnStorage.set("a", v+1);
  });
}/*FIXME: timeout value here?*/).done(displaySuccessUI);
sicking commented 11 years ago

The reason IDB doesn't have explicit transaction lifetime support is for the same reason mentioned in my comment above. I.e. it's too easy to screw up resulting in databases being long periods of time.

The complete event has to be fired after the transaction is complete since it's the event that signals that the transaction was successfully committed to disk. I.e. it's what implements the D in ACID. I don't understand how Chrome or IE could implement anything else. It definitely goes against what the spec demands.

With your proposal, at which points in time are you allowed to call txnStorage.get/set? In IDB (and I believe WebSQL) you are only allowed to place new "operations" against a transaction during callbacks from other operations placed against the transaction. That has the advantage that it significantly reduces the risk of timing hazards in the code.

I.e. a page can't do other asynchronous operations and in the callbacks from those asynchronous operations place new operations against the transaction. That never works since those would then be placed from outside of an operation callback.

However if you are allowed to place operations against a transaction as long as the transaction hasn't been committed then placing operations from handlers of other asynchronous actions may or may not succeed depending on the speed of the database implementation, speed of device IO and speed of the other asynchronous operations.

mounirlamouri commented 11 years ago

If the main argument against having a lock()/unlock() behaviour is the risk of developers forgetting to unlock, we could imagine that the unlock() would automatically happen at the end of the promise chain.

Basically, something like:

storage.lock().then(function() {
  return doSomething()
});

would behave like:

storage.lock().then(function() {
  return doSomething()
}).done(function() {
  unlock(); 
});
mounirlamouri commented 11 years ago

Argh, I didn't mean to send the comment...

I wanted to add that I am not sure that such a behaviour is actually implementable without hacking the platform. The platform could have a special behaviour that automatically unlock after a lock() chain but I am not sure how that could be done with the web facing Future API.

slightlyoff commented 11 years ago

@sicking : There's nothing wrong with calling oncomplete later than end-of-txn; the issue here is that FF calls back into user code that can still access the txn before that event is dispatched but after the transaction is no longer "active". Other browsers, apparently, close transactions and dispatch events before re-entering user code; at least in this case of sequential operations. Another way to think about this is that the bug is that there's no visible active state property on the transaction so you don't know if it's done or not should the ordering of dispatch be what FF chose to do. Massive spec screwup, but easily corrected: demanding an internal state variable that implementations must track but which isn't visible to users is a key sign you're doing something wrong if the effects of that state are visible in any way (and they are here via the exceptions that are thrown). The fix is to just expose a .active getter/property on txn objects.

Anyhow, back to the discussion of transaction lifetime: what my code above proposed was, in essence, the ability to extend the life of the transaction to the done() of any returned Future from .transact(); much the way .then() watches for Future object returns and chains to them. A (short) default timeout + auto-close can keep us from getting into huge trouble over locking.

The biggest win of this API is that it naturally slots in sub-transactions. E.g., you can naturally do:

storage.transact(function(txnStorage) {
  return txnStorage.get("a").then(function(v) {
    return txnStorage.transact(function(subTxnStorage) {
       subTxnStorage.set("a", v+1);
    });
  });
}).done(displaySuccessUI);

If any sub-txns fail, they can be handled without invalidating the entire transaction, and separate timeouts can be set for sub-transactions and the overall transaction.

sicking commented 11 years ago

The distinction is, what would the following do:

storage.transact(function(txnStorage) {
  setTimeout(function() {
    txnStorage.set("b", 1);
  }, 10);
  return txnStorage.get("a");
}).done(displaySuccessUI);

would the key "b" be set to 10 if the .get() executes in less than 10ms? In IndexedDB we designed it such that, in code like the above, "b" would never be set to 10 and the .set() call would always throw.

This was done by only enabling the transaction object to be used during callbacks from other operations against the transaction. I think this is a preferable model.

The main concern that I think we had was that people would do asynchronous network requests after starting the transaction, and use data from the responses in the transaction. This will work or not depending on timing of the database, and of the network request.

sicking commented 11 years ago

Actually, nevermind the network request part. Any asynchronous actions, no matter how quickly they perform, will be racing with the database operations. For example the .get() above might execute quickly enough on a background thread that the event posted to the task queue will be posted before the transaction callback finishes. This would mean that any other asynchronous operation will be racing will it.