How should IDB connections and outstanding IDB transactions behave when a page enters back/forward cache?

rubberyuzu commented 2 years ago

When a page navigates away with an open IDB connection, the connection gets closed as the execution context gets destroyed (spec). However, as back/forward cache navigation preserves the execution context, if we allow pages with open IDB connections, they would remain open as is.

When should the page be marked unsalvageable after entering back/forward cache? For example, another page could issue a version update request, or delete request to the page in back/forward cache. In these cases, other pages can see the back/forward cache page, and thus probably should be marked as unsalvageable (evict the page from back/forward cache).

If this is the case, the spec here about IDB version change should change to say something like “If the page is not active but salvageable, do not send the versionchange event, but the page should be marked as unsalvageable”.

Similarly, when a page navigates away with outstanding IDB transactions, they normally get aborted. For back/forward cache navigation, is it okay to mark the page as salvageable even with the transactions, and let them complete?

There are guidelines here about how non-fully-active documents should generally behave with APIs, so please evaluate if these proposals fit the common patterns.

@fergald @dmurph @rakina

dmurph commented 2 years ago

I did an exploration of this a while ago. The best option I came up with was:

disallow page from bfcache if there are pending or active indexeddb transactions.
Allow a page into bfcache if there is a database connection, but keep track of the locked version (referenced by the connection)
- keep track in a table of origin + database name -> version locked
- Keep track of new database connection requests
- If a database connection request (origin + database name) requests a version that is different from the locked version, then kill the bfcache page with that connection.

This assumes pages mostly don't have active IDB transactions, which I think is a safe assumption.

Why keep track of the DB version? The current API requires the old database connection respond to an 'onversionchangerequested' event when a newer version is attempting to be opened. Since that old connection is in the cache & js is not executing, this doesn't really work, and would block the new connection from opening.

dmurph commented 2 years ago

You could technically allow outstanding transactions if you let them 'complete' (allow JS to keep running until they committed), and paused any newly created transactions from starting.

problems:

No guarantee that the txn will complete - in fact there was a common pattern of keeping a txn alive while some other task happened. could use this to keep JS executing on a tab
complicated to implement - requires another state of pausing

dmurph commented 2 years ago

Reading your comment again - apologies - you already mentioned this behavior :)

Adding @inexorabletash - perhaps you know who to route this to for spec changes.

My intuition would say:

We still want to abort outstanding transactions, and I'm tempted to say that the page should be 'unsalvageable' in this case, as otherwise we are introducing an edge case for developers that they probably don't handle well (all transactions could be aborted at any time & they would need to recover).
We can't really keep them operating / open due to the locks they hold.
- Well, we COULD do this but they we would want to mark the page as unsalvageable if any other connection was opened or txn attempted to start on the same object stores. Pretty hard to implement.

inexorabletash commented 2 years ago

If this is the case, the spec here about IDB version change should change to say something like “If the page is not active but salvageable, do not send the versionchange event, but the page should be marked as unsalvageable”.

SGTM

But I agree with dmurph re: outstanding transactions.

I'm the only active spec editor here (any volunteers?); PRs welcome to help address this!

rubberyuzu commented 2 years ago

Thanks @dmurph for the feedback and @inexorabletash for the comment! I didn't know about the problems with keeping the transactions complete (and its potential difficulty of implementation). So sounds like these are the way to go for now:

Allow pages with IDB connection and keep track of the locked version, and evict when a database issues a request for a different version from the locked version
Keep blocking pages with open IDB transactions from entering back/forward cache

fergald commented 2 years ago

Is there a reason to spec that open txns must make the page unsalveagable? If some browser were to allow the page to go into BFCache and out again, why would we want that to be out of spec?

inexorabletash commented 2 years ago

As @dmurph outlined above, an open transaction is observable from other contexts and requires code to be running on the page and responding to events to either keep the transaction alive or allow the transaction to complete. I think those are contrary to the intentions of BFCache, per the principles, "anything that happens while the document is not fully active should be treated as if it never happened" (so a transaction shouldn't commit) and "The document must not retain exclusive access to shared resources" (so it shouldn't be observable)

If some browser were to allow the page to go into BFCache and out again, why would we want that to be out of spec?

In that case, we'd want to spec/align on that behavior where it's observably different, right? So we need a proposal detailing that behavior.

fergald commented 2 years ago

I'm missing something, how is an open txn observable? I'm not that familiar with the IDB API but I can't find how page2 could discover that page1 has an open readonly txn?

Why does a page need to respond to events to keep a txn alive? Again with a readonly txn, why would the txn be aborted just because a queued JS event was not processed? Or if a readwrite txn was open but no other page was attempting to access the DB, why do we need to do anything interesting?

It seems like (in the spec) the only time we should require an inactive page to become unsalvageable due to an open connection or txn is if another page performs an action that would block, fail or otherwise give a different result because of it. Any implementation is free to be far more aggressive in marking pages unsalveagable due to implementation difficulties.

inexorabletash commented 2 years ago

how page2 could discover that page1 has an open readonly txn?

If page1 is keeping a readonly transaction alive, and page2 initiates a readwrite transaction with an intersecting set of stores, that transaction will not be able to grab the lock and requests will not be processed until page1's transaction's completes or aborts.

Why does a page need to respond to events to keep a txn alive? Again with a readonly txn, why would the txn be aborted just because a queued JS event was not processed?

IDB's model is based on assuming that events are being processed. A context opens a transaction and issues a request. The transaction is kept alive while there are outstanding requests, so the transaction is alive. The request is completed and an event fires. If the context responds to the event with another request, the transaction stays alive. If the context does not respond to the event with another request, the transaction will attempt to commit.

If event processing were to be paused, then per IDB semantics the transaction would be kept alive and keep holding locks.

Or if a readwrite txn was open but no other page was attempting to access the DB, why do we need to do anything interesting?

That's potentially valid. Defining/implementing that behavior would be extremely subtle, though.

(While the bulk of the IDB spec has been rewritten in algorithmic style, the transaction lifecycle section is still mostly assertions about behavior and the processing model ends up being implied/emergent. A preliminary step is probably to rewrite it and get consensus that the new formulation describes all existing implementations.)

Even without anything else trying to access the stores, the page would still be holding backend resources indefinitely until the page is discarded. Again, this seems contrary to the principles mentioned above.

fergald commented 2 years ago

how page2 could discover that page1 has an open readonly txn?

If page1 is keeping a readonly transaction alive, and page2 initiates a readwrite transaction with an intersecting set of stores, that transaction will not be able to grab the lock and requests will not be processed until page1's transaction's completes or aborts.

The reason I gave that example was because I thought a readonly txn would not block a readwrite txn. This comment says that Chrome implemented snapshots and that readonly txns do not block new readwrite txns. I don't know if that rolled out but it seems like a reasonable thing to do and I see no reason to prevent it in the spec.

Why does a page need to respond to events to keep a txn alive? Again with a readonly txn, why would the txn be aborted just because a queued JS event was not processed?

IDB's model is based on assuming that events are being processed. A context opens a transaction and issues a request. The transaction is kept alive while there are outstanding requests, so the transaction is alive. The request is completed and an event fires. If the context responds to the event with another request, the transaction stays alive. If the context does not respond to the event with another request, the transaction will attempt to commit.

If event processing were to be paused, then per IDB semantics the transaction would be kept alive and keep holding locks.

In general for BFCache, it's fine for a page to go into BFCache and come back out and hold a lock that entire time as long as nothing else thought about the lock.

Or if a readwrite txn was open but no other page was attempting to access the DB, why do we need to do anything interesting?

That's potentially valid. Defining/implementing that behavior would be extremely subtle, though.

(While the bulk of the IDB spec has been rewritten in algorithmic style, the transaction lifecycle section is still mostly assertions about behavior and the processing model ends up being implied/emergent. A preliminary step is probably to rewrite it and get consensus that the new formulation describes all existing implementations.)

Can we just say that the page1 must be marked unsalvageable if

page2 opens the DB with a higher version
page2 starts any kind of write txn that overlaps with any of page1's write txns

This is what is required for correctness. Maybe I've missed some other ways that page2 can observe page1's existence.

Even without anything else trying to access the stores, the page would still be holding backend resources indefinitely until the page is discarded. Again, this seems contrary to the principles mentioned above.

This would seem to be up to implementers. They need to worry about resource usage. BFCached pages unavoidably hold resources (mostly memory but probably some file-handles, maybe some CPU). Implementers can be far more aggressive.

inexorabletash commented 2 years ago

The reason I gave that example was because I thought a readonly txn would not block a readwrite txn. This comment says that Chrome implemented snapshots and that readonly txns do not block new readwrite txns. I don't know if that rolled out but it seems like a reasonable thing to do and I see no reason to prevent it in the spec.

If you read that comment thread in more detail, it says that Chrome had implemented snapshots, but that this was proving to be a compatibility issue for developers so the implementation was changed and the spec made more restrictive. It's best to think of readonly transactions as holding shared locks over their scope, and readwrite transactions as holding exclusive locks over their scope.

Can we just say that the page1 must be marked unsalvageable if

page2 opens the DB with a higher version

Or attempts to delete the database.

page2 starts any kind of write txn that overlaps with any of page1's write txns

As noted above, there are more permutations. See https://w3c.github.io/IndexedDB/#transaction-scheduling

We might be able to get away with inserting something akin to "if a transaction could be started if transactions in non-fully-active contexts were aborted, then those transactions must be aborted and the contexts marked unsalvageable" or something.

(Note that IDB is usable from workers, so we can't just say "page", I'm not sure of the spec text for dealing with a page's tree of dedicated workers.)

Note the case where page A has a lock on [s1], page B is waiting for a lock on [s1, s2] and page C is waiting for a lock on [s2]. If page B were destroyed, page C's transaction should be able to start. But if A had a lock on [s1, s2] instead, destroying page B wouldn't unblock C. Presumably that makes whether page B can enter the BFCache observable, so we need to spec carefully here.

fergald commented 2 years ago

Thanks for the clarifications.

We might be able to get away with inserting something akin to "if a transaction could be started if transactions in non-fully-active contexts were aborted, then those transactions must be aborted and the contexts marked unsalvageable" or something.

This sounds right to me. Basically if you would block, don't block, make the other context unsalvageable. If there's a clean way to say that great. If it ends up being a lot of work and we go for something more conservative, it could be noted that it is overly-conservative. If we add WPTs we should prioritize the cases that are clearly problematic rather than the cases that are only ruled out because the spec is overly conservative.

lozy219 commented 1 year ago

Here is a brief summary of what has been recently implemented in Chrome.

Pages with open IndexedDB connections or transactions can be eligible for BFCache, but they may be evicted under certain conditions, including:

When another page attempts to create an IndexedDB connection with a higher version, which would require the BFCached page to respond to a versionchange event. In this case, the versionchange event is not sent to the (to be evicted) BFCached page.
If the BFCached page has created an IndexedDB transaction but it has not yet started, which may be caused by waiting for locks held by other transactions.
If the BFCached page has an open transaction that is holding locks and other transactions from other pages are still waiting for those locks.
If the cached page has an open transaction, and a new transaction created by another page is attempting to acquire some of the locks held by the BFCached page's transaction.

Some WPTs had been added under the wpt/IndexedDB folder, here are the original changes for references:

szewai commented 3 weeks ago

Hi @inexorabletash @lozy219, could you explain why a page must be evicted from cache? Can't the cached page just aborts existing IDB activities when it might block other pages?

The existing wpt test asserts page is evicted from cache and WebKit currently fails the check, but I don't see the eviction behavior specified anywhere. And from discussion above, it seems all we want to make sure it active page is not blocked on cached page, so I proposed a test change here: https://github.com/web-platform-tests/wpt/pull/47582. I verified it works in Chrome, Firefox and Safari. Please let me know if you have any concern.

w3c / IndexedDB

How should IDB connections and outstanding IDB transactions behave when a page enters back/forward cache? #381