WICG / bfcache-not-restored-reason

Other
18 stars 5 forks source link

Avoiding browser specific reasons #2

Open domenic opened 2 years ago

domenic commented 2 years ago

I think it would be best if we could specify all the potential bfcache blocking reasons, even if some browsers don't use all those reasons.

This would avoid a situation where, e.g., in Chrome BroadcastChannel is "x-broadcast channels", and in Safari it is `"x-BroadcastChannel". This would be frustrating for developers.

I think the way to do this would be to just have a very large list in the specification, which any browser could add to. But no browser should ship a new reason without first sending a specification pull request.

This would force the list of reasons to go through a small amount of discussion and consideration, like we do for all specifications. Which I think would be good. E.g. maybe we don't want to expose BackForwardCacheDisabledByCommandLine and BackForwardCacheDisabledByLowMemory separately, and instead we just want BackForwardCacheDisabled.

One key ingredient here might be a generic reason like "other" or "unknown" which browsers can use for cases that they think are not common enough to be worth giving specific information on.

fergald commented 2 years ago

Yes, I think we should standardize where possible. One slight concern is that we can have different granularities. E.g. there might be distinct reasons in one browser while they're all bundled into 1 in another. So maybe allow ${standardised_name} and ${standardised_name}-${vendor_specific}.

ErwinHofmanRV commented 6 months ago

Observed

Not sure if this fits here, but as I have been playing around with bfcache NRR from time to time (as part of our RUM), I've been experiencing that it has been changing quite a bit. And there doesn't seem to be a human readable doc to rule them all yet.

For example, in the beginning we had this: https://docs.google.com/spreadsheets/d/1li0po_ETJAIybpaSX5rW_lUN62upQhY0tH4pR5UPt60/edit#gid=0

Around Jan 26, I noticed anomalies with that document and what DevTools bfcache tester was saying. For example: WebVR -> WebXR and MediaDevicesDispatcherHost -> ContentMediaDevicesDispatcherHost

Just some background info as to how I -as someone that tried to debug bfcache issues- got around this:

As I'm lazy and just copy+pasted it from DevTools to search for in the Google Sheet, I couldn't find them right away. Additionally, if the list/table is going to be finalized somewhere in a final doc, would it be an idea to point to the mozilla docs for example? I know how/what to Google for next after seeing this in DevTools bfcache tester, but most others just starting out with debugging bfcache reasons might not know. MDN examples: WebXR: https://developer.mozilla.org/en-US/docs/Web/API/WebXR_Device_API MediaDevices: https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices (the code examples over there learned me what to search for within the website's source code for example, which resulted in the exact file/culprit 🥳

Question 1

Yesterday, I saw that our NRR functionality did not result in the amount of data that we expected as part of our RUM. First reason was that preventedBackForwardCache flag was removed, while we actually checked this flag before attempting to collect/report data to site owners.

I did not see signs in Chrome v122 nor 123 for this flag, nor in the spec. I did change our code into [undefined, 'yes'].includes( obj.preventedBackForwardCache ) just to be sure, but is it safe to assume that this flag is fully removed?

Question 2

Going from Chrome 122 to 123, I also noticed different reasons being used, probably rendering the above google spreadsheet useless (although it still somehow matches with DevTools bfcache tester is exposing)

For example:

reason-strings in v122 MainResourceHasCacheControlNoStore,JsNetworkRequestReceivedCacheControlNoStoreResource,internal-error,Unload handler

reason-strings in v123 unload-handler,masked,cache-control-no-store

I both tested the same page (homepage url of a webshop) and steps (clicking to a listing page, and then going back to the previous page and check NRR via performance.getEntriesByType('navigation')[0].notRestoredReasons.

Given the values, it does feel like v123 is following the specs (which would make sense for the latest browser version): https://html.spec.whatwg.org/multipage/nav-history-apis.html#nrr-details-reason

However, does this mean that site owners (and RUM solutions) get to see less-specific reasons? Additionally, I only see masked being part of the spec. So unload-handler,cache-control-no-store are not. Will these be removed, resulting in even more guessing work? Or is the spec not up to date yet and thus not containing for example the latter values?

I see that DevTools bfcache tester is still exposing full reasons (UnloadHandlerExistsInMainFrame, MainResourceHasCacheControlNoStore, JsNetworkRequestReceivedCacheControlNoStoreResource). So I guess my main questions are:

If this deserves its own issue (or should be asked somewhere else), do let me know :)

rubberyuzu commented 6 months ago

Thanks Erwin for the observation and the questions, and sorry for the confusion.

I marked this spreadsheet as obsolete. This section of the spec will be the source of the truth.

Question 1 Yes, I can confirm that from Chromium M123, Not Restored Reasons API no longer reports blocked or preventedBackForwardCache boolean. This can now be inferred by checking the reasons field - if reasons is empty, that document is not blocking, and if reasons has any contents, that document is blocking. The only exception/complication here is the cross-origin iframes. Even when cross-origin iframes are blocking, their reasons field will be null for privacy reasons. Instead, masked reason will show up in the outermost main frame.

Question 2

However, does this mean that site owners (and RUM solutions) get to see less-specific reasons?

In a way, yes. Some of the reasons are reported as "masked", because they are not actionable and are derived from Chromium's implementation details.

Additionally, I only see masked being part of the spec. So unload-handler,cache-control-no-store are not. Will these be removed, resulting in even more guessing work? Or is the spec not up to date yet and thus not containing for example the latter values?

The spec is not yet complete, and the list currently only has the reasons that must block bfcache. In reality, user agents can choose to block bfcache with their own reasons. This is the follow-up PR which is trying to cover those may-block reasons.

So you may see more "masked" reasons reported with M123 and onward, but "masked" normally means that there's nothing the web developers can do about it (either because of browser's implementation details or cross-origin iframes). The API will continue to report the actionable reasons (50+ reasons in total), so hopefully the API will still give useful information.

The difference between NRR and the DevTools is not intentional, and I will try to use the same strings for the same reasons. One difference that will stay is that the DevTools will continue to report reasons that come from cross-origin iframes, while NRR API will does not report them.

ErwinHofmanRV commented 6 months ago

Thanks for the clarification. Especially this part is good to know:

Even when cross-origin iframes are blocking, their reasons field will be null for privacy reasons. Instead, masked reason will show up in the outermost main frame.

Additionally:

Some of the reasons are reported as "masked", because they are not actionable and are derived from Chromium's implementation details.

If it's an iframe that a user implemented themselves (as part ofe/embedded by a third parties' JS for example), then it actually is an actionable item: removing (or only loading on demand) that third party, such as a chat widget. But when reported as masked, it becomes a blindspot.

Is my theory (cross origin third party iframe being communicated in NRR as masked) correct? If so, could that be changed so that we (as a RUM) can at least let site owners know there are actionable issues, although not fully exposed due to privacy reasons (as explained by @fergald in https://github.com/WICG/bfcache-not-restored-reason/issues/5#issuecomment-2011075500).

If I'm missing a detail for this scenario, then do let me know as well :)

tunetheweb commented 6 months ago

This can now be inferred by checking the reasons field - if reasons is empty, that document is not blocking, and if reasons has any contents, that document is blocking. The only exception/complication here is the cross-origin iframes. Even when cross-origin iframes are blocking, their reasons field will be null for privacy reasons. Instead, masked reason will show up in the outermost main frame.

This is not quite true as you also need to recursively call all the children to check this.

For example, this sample site (which includes a same-origin iframe with an unload handler)

You end up with this:

{
    "src": null,
    "id": null,
    "url": "https://www.tunetheweb.com/experiments/bfcache/index3.html",
    "name": null,
    "reasons": [],
    "children": [
        {
            "src": "./index.html",
            "id": "myiframeid",
            "url": "https://www.tunetheweb.com/experiments/bfcache/",
            "name": "myiframe",
            "reasons": [
                {
                    "reason": "unload-handler"
                }
            ],
            "children": []
        }
    ]
}

So there is no indication at the top level that the block happened.

However, if you check this example site which contains an same-origin iframe without any reason to block the bfcache, notRestoredReasons is set to null (as opposed to undefined when the API is not supported).

So, is this right:

This can now be inferred by checking the reasons field - if reasons is empty, that document is not blocking, and if reasons has any contents, that document is blocking.

Or, is the mere presence of a non-null and also non-undefined notRestoredReasons value sufficient to know it was blocked?

tunetheweb commented 6 months ago

However, if you check this example site which contains an same-origin iframe without any reason to block the bfcache, notRestoredReasons is set to null (as opposed to undefined when the API is not supported).

That seems like a bug to me. As can't tell if failed to store, or not bfcache eligible. Raised: https://issues.chromium.org/issues/330729765

tunetheweb commented 6 months ago

Would still be nice to have an easy way of telling if there was a block without having to check the top level and iterate through any children.

rubberyuzu commented 6 months ago

That seems like a bug to me. As can't tell if failed to store, or not bfcache eligible.

This is actually an intentional behavior. When the page is restored from bfcache, you get null for performance.getEntriesByType('navigation')[0].notRestoredReasons. You also get null for non-history navigation.

If the page is not restored (i.e. any block happens), you get non-null value. (undefined if the API is not available but that will be gone when the feature is launched)

Or, is the mere presence of a non-null and also non-undefined notRestoredReasons value sufficient to know it was blocked?

This is true. If you get non-null & non-undefined NRR value, that means the page is not restored from bfcache (i.e. some block happened).


if (this is history navigation) {
  if (NRR reports undefined) {
    API is not supported.
  } else if (NRR reports null) {
    this is restored from bfcache.
  } else {
    this is not restored from bfcache.
  }
}
fergald commented 6 months ago

@tunetheweb we drop all frames from the tree that do not contribute to blocking. Every frame in the NRR tree either has blocking reasons or contains a descendant with a blocking reasons. If you can find instances where this is not the case, it's a bug.

rubberyuzu commented 6 months ago

Is my theory (cross origin third party iframe being communicated in NRR as masked) correct? If so, could that be changed so that we (as a RUM) can at least let site owners know there are actionable issues, although not fully exposed due to privacy reasons

You're right that cross-origin iframes' blocking can be actionable sometimes. However, we can't reveal that information through this API because of privacy issues. This is unfortunate and we're trying to come up with a way to expose some information. https://github.com/WICG/bfcache-not-restored-reason/issues/9 We also had an idea of choosing one cross-origin iframe randomly and reveal its blocking status. But we haven't reached the agreement with other vendors about this yet.

What you can do now as a developer is test the page on DevTools and try removing cross-origin iframes.

fergald commented 6 months ago

@tunetheweb we drop all frames from the tree that do not contribute to blocking. Every frame in the NRR tree either has blocking reasons or contains a descendant with a blocking reasons. If you can find instances where this is not the case, it's a bug.

I'm incorrect. I thought that was what we did and I think it's what we should do (since I see no point in keeping all those empty frames around).

tunetheweb commented 6 months ago

So there's still no indicator on the navigation entry if the bfcache was used. Just if it was not used? (From it can be somewhat inferred if it was used, but pretty unreliably with lots of edge cases).

At the moment we have this text in the article:

For PerformanceNavigationTiming objects that do not represent history navigations, the notRestoredReasons property will return null. This is useful for determining whether bfcache is not relevant to a particular navigation, as opposed to notRestoredReasons not being supported in which case it would return undefined.

But guess that's not the case anymore so will remove that. BIt of a shame there's no reliable indicator and you still have to depend on event.persisted for this.

I'm also seeing weirdness where it doesn't update after being set. For example, navigate to https://web.dev, then www.example.com, then hit back to return to https://web.dev. Go make a cup of tea and wait for a bit. Navigate forwards to https://www.example.com and you get a notRestoredReason of "timeout". So far so good. Now navigate to another site (e.g. developer.chrome.com). I would expect that to put www.example.com into the bfcache, but when I go back I still have a notRestoredReason of "timeout". Is that expected?

rubberyuzu commented 6 months ago

I would expect that to put www.example.com into the bfcache, but when I go back I still have a notRestoredReason of "timeout". Is that expected?

I was able to reproduce this bug. Let me file a bug and address this. Thanks!

rubberyuzu commented 6 months ago

Sorry, I was incorrect. This is an expected behavior. NotRestoredReasons are only built when unloading a document - e.g. only for non-bfcache history navigation. (spec) So if the next navigation is restored from bfcache, NotRestoredReasons are never updated, and the reasons are supposed to stay.

We have a wpt for this too. https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/web_tests/external/wpt/performance-timeline/not-restored-reasons/performance-navigation-timing-bfcache-reasons-stay.tentative.window.js

tunetheweb commented 6 months ago

Moving discussion of this to #17 as it's drifted far away from the original reason (my fault!).