Closed smaug---- closed 2 years ago
So the new model would be that a top/auxiliary?-level browsing context has history. Each item in that history consists of a tree representing the browsing contexts in play. Each node in that tree holds what we call a session history entry today. Whenever navigation happens, we add items to history with fresh nodes, potentially pointing to the same session history entries.
Each item in that history consists of a tree representing the browsing contexts in play.
at the time the relevant item was created. Nested browsing contexts may get deleted, and recreated when going back. Then when recreating, the url from entry is used for the iframe.
Tricky part here is dynamic changes. Gecko tries to not load anything from session history to browsing contexts which were originally dynamically (==JS) created and then recreated. I think IE/Edge does something similar, at least it seems to behave that way.
(I need to look a bit how this was implemented. Gecko certainly removes duplicate history entry trees which can be created when dealing with dynamically added/removed entries.)
IIRC Presto behaved similarly to IE/Edge/Gecko, except that it wasn't removing duplicate trees so in some cases user had to press 'back' several times.
https://arxiv.org/abs/1608.05444 might be relevant. I haven't yet read it but here's the abstract:
Navigation has been a core component of the web since its inception: users and scripts can follow hyperlinks, and can go back or forwards through the navigation history. In this paper, we present a formal model aligned with the WHATWG specification of navigation history, and investigate its properties. The fundamental property of navigation history is that traversing the history by delta then by delta' should be the same as traversing by delta+delta'. In particular, traversing by +1 (forward) then by -1 (back) is the same as traversing by 0 (doing nothing). We show that the specification-aligned model does not satisfy this property, by exhibiting a series of counter-examples, which motivate four patches to the model. We present a series of experiments, showing that browsers are inconsistent in their implementation of navigation history, but that their behaviour is closer to the patched model than to the specification-aligned model. We propose patches to the specification to align it with the patched model.
cc @asajeffrey and @connorgbrewster, since they're putting together suggestions for how to improve the session history specification.
Oh, speak of the devil. I didn't realize they published their suggestions to arXiv.
I will take on the task of reading the paper and understanding it well enough to be comfortable applying their patch (or raising any questions I have). So assigning to me. /cc @majido as well.
@domenic thank you! If you have any questions, feel free to ask me, @asajeffrey is on PTO for the next couple of weeks
The paper has some attempt at wording that would bring the spec in line with the model that is implemented in Servo and verified in Agda. I'm not completely in love with the wording, I'm sure it could do with improvement!
On Mon, Aug 22, 2016 at 11:53 AM, Connor Brewster notifications@github.com wrote:
@domenic https://github.com/domenic thank you! If you have any questions, feel free to ask me, @asajeffrey https://github.com/asajeffrey is on PTO for the next couple of weeks
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/whatwg/html/issues/1454#issuecomment-241476920, or mute the thread https://github.com/notifications/unsubscribe-auth/AAYnhSPxJrwDHqF_Jc_0PKHJbJY7pJhZks5qidQSgaJpZM4I7SJG .
OK, I read it! I understand the problems with the current model, which are illustrated really well. I'm glad that browsers mostly seem to follow the patched model, too. Here are some questions that came up for me:
history.length
property ("must return the number of entries in the top-level browsing context's joint session history"). In the patched model, what is the correct way to do this?Overall, this is really lovely work! I'm looking forward to integrating it :D
Thank you for reading over it! Hopefully I can clear up some of those questions.
"Moreover, performing these experiments shows some unexpected behaviours in browser implementations." I felt like this did not get enough information. It sounds like there are scenarios where browsers do not align with the patched model. Do they align with the unpatched model? Do they agree with each other? I would love to see these scenarios investigated in the same amount of detail as counterexamples 1-5.
If you look at the table in section 3 it shows which browsers align/don't align with the patched model. Most browsers do except IE/Edge, and in that case they seem to align to the current spec. Safari does have some issues with pushState
/replaceState
not behaving the same as regular navigations; however, I believe that is an issue in Safari rather than with the intended implementation/spec. I believe I have a few more examples that did not end up in the final document, let me know if you would like to see those.
One of the reasons we have the "joint session history" concept is so we can return a correct value for the history.length property ("must return the number of entries in the top-level browsing context's joint session history"). In the patched model, what is the correct way to do this?
More generally, is it important to remove the joint session history concept entirely, or did you just find it not useful? I think it would be more comprehensible to have a general concept of joint session history, with a current entry, a joint session future, and a joint session past. Is that workable, or is such a concept fundamentally flawed?
We do not really intend to get rid of the idea of the "joint session history", but rather, like you said, to split it into the joint session future
, joint session past
, and the current entry
. In fact, this is exactly how our patched model is implemented in Servo right now. As for calculating history.length
we sum up all of the entries in the joint session past
and joint session future
and then add 1
to account for the current entry (note this is for all browsing contexts, not for each one individually).
Counterexample 4-5 describe how navigation needs to delete the joint session future. I believe this involves patching https://html.spec.whatwg.org/multipage/#update-the-session-history-with-the-new-page step 2, "otherwise", step 1, but the paper did not note it. Did I get that right? (We also have a bug in this area... #1578. This "entry update" business seems not great.)
Indeed, in fact I think there are multiple places that say to Remove all the entries in the browsing context's session history after the current entry.
I believe all of these will need to be updated to remove all of the entries in the joint session future
.
Introducing terms that are meaningful only within the scope of an algorithm is a bit strange ("become active", "stay active", "activating", "fully activating"). Do you have any thoughts on ways of rearranging the uses of those terms to be less specific to a single run of the algorithm? It's fine if not; we can live with it.
This may be something @asajeffrey could answer better, but I am all for reducing the amount of new terms added.
The loading behavior encapsulated in "should"s in steps 6.1.4 and 6.1.5 was a bit confusing to me, possibly because it is using the normative word "should". I think what this meant was that since step 6.1.4 simply sets the current entry of the session history, no loading will take place, whereas 6.1.5 will perform loading, since it uses "traverse the history"?
Perhaps loading it the wrong word here? I think activating would be a better term (correct me if I am wrong @asajeffrey) although it does sound a bit redundant.
Do you think it would be better to, instead of performing a traversal for each entry in the entry sequence, perform a traversal for each unique browsing context found in the entries of the entry sequence? That seems more intuitive to me than doing several traversals per browsing context with only one of them actually causing a load.
If the browsing context does not need to traverse, a load (activation?) should not happen. Only if the current entry
of the browsing context is changed will the new document be activated. This could happen on multiple browsing contexts during a traversal since the joint session history interleaves each browsing context's entries.
I hope these answers are helpful! Let me know if I can clear anything else up or if you have any questions on my responses here.
If you look at the table in section 3 it shows which browsers align/don't align with the patched model. Most browsers do except IE/Edge, and in that case they seem to align to the current spec. Safari does have some issues with pushState/replaceState not behaving the same as regular navigations; however, I believe that is an issue in Safari rather than with the intended implementation/spec. I believe I have a few more examples that did not end up in the final document, let me know if you would like to see those.
What I was referring to was the additional unexpected behaviors. pushState in Safari is an interesting one, although it sounds like that is a Safari-specific problem? But you also showed an example where Firefox does not appear to follow the patched model. When you ran that test case in other browsers, did they follow the patched model, or did they follow Firefox?
My concern is that the experiments performed for counterexamples 1-5 might not have full coverage of history traversal behavior divergences in browsers. If we find that there's some uncovered situation which all browsers agree on, but the model disagrees on, we'd need to adjust the model.
Perhaps loading it the wrong word here? I think activating would be a better term (correct me if I am wrong @asajeffrey) although it does sound a bit redundant.
If the browsing context does not need to traverse, a load (activation?) should not happen. Only if the current entry of the browsing context is changed will the new document be activated. This could happen on multiple browsing contexts during a traversal since the joint session history interleaves each browsing context's entries.
I think load is correct, albeit a bit imprecise; this is referring, I believe, to how traverse the history performs navigation in its step 1. It later marks the document as active in its step 4.3.
However, changing the current entry does not necessarily make the document active. Your proposed 6.1.4 changes the current entry, but does not invoke traverse the history, so the document never becomes active. Only 6.1.5 will make the document (eventually, after navigation) active.
It looks like this is a bug? (Edit: see https://github.com/whatwg/html/issues/1454#issuecomment-242263937; probably we should just fix the spec to make sure these are always in sync.)
Consider the diagram at the bottom of page 15, and consider traversing by +4. From what I understand, the algorithm will have an entry sequence of 3, 4, 5, 6; if we label the browsing contexts (from top to bottom) as A, B, C, then 3 and 4 are in B and 5 and 6 are in C. Step 6.1 will execute 4 times:
The double-unload bug, plus general clarity (e.g. steps 6.1.2 being related only to browsing contexts, not entries) is why I am suggesting that maybe instead of looping through 4 times, we should loop through 2 times. I.e., write spec text so that something like this happens:
As I said in the initial comment, one thing to remember here is that also dynamic changes to the browsing context tree should be defined somehow - how such affect to session history. webkit-based (so including also blink) don't really handle those in any sane way and others are somewhat reasonable.
I haven't yet read https://arxiv.org/abs/1608.05444 properly and but it doesn't seem tackle that issue.
And about joint history. Currently per spec it is bound to browsing contexts, but that is not what browsers do. session history is bound to top level browsing context only, since all the other contexts come and go, yet the history usually stays alive. In the paper I don't see examples where one does session history load between top level pages which containing iframes: first load some top level page, then in create more session history entries by loading pages in some its iframes, then do a top level page load to another page containing different set of sub browsing context and there do loads in iframes and then go back to some state in the previous top level page + certain page in iframe. One hard part is to map the existing session history tree to the browsing context tree, since browsing context tree doesn't stay alive (except in bfcache case)
It looks like this is a bug?
My above analysis is a bit flawed because I seem to have forgotten what fully active meant. There is still a bug, I think: the paper assumes that simply updating the current entry will change what document is considered active, while the spec seems to do explicit updates (currently inside traverse the history). Probably we should change the spec to redefine active document to simply be the document of the current entry.
Let me edit the above, as this changes which of steps 6.1.4 vs. 6.1.5 execute. I think idea about restructuring the algorithm to be per-browsing context instead of per entry in the entry sequence still stands.
@smaug----
As I said in the initial comment, one thing to remember here is that also dynamic changes to the browsing context tree should be defined somehow - how such affect to session history. webkit-based (so including also blink) don't really handle those in any sane way and others are somewhat reasonable.
What do you see as wrong with the spec's current definition? I can imagine there are bugs in WebKit and Blink, as the paper points out, but I am curious what you see wrong with the spec.
And about joint history. Currently per spec it is bound to browsing contexts, but that is not what browsers do. session history is bound to top level browsing context only, since all the other contexts come and go, yet the history usually stays alive.
That's not true; joint session history is bound to a top-level browsing context by the spec. Each browsing context has a session history which contributes to the joint session history, and as you say, those come and go as the browsing contexts do.
In the paper I don't see examples where one does session history load between top level pages which containing iframes: first load some top level page, then in create more session history entries by loading pages in some its iframes, then do a top level page load to another page containing different set of sub browsing context and there do loads in iframes and then go back to some state in the previous top level page + certain page in iframe.
Counterexample 2 appears to be an example of this.
One hard part is to map the existing session history tree to the browsing context tree, since browsing context tree doesn't stay alive (except in bfcache case)
I'm not sure what you mean by this, but in the spec, I think the model is that the browsing context tree stays alive until you close the tab/remove the iframe/etc. Documents do not necessarily stay alive, and the spec talks in detail about how to replace them once they've disappeared.
What do you see as wrong with the spec's current definition? I can imagine there are bugs in WebKit and Blink, as the paper points out, but I am curious what you see wrong with the spec.
The spec doesn't define this at all. But this is related to the following too:
I'm not sure what you mean by this, but in the spec, I think the model is that the browsing context tree stays alive until you close the tab/remove the iframe/etc.
What in the spec keeps the sub-browsing contexts alive when one navigates away form a page containing iframes? (unless the page goes to the bfcache, browsers do destroy the document tree and remove iframes from document obviously, and destroy sub-browsing contexts.)
That's not true; joint session history is bound to a top-level browsing context by the spec. Each browsing context has a session history which contributes to the joint session history, and as you say, those come and go as the browsing contexts do.
I mean, sub-browsing context don't really have any session history. It is all handled in the top level one. There is no "joint". There is just one session history, which is list of trees and handled by the top level browsing context.
The spec doesn't define this at all.
Why do you say that? It all seems defined to me. Can you outline a scenario that you don't think the spec covers?
What in the spec keeps the sub-browsing contexts alive when one navigates away form a page containing iframes?
Ah, I think I see the problem here. There should be a statement in https://html.spec.whatwg.org/#garbage-collection-and-browsing-contexts saying that a browsing context has a strong reference to its children. (Maybe there is an indirect one I am missing.)
I mean, sub-browsing context don't really have any session history. It is all handled in the top level one. There is no "joint". There is just one session history, which is list of trees and handled by the top level browsing context.
I believe what you're describing is a model that is isomorphic to the spec's model. However, given the garbage collection confusion above, I can see how your model might be more straightforward, as then we could probably let the UA garbage-collect navigated-away-from child browsing contexts.
Anyway, still looking forward to hearing from @ConnorGBrewster and @asajeffrey on a reply to my https://github.com/whatwg/html/issues/1454#issuecomment-242177806
I can see how your model
"your model", the model browsers actually implement ;)
Consider the diagram at the bottom of page 15, and consider traversing by +4. From what I understand, the algorithm will have an entry sequence of 3, 4, 5, 6; if we label the browsing contexts (from top to bottom) as A, B, C, then 3 and 4 are in B and 5 and 6 are in C. Step 6.1 will execute 4 times:
I see the issue here, and our suggestions do need to be edited here. Basically we don't want to:
If the browsing context is going to change its active document, it should prompt to unload and unload the current document only once and then activate the new specified entry (skipping any documents between the current active document and the last one that needs to be activated). I hope this makes sense.
About the frame tree changing as documents are unloaded and parts of the frame tree are removed, this is something we are currently trying to figure out the best way to do. In Servo, we don't ever unload any documents that are still in the session history, but we are working on adding this. In our Servo implementation we also recalculate the joint session history whenever a traversal is requested or whenever history length is requested. This means that if parts of the frame tree are removed, it will not cause any issues with the joint session history.
Right, if browsing context tree was there all the time, this all would be quite a bit simpler. But it can't be there, that would use tons of memory.
When we have session history entry tree (well, list of them) separate from browsing context tree, and going back to a page which had iframe, mapping session history entries to iframes is something to define. And if each time a page is loaded, different iframes are created, that could lead to rather odd bugs.
Gecko used to have https://bugzilla.mozilla.org/show_bug.cgi?id=462076 (and I'm pretty sure that bug is still in all the browsers if server side does something nasty to iframes). http://mozilla.pettay.fi/moztests/history6/index-disable-bfcache.html is an example of that case. Load the page and then load some other top level page, and press back. One can't really use the entries from session history in that case, and Gecko at least specially marks 'dynamic' (== JS) iframe additions/removals, and just does normal load for those iframes. http://mozilla.pettay.fi/moztests/history2/Start.html shows that webkit/blink do something different with dynamic changes than Gecko/IE/Presto, since webkit/blink allow loading session history entries to iframes which never had such url loaded. (though, as I said, I wouldn't be surprised if using server side stuff, one could see similar bug also in non-webkit/blink engines)
@smaug---- this makes sense. Like I was saying, with the Servo implementation we can remove entire chunks of the session history out without any issues. This comes in handy in scenarios where pages that contained iframes are removed from memory, the iframe session history will also be removed. I am not sure if this is something that should be defined by the spec or if it should be up to the UA how to handle it.
Back from vacation now, reading through this thread. I should have something more concrete to say tomorrow...
FWIW, I agree that the problem @smaug raised in https://github.com/whatwg/html/issues/1454#issuecomment-242215425 (how to map the existing session history tree to the browsing context tree) is 1) of practical importance and 2) hard to get right.
The problem is of practical importance, because it seems rather impractical (in terms of memory footprint) to keep “sub-browsing contexts alive when one navigates away from a page containing iframes”. Additional scenario to consider is streamlining the user experience of updating a browser to apply latest security fixes - if a browser is not able to restore (some) browsing contexts then the user might be incentivised to keep running an old, vulnerable version of the browser until forced to update/reboot at a later point. In both of these example scenarios the old browsing contexts are gone (because of memory pressure and/or because of browser restart during the update process) and therefore as @smaug notes the browser needs to somehow map the session history tree to the (new, freshly loaded) browsing context tree. A simple to follow example that illustrates the need for the mapping (assuming discarded old browsing context tree) can be found at https://bugs.chromium.org/p/chromium/issues/detail?id=500260#c20.
AFAIK, WebKit and Blink give each frame a “unique name” derived from 1) browsing context name (aka window.name) if present and otherwise 2) use the position of the frame in the frame tree (+ some extra processing to ensure no 2 frames have the same unique names). When restoring session history, “unique name” is used to map 1) frames in the freshly loaded browsing context tree to 2) session history entries.
One nice property of “unique name” is that it is stable across page loads, as long as the web page author keeps using the same window.name (i.e. “unique name” won’t change if the frame’s classname or id attributes change, or if another frame is inserted into or deleted from the DOM) - this increases the likelihood that session restore will succeed even if the html document has been slightly tweaked since the history entry was generated (it is probably important to note here that some users keep their browser opened for a long time - this means having to deal with session history entries that might have been created multiple days or weeks ago). Another nice property of “unique name” is that it is able to identify dynamically created frames (I was a little surprised to learn that “Gecko tries to not load anything from session history to browsing contexts which were originally dynamically (==JS) created and then recreated.”).
I am not very familiar with Gecko, but I hear that “Gecko keeps the session history for iframe in the previous top level pages when the iframe hasn't been added using JS, in other words parser created iframes. Identifying such "static" iframes is based on their location in DOM.” (based on https://bugs.chromium.org/p/chromium/issues/detail?id=500260#c26).
The problem is hard, because (AFAICT) tracking frame identity across page reauthorings and reloads is not possible for all cases and it will always boil down to a set of heuristics. Still, having some guidance in the spec would go a long way toward 1) unifying the heuristics used across different browsers and 2) helping web developers make sure that their web pages are compatible with these heuristics (e.g. suggesting that session history might not work if a frame is not named OR if a frame is dynamically created). Some specific examples of the heuristics 1) breaking down and/or 2) disagreeing across browsers, can be found in https://bugs.chromium.org/p/chromium/issues/detail?id=500260#c24
The problem is hard, because (AFAICT) tracking frame identity across page reauthorings and reloads is not possible for all cases and it will always boil down to a set of heuristics.
I wouldn't say it is necessarily heuristics. The model Gecko for example has is quite simple and has worked reasonably well for 6 years. And the session restore can still restore session history quite well after restarting the browser. Also, the model doesn't suffer from the same issues as webkit/blink (mentioned in initial comment). I've asked feedback from MS about the session history implementation in IE/Edge, since traditionally IE has had the least buggy implementation, at least in my tests, and that is why I tried to make Gecko to behave somewhat similarly to IE (6 years ago). But of course, there are cases where IE/Edge and Gecko do behave differently.
Hmm. I see what you mean about document unloading/reloading, this was something we brushed under the carpet in the arxiv paper. A simplistic approach would be on unloading just to remember the frame tree, then treat reloading as losing all the session history for all the iframes in the document, and don't try to recreate frame identity. This is at least simple, although it has odd effects like making traversing the session history change the jsh length.
IRC conversation with @domenic: http://logs.glob.uno/?c=freenode%23whatwg&s=21+Dec+2016&e=21+Dec+2016#c1015263
The discussion was about the timestamp used implicitly in https://html.spec.whatwg.org/multipage/browsers.html#joint-session-history where it says:
Entries in the joint session history are ordered chronologically by the time they were added to their respective session histories.
When a document is reloaded (for example by it being navigated away from, then discarded, then traversed to) the timestamp shouldn't be updated, otherwise we can get into a situation where the joint session history order disagrees with the session history order.
To push this forward, I'm trying to figure out the behavioral differences between browsers. Could folks from Edge / Chrome / Safari provide me some comments? https://docs.google.com/document/d/1PDMf2NEaxf94ap26SREcM3DaZjk6dvq0jfsg983SPx0/edit?usp=sharing
@travisleithead @hober @csreis hey, it'd be really great to get feedback on https://github.com/whatwg/html/issues/1454#issuecomment-341032421. We'd really like to work on a more interoperable history story (and also test suite) and @freesamael is prepared to work on that, but needs some guidance on direction.
An update is that @cbrewster reimplemented Servo's session history to be a list-of-tree-diffs, which might be a good way to bridge the spec and implementations.
The gore is https://github.com/servo/servo/commit/f3d068f58343ff655d28bfa9080e98b5cbf25d5f, but the idea is that each top-level browsing context stores a list of diffs, where each diff is a triple of the browsing context id, its old value, and its new value. Each value is either the load data (for documents which have been evicted from the bfcache) or a document id (for documents which are still in the bfcache).
The implementation doesn't handle pushState
or replaceState
yet, hopefully that's "just" a new kind of diff.
Can you say more about the difference between that model and the model from your previous paper?
I also asked @smaug---- a while back for web platform tests which indicate the divergence between the spec and implementations (or between the spec and the arXiv paper). @smaug----, have you made any progress on that? If so, we may be able to spend some time as an intern project this summer fixing the spec.
I haven't had time for writing tests. I will need to write something for shadow DOM, even though they will be hypothetical in the sense that iframe handling in shadow DOM is largely unspec'ed.
FWIW, Gecko does internally sort-of look for diffs so that it can remove duplicated entries in the list-of-trees when some iframes have been removed by scripts. But not sure that is similar diff calculation to Servo's.
The previous model, from the paper, had each browsing context store its own session history entries. The joint session history would take all descendant browsing contexts and merge the session histories together (using a timestamp on each entry). This method worked well because it could easily handle changes to the tree (i.e. removing an iframe) as the joint session history is recalculated lazily every time it is needed. Unfortunately, this method was complex to implement and was making the implementation of the History state API very difficult and needlessly complex.
The new method is very similar to the list-of-trees method that Gecko uses (explained in https://docs.google.com/document/d/1PDMf2NEaxf94ap26SREcM3DaZjk6dvq0jfsg983SPx0/edit). Except we only store the currently active tree and a list of diffs to move between trees to either traverse forward or backward in the joint session history. With this method, each browsing context no longer has its own session history, there is only a single joint session history per top-level browsing context.
The advantage of this method is we don't need to store entire trees and in the case of iframe removals, we only need to look through the diffs that applied to the iframe's browsing context and remove them.
I have not implemented the History state API with this new session history implementation yet, but in theory this would add a new kind of diff to the tree that changed the active history state.
What I'm hoping is that the model is the same, and this is an implementation strategy for it. One nice thing is that this implementation bridges the gap to the list-of-trees model, but doesn't require any deep cloning of trees.
One interesting thing that came out of this is a test @cbrewster wrote https://github.com/cbrewster/servo/blob/787ec4b2093ca1402537e04df572984808e5fa10/tests/wpt/web-platform-tests/html/browsers/history/joint-session-history/joint-session-history-remove-iframe.html which removes an iframe from its parent and checks to see if this immediately updates history.length
. AFAICT the spec says it should, and FF agrees, but Chrome and Safari don't.
At least Gecko and Blink use a list of trees to represent session history, not a joint list of session histories in browsing contexts (since as far as I see, the spec way to define this all can't handle iframes in the unloaded, yet in session history, pages properly).
http://searchfox.org/mozilla-central/source/docshell/shistory/nsISHTransaction.idl https://cs.chromium.org/chromium/src/content/renderer/history_controller.h?l=59
Dynamic changes to the dom where iframes are added or removed are still a tricky case. Gecko and IE/Edge might handle them a bit better than Blink/webkit as http://mozilla.pettay.fi/moztests/history2/Start.html hints (blink ends up loading an url from session history to an iframe which never had that url loaded before).