WICG / scroll-to-text-fragment

Proposal to allow specifying a text snippet in a URL fragment
Other
586 stars 43 forks source link

The spec should be more explicit about how text directives interact with history #217

Closed bokand closed 1 year ago

bokand commented 1 year ago

Currently, history entries do include the fragment directive portion of a fragment, the stripping algorithm is applied only when the Document is made active. The history entry's URL remains unstripped. This means back/forward navigation includes navigations to a changed fragment directive (e.g. navigating between highlights within a single document).

This seemed like the right behavior at the time but it's worth discussing the edge cases. We should also be more explicit about this in spec-text, at least with examples. Some things to consider:

annevk commented 1 year ago

cc @jakearchibald @domenic

annevk commented 1 year ago

Would it make sense to continue stripping it from the URL's fragment, but store the state in the session history entry, similar to scroll restoration mode?

So it would impact reload, the back button, etc., similar to normal fragments, but the state is kept to the side for consistency.

jakearchibald commented 1 year ago

@bokand

This means back/forward navigation includes navigations to a changed fragment directive (e.g. navigating between highlights within a single document).

That doesn't seem to be how implementations work. I'm not familiar with this feature, so I don't know if I'm seeing browser bugs or intended behaviour.

Test case:

https://output.jsbin.com/sejerip/2/quiet

  1. Click "hello highlight"
    • Chrome: Hello highlights, hashchange, https://output.jsbin.com/sejerip/2/quiet
    • Safari: Hello highlights with animation, hashchange, https://output.jsbin.com/sejerip/2/quiet# (note URL difference)
  2. Click "world highlight"
    • Chrome: World highlights, hello still highlighted, hashchange, https://output.jsbin.com/sejerip/2/quiet
    • Safari: World highlights with animation, hello still highlighted, hashchange, https://output.jsbin.com/sejerip/2/quiet#
  3. Traverse back
    • Chrome: Both still highlighted, hashchange, https://output.jsbin.com/sejerip/2/quiet
    • Safari: World highlight animation replays(??), hello still highlighted, hashchange, https://output.jsbin.com/sejerip/2/quiet#
  4. Click "#foo"
    • Chrome: Both still highlighted, hashchange, https://output.jsbin.com/sejerip/2/quiet#foo
    • Safari: Both still highlighted, hashchange, https://output.jsbin.com/sejerip/2/quiet#foo
  5. Traverse back
    • Chrome: Both still highlighted, hashchange, https://output.jsbin.com/sejerip/2/quiet
    • Safari: Both still highlighted (no animations replay), hashchange, https://output.jsbin.com/sejerip/2/quiet#
  6. Reload
    • Chrome: Nothing highlighted, https://output.jsbin.com/sejerip/2/quiet
    • Safari: Nothing highlighted, https://output.jsbin.com/sejerip/2/quiet (note the URL change)

Another test:

  1. Click "hello highlight"
    • Chrome: Hello highlights, hashchange, https://output.jsbin.com/sejerip/2/quiet
    • Safari: Hello highlights with animation, hashchange, https://output.jsbin.com/sejerip/2/quiet# (note URL difference)
  2. Click "world highlight"
    • Chrome: World highlights, hello still highlighted, hashchange, https://output.jsbin.com/sejerip/2/quiet
    • Safari: World highlights with animation, hello still highlighted, hashchange, https://output.jsbin.com/sejerip/2/quiet#
  3. Reload
    • Chrome: Nothing highlighted, https://output.jsbin.com/sejerip/2/quiet
    • Safari: World highlighted, https://output.jsbin.com/sejerip/2/quiet

Soooo there's a lot of inconsistency and bugs here.


Firstly, I don't think hashchange should fire if the fragment hasn't changed in terms of what location sees (spec).

In terms of the rest, there are three places we could put the data, and each behaves differently:

I don't think we should store the state in the URL, although this would be equivalent to "history entry", just more hacky.

Document

This seems to match Chrome's current behaviour. Although, I question why these navigations create new history entries when there's no difference between the two history entries.

Document state

Again, I'd question why these navigations create new history entries when there's no difference between the two history entries.

History entry

When navigating to "#foo" in either a push/replace, it could copy the highlight state from the previous/to-replace entry. I think that's similar to how the navigation API passes its state around.


Which to go with depends on the desired behaviour.

Putting the state on history entry seems best, and I agree with @annevk's comparison to scroll state. Although, it's pretty different to current implementations.

jakearchibald commented 1 year ago

One thing to think about is what to do with failed matches. Eg:

  1. UA navigates to #:~:text=hello, but there's no match for "hello" in the document.
  2. UA navigates to #:~:text=world, there's no match for "world", but there is now a match for "hello".

Or:

  1. UA navigates to #:~:text=hello, but there's no match for "hello" in the document.
  2. UA navigates to #foo, and there is now a match for "hello" (but both history entries have the same highlight state).

Or:

  1. UA navigates to #:~:text=hello, but there's no match for "hello" in the document.
  2. UA navigates to #:~:text=world, and there's a match for "world" and "Hello".

Or:

  1. UA navigates to #:~:text=hello, but there's no match for "hello" in the document.
  2. UA navigates to #:~:text=world, and there's a match for "world" but not "hello".
  3. UA traverses back, and there's now a match for "hello".

Or:

  1. UA navigates to #:~:text=hello, and there's a match for "hello" in the document.
  2. UA navigates to #foo, and there is now an earlier match for "hello" in the document.

The behaviour that comes out would be answered by the following:

bokand commented 1 year ago

Thanks Jake for the thorough analysis! This is an area where I didn't think through the model in the detail I should have - better late than never though...

One thing that I think is missing in your demo is scroll. Chrome's behavior (bug?) is that it doesn't remove highlights when moving to a different text directive but it does scroll, if you move your "hello" and "world" <p>s to require scrolling, navigating back/forward will indeed scroll between them (though, I'd have to confirm in Chrome's case whether that's happening due to scroll state being restored or if we are actually re-processing the directive).

I've run out of time today but I'll give these cases a deeper think tomorrow.

jakearchibald commented 1 year ago

Ugh, I realised the scroll thing and intended to edit my comment. I guess I forgot. Yeah, it's just restoring scroll position rather than scrolling to the highlight. I think that's reasonable behaviour, but I'm still kinda surprised the highlight persists when going back.

annevk commented 1 year ago

I think the highlight makes sense. It's the same as if you went back to a fragment navigation and the page uses :target.

jakearchibald commented 1 year ago

Hmm, are you sure?

https://output.jsbin.com/sejerip/2/quiet - if you click "hello highlight" then "world highlight" you end up with both highlighted. If you then click "#foo", both are still highlighted. That isn't how :target works.

I'd prefer a model more like :target for consistency, but I assume there's some reason both implementations went for additive highlighting?

annevk commented 1 year ago

Oh. Hah! I'm not sure there's a good reason for the existing behavior. I agree with you.

bokand commented 1 year ago

https://output.jsbin.com/sejerip/2/quiet - if you click "hello highlight" then "world highlight" you end up with both highlighted. If you then click "#foo", both are still highlighted. That isn't how :target works.

The behavior differs if I change the text directive directly from the address bar - in that case the first highlight gets removed. Navigating through history it seems like each highlight gets 2-3 entries...and then the highlights are additive.

In Chrome's case, I believe the additive behavior wasn't intentional - I suspect the same for Safari.

Would it make sense to continue stripping it from the URL's fragment, but store the state in the session history entry, similar to scroll restoration mode?

sgtm, I'll try this approach, to summarize:

I believe this also means that we'll no longer fire hashchange for changes in the fragment directive, since that happens as part by comparing the URL on the history entries.

One thing to think about is what to do with failed matches. Eg:

This model wouldn't distinguish between failed/succeeded - the fragment directive is stored on an entry either way.

jakearchibald commented 1 year ago
  • Process the fragment directive when traversing history but don't cause scroll to the highlight. Scroll position will come from the saved scroll state (or avoided if scroll restoration is manual) - this matches regular fragment navigation behavior.
  • Processing a new fragment directive will remove existing highlights
  • Writing a new fragment will preserve/clone the current fragment directive (i.e. so navigating to #foo won't remove existing highlights, but navigating to #foo:~:differentdirective will)

One case to think about:

  1. Navigate to #:~:text=world (page does not contain 'world')
  2. Add 'world' to page.
  3. Navigate to #foo.
  4. pushState({}, '', '/hello').

It feels like the scroll-to behaviour should only happen during the navigation where the directive is stripped from the URL.

Also, what's scrolled-to when navigating to #foo:~:text=world?

jakearchibald commented 1 year ago

https://jsbin.com/vohimek/3/quiet - here's another case that's different between Chrome & Safari.

bokand commented 1 year ago
  1. Navigate to #:~:text=world (page does not contain 'world')
  2. Add 'world' to page.
  3. Navigate to #foo.
  4. pushState({}, '', '/hello').

Does 'world' highlight on step 3? Does it scroll to #foo or 'world'? Does it scroll to 'world' on step 4?

In the model I described, The history entries created at step 1, 3, and 4 all have "text=world" as the fragment directive but they will not cause a highlight or scroll to 'world' because step 1 performed the search and the fragment directive didn't change in any of the subsequent steps.

If you add

  1. Navigate to #bar:~:anything
  2. history.back()

This performs a new search for 'world' because the fragment directive changes from 'anything' to 'text=world'.

Also, what's scrolled-to when navigating to #foo:~:text=world

This will highlight and scroll to 'world' if it's found. If it's not found, it'll scroll to #foo

https://jsbin.com/vohimek/3/quiet - here's another case that's different between Chrome & Safari.

Thanks! Looks like Chrome paints the highlight node-by-node - find-in-page has the same issue. I suspect Safari paints highlights as a range so the newly replaced text gets highlighted as it's still in the range.

The highlighting/indication mechanism itself is mostly left to UA discretion but it's a good question as to what should happen when text changes. Filed #224

jakearchibald commented 1 year ago

Had a quick chat with @bokand and one detail became clearer (unless I misunderstood @bokand): the highlight change will only be processed on navigation/traversal if the to-highlight value has changed.

Although, that may still cause some issues.

  1. Same-document navigate to /page-1#:~:text=world, which is updated to contain 'world' somewhere, which is highlighted.
  2. Same-document navigate to /page-2#:~:text=hello, which is updated to contain 'hello' somewhere, which is highlighted.
  3. Same-document navigate to /page-3#:~:text=world, which is updated to contain 'world' somewhere, which is highlighted.

If I understand correctly, if I now traverse go(-2) (which is same-document), nothing is highlighted. This is because the old 'world' node is gone, and a new 'world' node is introduced, but highlighting isn't processed again because the to-highlight value hasn't changed.

bokand commented 1 year ago

In this example, is it the case it's all same-document because the entries mutate the URL via pushState?

Yes, I think you're right. I was assuming different (pre-fragment) URL would populate a new document which will reprocess the directive. WDYT of treating a changed pre-fragment URL the same as cross-document and processing the directive (and clearing it if the new URL didn't include a directive)?

I'm trying to avoid re-processing on fragment-only changes since

a) the text search is expensive b) in the common case, page content didn't change c) since the directive is preserved across fragment-only changes, it could cause weird behavior:

You might ask why we want to preserve the directive across fragment-only changes - this is intentional as we commonly came across pages that write to their fragment during loading, e.g.

We don't want this to clear/cancel highlighting. This does mean that history traversal on a page that uses the fragment for routing won't restore highlights but I think that's fine if it means the initial navigation can more reliably show the highlight.

bokand commented 1 year ago

To give an example of what I'm proposing:

Navigate to History Entry Update Highlights
foo.html#:~:text=hello A: foo.html directive: text=hello *
foo.html#bar B: foo.html#bar directive: text=hello
replaceState bar.html B: bar.html directive: text=hello
pushState bar.html#bar C: bar.html#bar directive: text=hello
pushState new.html D: new.html directive: *
pushState new.html#:~:text=hello E: new.html directive: text=hello *

When traversing, we'll update highlights if:

jakearchibald commented 1 year ago

In this example, is it the case it's all same-document because the entries mutate the URL via pushState?

Yeah, or via the navigation API.

I'm trying to avoid re-processing on fragment-only changes

What if the history entry had a "highlight state", and the "highlight state" had a "highlight string".

Then, highlighting is recomputed if the "highlight state" is different between history entries.

The same "highlight state" can be reference by multiple history entries (eg, it's copied when navigating same-document).

Two different "highlight states" can have the same "highlight string".

Taking a similar example from before:

  1. Same-document navigate to /page-1#:~:text=world, which is updated to contain 'world' somewhere, which is highlighted. This triggers highlight because the "highlight state" is new, therefore different to the previous entry.
  2. Same-document navigate to /page-2#:~:text=hello, which is updated to contain 'hello' somewhere, which is highlighted. This triggers highlight because the "highlight state" is new, therefore different to the previous entry.
  3. Same-document navigate to /page-3#:~:text=world, which is updated to contain 'world' somewhere, which is highlighted. This triggers highlight because the "highlight state" is new, therefore different to the previous entry.
  4. Same-document navigate to #foo. The "highlight state" from the previous entry is also referenced on this new entry. It isn't different from the previous entry, so it doesn't trigger highlight.

Therefore:

bokand commented 1 year ago

The same "highlight state" can be reference by multiple history entries (eg, it's copied when navigating same-document).

Are you suggesting doing this for all same-document navigations? e.g.

  1. Navigate to /page-1#:~:text=world which contains "world" and is highlighted
  2. Same-document navigate to /page-2 which contains a different "world" but is not highlighted because its the same highlight state
  3. Same-document navigate to /page-3#:~:text=hello which contains "hello" and is highlighted

Now if you:

I think the highlight state is a better framing than comparing URLs, but perhaps we'd only re-use a state if only the (directive-stripped) fragment changed (which I think makes it behaviorally equivalent to my approach)?

But also: does the spec have a notion of "references"? You say "it's copied when navigating same-document" but then you'd have two structs with the same data - presumably you mean that the history entry points to the previous entry's highlight state but can we do that in spec language? @annevk?

jakearchibald commented 1 year ago

Are you suggesting doing this for all same-document navigations? e.g.

Yeah, what you describe sounds like what I was thinking.

But also: does the spec have a notion of "references"? You say "it's copied when navigating same-document" but then you'd have two structs with the same data - presumably you mean that the history entry points to the previous entry's highlight state but can we do that in spec language?

Yeah, history entries also have a "document state" which can be shared by multiple history entries (entries that should use the same document).

bokand commented 1 year ago

Yeah, what you describe sounds like what I was thinking.

Isn't the back behavior I described a bit weird though? In that example, we'll end up traversing back seeing different highlights than we'd seen previously.

My understanding is that these navigation are "same-document" but are conceptually a new app/page state. It seems wrong to me for them to share highlight state.

Yeah, history entries also have a "document state" which can be shared by multiple history entries (entries that should use the same document).

Oh yeah! Ok, sgtm then, thanks!

jakearchibald commented 1 year ago

Isn't the back behavior I described a bit weird though?

Yeah. Ugh.

Your idea of reperforming the highlight if the document is the same but the URL (ignoring the fragment) is different, and the to-highlight text has changed, sounds best.