Scroll To Text Fragment

nickburris commented 5 years ago

Request for Mozilla Position on an Emerging Web Specification

Specification Title: Scroll To Text Fragment
Specification or proposal URL: https://github.com/WICG/ScrollToTextFragment

Other information

https://github.com/w3ctag/design-reviews/issues/392

annevk commented 5 years ago

I guess it would help to summarize some of the issues I've seen:

Unclear processing model. It seems like this can be done without modifying the URL parser, but what exactly the processing model should be is not clear.
There are some risks with letting an attacker scroll a user to a chosen position and it's not clear the proposal goes to the fullest extent to eliminate these, e.g., https://github.com/w3ctag/design-reviews/issues/392#issuecomment-543148916.
The bits of the processing model I do understand mean that this needs to be stripped from the URL at some point to preserve user privacy (though the website can still look at the scroll offset to determine things). This creates a risk for non-implementing user agents and makes the new syntax less useful as a general way of extending fragments.
If this effectively triggers window.find() it would have been nice to clean up that first so they can use a shared primitive: https://github.com/whatwg/html/issues/3539.

(Also, if that highlight API comes off the ground I suppose it better be isolated from these highlights somehow.)

Overall though it seems like a useful feature to have.

domenic commented 5 years ago

Unclear processing model. It seems like this can be done without modifying the URL parser, but what exactly the processing model should be is not clear.

FWIW in my review of the spec I found the processing model fairly clear and precise. The entry points that you might find most clarifying are:

That said, it does appear that the draft spec does currently modify the URL parser/serializer. But the way in which it does so seems unobservable to any part of the ecosystem that doesn't specifically look at the "URL's fragment directive" concept. So I think it could be refactored transparently (i.e. with no testable effects) to put that parsing entirely in HTML (in particular in the first link above).

Edit: I see that there is an issue filed to perform that refactoring, at https://github.com/WICG/ScrollToTextFragment/issues/60.

fred-wang commented 5 years ago

4. If this effectively triggers `window.find()` it would have been nice to clean up that first so they can use a shared primitive: [whatwg/html#3539](https://github.com/whatwg/html/issues/3539).

I agree. I opened https://github.com/WICG/ScrollToTextFragment/issues/66

bokand commented 4 years ago

Update:

I think we've addressed points 1 and 2 (the latter in a yet to be merged PR https://github.com/WICG/ScrollToTextFragment/pull/70). Here's how the spec looks with the latter PR merged.

I agree with point 4 but I think this can be done as part of some cleanup at the time we'd merge this into the HTML spec. I don't think it's an issue with the feature itself or hinders interop for following implementations.

Re: point 3:

In addition to privacy, stripping the directive from the URL prevents the directive from interacting with author script in unexpected ways (see issues raised in https://github.com/WICG/ScrollToTextFragment/issues/15).

It's true that there is some risk here to non-implementing UAs. We believe the risk is low. Since the delimiter appears inside the fragment, a non-implementing UA will interpret it as a non-matching element-id fragment. In most cases this means such URLs will load at the top of the page which is a graceful fallback.
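To make the fallback concrete, here is a minimal Python sketch of how a fragment splits at the `:~:` delimiter. The helper name `split_fragment_directive` and the example URL are made up for illustration, and this is a plain string split rather than the spec's algorithm.

```python
from urllib.parse import urlsplit

DELIMITER = ":~:"  # fragment directive delimiter discussed in this thread


def split_fragment_directive(url):
    """Return (fragment, directive) for a URL; directive is None if absent.

    Hypothetical helper for illustration only: it splits the raw fragment
    at the first occurrence of the delimiter.
    """
    fragment = urlsplit(url).fragment
    if DELIMITER not in fragment:
        return fragment, None
    before, _, after = fragment.partition(DELIMITER)
    return before, after


url = "https://example.org/page#section:~:text=hello"
# An implementing UA would separate the directive from the element fragment.
print(split_fragment_directive(url))
# A non-implementing UA sees the whole fragment as a single element id.
print(urlsplit(url).fragment)
```

In the second case the whole string is treated as an id that most likely matches nothing, which is why the page would load at the top.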

The one concern is pages that use the fragment for state. We've found pages like this that interact badly with the directive to be rare and think the benefit outweighs the risk.

As a (small) mitigation we could add a note in the spec/explainer/outreach to recommend users and tools avoid amending URLs that already have a fragment for now as well as feature detecting (where possible) whether the UA supports directives.

WDYT?

bokand commented 4 years ago

Post US-holiday ping.

Would it be possible to get an official Mozilla opinion on this feature?

bokand commented 4 years ago

There's been some discussion on twitter so I'm going to consolidate it here so it doesn't get missed:

On why we invented a new thing rather than implement fragmention (which is basically interpreting the fragment as the search string):

fragmention doesn't work where the page uses fragment-based routing. Users might want to link to such pages and we can't tell a priori if it will work or not. It'd be confusing why some pages don't work.
We initially started with a simple fragmention-like approach. However, after lots of trials with Google Search we found a naive text string doesn't work consistently in those use cases. Notably, there are issues with ambiguity (particularly in cases like trying to select a table column/row header or dates), and non-contiguous passages that are broken by things like images. I'll see if I can get these results published publicly. We received the same feedback from the WebAnnotations community in the HackerNews post and then again in our issue tracker.
Privacy. It makes sense to hide the text query string from even the destination page. Without the :~: syntax and fragment stripping, the page would be able to infer sensitive information e.g. a user's search terms.

bokand commented 4 years ago

I'd also point out that we've engaged and taken feedback from several experts in this space ( https://github.com/WICG/ScrollToTextFragment/issues/4 is a good example). That issue has posts from the author of fragmention, founder of hypothes.is and others (I believe related to web-annotations).

IMHO, we've been (and will continue to be) openly receptive and grateful for the feedback we've gotten as it definitely improved the outcome - we didn't build this in a vacuum.

bzbarsky commented 4 years ago

Just to be clear, no one is saying you built this in a vacuum; we're just trying to evaluate the proposal and trying to gather as much information as we can so the evaluation can be informed by that.

bzbarsky commented 4 years ago

https://bugs.chromium.org/p/chromium/issues/detail?id=961440 is relevant here as an example of https://github.com/mozilla/standards-positions/issues/194#issuecomment-566671352 item 1 -- webmd has issues when it sees a fragment that it does not control.

dbaron commented 4 years ago

My high-level opinion here is that this is a really valuable feature, but it might also be one where all of the possible solutions have major issues/problems. So I think the question we should think about is how the problems of the solution chosen here compare to the problems of other options and how they compare to the value of the feature.

bzbarsky commented 4 years ago

@tantek and I looked through:

and various issues on the ScrollToTextFragment repo. We have various concerns about this standards proposal.

Some of the problems it has (e.g. lack of clear processing model) are shared by https://github.com/mozilla/standards-positions/issues/234 as well.

Also, both proposals fail to address some use cases (e.g. highlighting the entirety of a particular row on https://www.x-rates.com/table/?from=USD&amount=1) that came up during our discussions about this. Those use cases might be better-served by some other XPointer-like mechanism, possibly; more investigation is needed there.

A draft position we are proposing:

Summary position: “non-harmful” bordering on harmful (could be convinced)

There are various use cases that this sort of API could address:

Search engines linking to the searched-for text,
Users bookmarking (and sharing) intra-document positions that don’t have pre-provided anchors.
Marginalia being associated with the right part of the text.
Citations.

Our general feeling is that some of these use cases are very important to address, but that the specific proposed solution of ScrollToTextFragment is highly biased towards the “search engines” use case and has some aspects that are harmful to the future development of the web, insofar as they encourage the creation of fragile links (so either “harmful” overall or averaged with “worth prototyping” to a summary position of perhaps “non-harmful”). We think it’s worth prototyping this proposal with the harmful aspects removed.

Specific notes we had:

The :~: prefix and the stripping of the fragment specifier seem necessary to deal with fragment-routing-based websites. The WebMD case described in https://bugs.chromium.org/p/chromium/issues/detail?id=961440 is a strong indication that such sites exist that users would want to use this functionality with. While it looks ugly, we judge this part worth prototyping on balance.
The marginalia use case described at https://indieweb.org/marginalia is not supported as well as it could be, since the fragment specifier is stripped from the URL. That significantly limits the ability of sites to show appropriate marginalia UX when the user loads a link to a specific text fragment. It could probably be addressed by exposing the fragment specifier via the FragmentDirective, modulo whatever the security considerations are with exposing this information to the page. As far as we can tell, those security considerations seem to be entirely driven by the search engine use case, and impose unnecessary constraints on the other use cases. That said, exposing the fragment specifier could be added in a V2 of the API; perhaps along with allowing URLs to opt out of having it exposed. That would need to be planned for in V1 syntax, possibly.
The capability to include multiple fragment specifiers does not seem to be clearly motivated by use cases and represents significant complexity. Apart from the complexity it introduces it’s probably “non-harmful”, though the fact that each fragment specifier effectively requires a full document scan per the security considerations document means that in practice links with multiple fragment specifiers are likely not a great user experience.
The capability to specify textEnd instead of the full text to highlight disincentivizes quoting the full text. Quoting full text is a practice that is more antifragile, and thus desirable, both for capturing the intent of the link creator, and for enabling better UI for inspecting such URLs by URL inspection libraries. While the proposal acknowledges this and recommends against using textEnd, it’s not clear to us that there is sufficient motivation to include the feature at all, given the problems it can cause. If implementation experience shows that long fragment specifiers are a problem in practice, we feel the feature could be added at that point, as long as we carefully define that a comma terminates the search text.
The capability to specify a prefix and suffix likewise introduces fragility, as the proposal acknowledges. Again, there seems to be a lack of clear non-hypothetical motivating use cases. While https://docs.google.com/document/d/1YHcl1-vE_ZnZ0kL2almeikAj2gkwCq8_5xwIae7PVik/edit says “Experimentation has determined that it’s often the case that a desired section of text is ambiguous and impossible to target with just a simple match string”, there is no data presented to support that, and publishing + implementation experience with Fragmentions suggests that this is not “often” a problem, and as far as we can tell has not been a problem in actual deployed practice at all. We’re open to data showing otherwise, but have not seen any so far.
The security discussion, and the spec, prevent text fragment selection on same-document navigations. This seems like it would create significant user confusion. In particular, if a user pastes a URL with a text fragment selector for the current document in the URL bar, they would expect that to work, while the spec’s security considerations section non-normatively says that it shouldn’t and the processing model normatively prevents it. This specific restriction does not seem to be well-motivated by specific security concerns that we could find, and we would like to understand the motivation here better.
The specific rules around incumbentNavigationOrigin https://wicg.github.io/ScrollToTextFragment/#should-allow-text-fragment seem to ignore the common case when that origin is null (representing a navigation coming from the browser UI), and seem to lead to undesirable outcomes in that situation. Again, we would like to understand what specific security concerns are being mitigated here and why this is the right mitigation. This still seems to be largely concerned with the search engine use case, and addresses it at the cost of user-hostile behavior.
The actual processing model for searching for text is pretty much undefined. There’s a lot of spec text there, but it’s mutually contradictory and not implementable in its current state. See https://github.com/WICG/ScrollToTextFragment/issues/73 for a basic issue that needs to be addressed before someone can even try to make sense of the text.

bokand commented 4 years ago

Thank you for the detailed feedback, there's definitely some helpful points in here and we're working on addressing these. We'd be happy to have Mozilla's (or others') help, contributions, and expertise where they can provide it.

Our general feeling is that some of these use cases are very important to address, but that the specific proposed solution of ScrollToTextFragment is highly biased towards the “search engines” use case

I'd like to clarify that, while the machine-generated search-engine use case was one we worked on, from the very beginning a primary use case has been the user sharing case (select text > context menu > "copy link to text") and we are actively prototyping browser UI to make this happen. I agree there's room for debate about certain design decisions but I think the above mischaracterizes the rationale.

We've put together a doc that addresses some of the questions around use cases and provides representative examples; I reference it below. I'm going to incorporate this into the explainer but it's in a Google Doc for now; I realize a complaint was this should have been there from the start but the explainer is already close to a thousand lines so we were trying to keep it focused and accessible - I'll give it a refresh to make it more relevant to the current state and reflect your feedback.

I've replied below to each of the specific points - going forward, is this issue the best place to carry on these discussions or would you prefer we file bugs (where we don't have one yet) for each point in our repo instead?

The marginalia use case described at https://indieweb.org/marginalia is not supported as well as it could be, since the fragment specifier is stripped from the URL. That significantly limits the ability of sites to show appropriate marginalia UX when the user loads a link to a specific text fragment. It could probably be addressed by exposing the fragment specifier via the FragmentDirective, modulo whatever the security considerations are with exposing this information to the page. As far as we can tell, those security considerations seem to be entirely driven by the search engine use case, and impose unnecessary constraints on the other use cases. That said, exposing the fragment specifier could be added in a V2 of the API; perhaps along with allowing URLs to opt out of having it exposed. That would need to be planned for in V1 syntax, possibly.

We're in agreement here. Giving authors more control over the directive will allow use cases like Marginalia as well as handling dynamically loaded content/infinite scrollers more reliably. We've punted on it for V1 since there's lots to carefully think through and we didn't want to lose focus. We have considered that we'd want to add things like this in the future and assumed window.location.fragmentDirective would be a good place for that so I think we've left the door open to these kinds of future improvements. Please let us know if you think the proposal as spec'ed will restrict one of these anticipated additions.

The primary reason for stripping the :~: part of the fragment was for compatibility. Privacy was a desirable side-effect but we're not opposed in principle to exposing the directive via location.fragmentDirective or some other way (as you mention, the privacy differential is small).

The capability to include multiple fragment specifiers does not seem to be clearly motivated by use cases and represents significant complexity. Apart from the complexity it introduces it’s probably “non-harmful”, though the fact that each fragment specifier effectively requires a full document scan per the security considerations document means that in practice links with multiple fragment specifiers are likely not a great user experience.

I think there's clear examples from social media that this is something users want and are already doing with less linkable tools. E.g. tweet, lots more examples in the doc.

Additionally, this is helpful in cases where text is interrupted by unwanted elements like tables/ads as well as for selecting multiple items where the DOM structure isn't purely textual (e.g. tables, lists). We've put together some motivating examples in this section of our doc.

Additionally, we see this as adding only minimal complexity. The user experience / full-document scan point we think is unlikely to be a problem in practice but we can probably specify things to avoid the full-document scan for all but the first match (since they don't cause scrolling and can be done asynchronously). I'll give this some more thought.

The capability to specify textEnd instead of the full text to highlight disincentivizes quoting the full text. Quoting full text is a practice that is more antifragile, and thus desirable, both for capturing the intent of the link creator, and for enabling better UI for inspecting such URLs by URL inspection libraries. While the proposal acknowledges this and recommends against using textEnd, it’s not clear to us that there is sufficient motivation to include the feature at all, given the problems it can cause. If implementation experience shows that long fragment specifiers are a problem in practice, we feel the feature could be added at that point, as long as we carefully define that a comma terminates the search text.

This was originally motivated by being able to select text across DOM nodes. e.g. Consider a <ol> where the user wants to select the whole list. Specifying the first item and the last item as a start/end pair is a convenient way to cross the <li> boundaries.

The above is now also possible by specifying each node (e.g. each <li>) as an individual &text= term so I think start/end isn't strictly necessary. Using multiple terms makes the URL longer and less readable but I agree there's room for debate whether that'll be a problem in practice.

A case we're concerned about is a user highlighting a long passage (e.g. 2-3 paragraphs). Pasting a link like that in a chat window, for example, would lead to poor UX (consider that each space has to be percent encoded).
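As a rough illustration of the length concern, the sketch below builds an exact-match link and a start/end range link for the same passage and prints both lengths. The helper names, the example URL, and the choice of three words for each end of the range are assumptions made up for this example; the encoding follows the thread's description that `-` and `,` are part of the directive syntax and so are percent-encoded inside terms.

```python
from urllib.parse import quote


def encode_term(text):
    """Percent-encode one term of a text directive.

    quote(..., safe="") leaves "-" alone, but "-" marks prefix/suffix in
    the directive syntax, so it is encoded by hand here.
    """
    return quote(text, safe="").replace("-", "%2D")


def exact_link(base, passage):
    """Link that quotes the whole passage."""
    return f"{base}#:~:text={encode_term(passage)}"


def range_link(base, passage, words=3):
    """Link that uses the first and last `words` words as a start/end pair."""
    tokens = passage.split()
    start = " ".join(tokens[:words])
    end = " ".join(tokens[-words:])
    return f"{base}#:~:text={encode_term(start)},{encode_term(end)}"


# A passage of a few paragraphs, simulated by repeating one sentence.
passage = "The quick brown fox jumps over the lazy dog. " * 20
print(len(exact_link("https://example.org/", passage)))
print(len(range_link("https://example.org/", passage)))
```

The range form stays short regardless of passage length, at the cost of the fragility discussed above.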

That said, I agree we should find some way to have usage prefer the non-range format. We're open to changes here - e.g. perhaps only allowing start/end format above a certain length so that the vast majority of cases are forced to use the less fragile exact format?

The capability to specify a prefix and suffix likewise introduces fragility, as the proposal acknowledges. Again, there seems to be a lack of clear non-hypothetical motivating use cases. While https://docs.google.com/document/d/1YHcl1-vE_ZnZ0kL2almeikAj2gkwCq8_5xwIae7PVik/edit says “Experimentation has determined that it’s often the case that a desired section of text is ambiguous and impossible to target with just a simple match string”, there is no data presented to support that, and publishing + implementation experience with Fragmentions suggests that this is not “often” a problem, and as far as we can tell has not been a problem in actual deployed practice at all. We’re open to data showing otherwise, but have not seen any so far.

Here are some more detailed examples from our experiments with Search.

I think the difference from fragmention is that 1) this will apply across a much broader set of content (pages have to opt-in to fragmention which biases the population of both pages and users) and 2) users will be easily able to generate these links arbitrarily and on arbitrary content. It would be confusing for users if the generated link pointed to a different part of the document than they had selected.

An additional signal here is that the WebAnnotation community reached the same conclusion, see their syntax and motivating example. In addition to the base reasoning, we believe there is additional value in having Web Annotation references and text fragment directives be easily convertible and interoperable.

Tools that generate these kinds of links can use heuristics to determine when additional context is needed. e.g. if the desired snippet is more than 2-3 words long and currently unique it's probably a good sign the context can be omitted.
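A heuristic along those lines might look like the following sketch. It works on a plain string rather than a DOM, and the function name and the three-word threshold are assumptions for illustration, not anything specified by the proposal.

```python
def needs_context(document_text, snippet, min_words=3):
    """Return True if a generated link should carry prefix/suffix context.

    Illustrative heuristic only: context is skipped when the snippet is at
    least `min_words` words long and occurs exactly once in the document.
    """
    occurrences = document_text.count(snippet)
    long_enough = len(snippet.split()) >= min_words
    return not (long_enough and occurrences == 1)


page = (
    "Contents: History. Usage. "
    "History of the project began in 2019. Usage notes follow."
)
print(needs_context(page, "History"))        # short and repeated
print(needs_context(page, "began in 2019"))  # longer and unique
```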

There's also a counter-example to the fragility argument here in that a short snippet (e.g. one word) without context might point to one part of a document but later point elsewhere if the document is modified and a new instance of the word is added above. IMHO, this would actually be more confusing than if the context changes and the link now just points to the top of the page. Additionally, if the context changed, it's likely the text you wanted to point to also changed.

The security discussion, and the spec, prevent text fragment selection on same-document navigations. This seems like it would create significant user confusion. In particular, if a user pastes a URL with a text fragment selector for the current document in the URL bar, they would expect that to work, while the spec’s security considerations section non-normatively says that it shouldn’t and the processing model normatively prevents it. This specific restriction does not seem to be well-motivated by specific security concerns that we could find, and we would like to understand the motivation here better.

The specific rules around incumbentNavigationOrigin https://wicg.github.io/ScrollToTextFragment/#should-allow-text-fragment seem to ignore the common case when that origin is null (representing a navigation coming from the browser UI), and seem to lead to undesirable outcomes in that situation. Again, we would like to understand what specific security concerns are being mitigated here and why this is the right mitigation. This still seems to be largely concerned with the search engine use case, and addresses it at the cost of user-hostile behavior.

I believe both of these are basically unintentional side-effects/bugs in our reasoning. The case of a user modifying their URL with a text directive and performing a same-doc navigation should definitely work and we'll fix that.

The original motivation for blocking same-document navigations was "defence in depth". If a page did find a way to cause the navigation and detect the scroll, doing so from a same page navigation would make it trivial to extract any text.

However, we've since gotten to a much clearer "noopener context" restriction which might be sufficient on its own. We'll look into dropping the same-doc restriction altogether.

The actual processing model for searching for text is pretty much undefined. There’s a lot of spec text there, but it’s mutually contradictory and not implementable in its current state. See WICG/ScrollToTextFragment#73 for a basic issue that needs to be addressed before someone can even try to make sense of the text.

Sorry about that, we'll work on fixing this ASAP.

lilles commented 3 years ago

In addition, Blink now has an intent to ship support for the ::target-text selector as specified in css-pseudo which supports styling the text fragment highlight.

bokand commented 3 years ago

It's been a while so I'll summarize where we currently stand. It'd be much appreciated if anyone at Mozilla would like to help work through any outstanding issues or improvements.

Outstanding Issues

3. The capability to include multiple fragment specifiers

Our current stats show ~25% of text-fragment invocations have 2 or more selectors. Motivation (for both user and generated) provided in this section of our doc.

The complexity to support this seems quite low to me.

4. The capability to specify textEnd instead of the full text to highlight

Agree there's a balance here; however, the textStart,textEnd syntax allows highlights that cross block boundaries (e.g. table cells, lists, multiple paragraphs, etc.). It also allows quoting large blocks of text without creating unwieldy URLs.

There's also a counterargument to link fragility - the longer a passage becomes, the more brittle an exact quote becomes, e.g. if the author fixes a typo anywhere in the passage. Ranges alleviate this somewhat.
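For readers unfamiliar with the syntax under discussion, here is a small Python sketch that parses a directive made of one or more `text=` terms, each shaped like `[prefix-,]textStart[,textEnd][,-suffix]`. The function names are hypothetical and the parser is a simplification: it assumes literal `-` and `,` inside terms are percent-encoded and does none of the spec's validation.

```python
from urllib.parse import unquote


def _decode(term):
    """Percent-decode a term, passing None through unchanged."""
    return None if term is None else unquote(term)


def parse_text_directive(value):
    """Parse the part after "text=" into prefix/start/end/suffix."""
    parts = value.split(",")
    prefix = suffix = end = None
    # A leading term ending in "-" is the prefix.
    if len(parts) > 1 and parts[0].endswith("-"):
        prefix = parts.pop(0)[:-1]
    # A trailing term starting with "-" is the suffix.
    if len(parts) > 1 and parts[-1].startswith("-"):
        suffix = parts.pop()[1:]
    if not 1 <= len(parts) <= 2:
        raise ValueError(f"malformed text directive: {value!r}")
    start = parts[0]
    if len(parts) == 2:
        end = parts[1]
    return {
        "prefix": _decode(prefix),
        "start": _decode(start),
        "end": _decode(end),
        "suffix": _decode(suffix),
    }


def parse_fragment_directive(directive):
    """Parse every "text=" term in a directive joined by "&"."""
    return [
        parse_text_directive(item[len("text="):])
        for item in directive.split("&")
        if item.startswith("text=")
    ]


print(parse_fragment_directive("text=first%20item,last%20item&text=intro-,History"))
```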

5. The capability to specify a prefix and suffix likewise introduces fragility

From our stats, about 5% of text fragment URLs provide a prefix/suffix. Motivating use cases here.

From personal usage, I've found this necessary quite often. E.g. quoting a heading section on Wikipedia requires context since the TOC at the top includes each section; this is quite common.

Motivation for this is also backed up by @dwhly and others in https://github.com/WICG/scroll-to-text-fragment/issues/4#issuecomment-464145121 who have ample experience with annotating web content.
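The Wikipedia heading case can be sketched with a naive plain-string search. This is only an illustration of why context helps: the function name and sample page are invented, and the real matching works on the DOM with rules this sketch does not model.

```python
def find_with_prefix(text, target, prefix=None):
    """Return the index of the first accepted `target` match, or -1.

    Naive sketch: when `prefix` is given, only matches directly preceded
    by it (ignoring whitespace in between) are accepted.
    """
    position = text.find(target)
    while position != -1:
        if prefix is None or text[:position].rstrip().endswith(prefix):
            return position
        position = text.find(target, position + 1)
    return -1


page = (
    "Contents History Usage\n\n"
    "Introduction ends here.\n\n"
    "History\nThe project began in 2019."
)
# Without context the table-of-contents entry is matched first.
print(find_with_prefix(page, "History"))
# With a prefix the section heading itself is matched.
print(find_with_prefix(page, "History", prefix="ends here."))
```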

Updates

1. The `:~:` prefix and fragment stripping

In our experience so far, we haven't noted any compatibility issues with this and it has worked well on pages which use fragment routing.

2. The marginalia use case described at https://indieweb.org/marginalia is not supported as well as it could be

Exposing the text fragment to the page sgtm, I've filed https://github.com/WICG/scroll-to-text-fragment/issues/128, though we haven't had much demand for this. I could push on this if we get engagement here.

6. The security discussion, and the spec, prevent text fragment selection on same-document navigations.

7. The specific rules around incumbentNavigationOrigin

These are now fixed in both spec and implementation:

If the navigationParam’s request has a sec-fetch-site header and its value is "none" set allowTextFragmentDirective to true and abort these sub-steps.

That is, a user can change the text fragment on a same document navigation using e.g. the address bar. This also handles other, similar user-initiated cases.
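The quoted step can be sketched as a single check on the request headers. This models only that one step; the function name is hypothetical, and a False result here just means the step did not decide the outcome, since the remaining spec steps are not modeled.

```python
def step_allows_text_fragment(request_headers):
    """Return True if the quoted step alone would allow the directive.

    A Sec-Fetch-Site value of "none" marks a user-initiated navigation,
    e.g. a URL typed or pasted into the address bar.
    """
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in request_headers.items()}
    return normalized.get("sec-fetch-site") == "none"


print(step_allows_text_fragment({"Sec-Fetch-Site": "none"}))
print(step_allows_text_fragment({"Sec-Fetch-Site": "same-origin"}))
```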

More generally, same document navigations remain restricted (e.g. same doc navigations from <a>); the reasoning is that the helpful noise of a full navigation makes any kind of side-channel less reliable. However, given pages could already implement this themselves (e.g. fragmention) it seems like a mild restriction.

8. The actual processing model for searching for text is pretty much undefined

This part of the spec has been entirely overhauled and I believe is now in much better shape. An external contributor has even produced a line-by-line Typescript implementation which serves as a nice confirmation.

littledan commented 3 years ago

What is Mozilla's current position on what Chrome has shipped for Scroll to Text Fragment? Have the harmful aspects been removed? Are Fragment Directives a good basis for building future APIs, such as Chrome's new App history API proposal?

bokand commented 3 years ago

FWIW - the fragment directive idea (:~:), while somewhat goofy looking, has worked out well in our experience. We haven't hit any compat issues with it and it's enabled quoting text on pages with fragment routing. The processing model for it should be (I believe) well-defined too. I can't speak for Mozilla's POV but perhaps @tantek could comment.

As for the harmful aspects, I disagreed above that the ability to specify multiple fragments and prefix/suffix are harmful; I haven't heard any feedback on those points.

I feel pretty strongly from our experience, and others' in the annotation community, that prefix/suffix is critical.

Multiple fragments isn't critical but helpful in some cases and doesn't add much complexity - an implementation that only highlights the first matching fragment would be meaningfully interoperable with ours, so perhaps that part of the spec could be made MAY or SHOULD, rather than MUST.

I understand and share the concern about allowing a textStart,textEnd format and think it's worth debating, but there are trade offs to be made, as I've written above. It is definitely necessary in some cases.

FYI: Chrome is soon shipping a right-click menu item "Copy Link to Text", you can try it out already via chrome://flags/#copy-link-to-text. I've been finding this very helpful for a long time now (via the official and popular extension), for example, to link to explicit statements in specifications. The main drawback has been that these links don't work outside of Chrome today. It would be awesome if we could make this work across browsers.

bholley commented 3 years ago

Broadly speaking, we think that linking to text would be a nice feature for the Web. We've also seen some potential reasons for concern around privacy leaks. We think they're probably addressable with the right mitigations, but intend to take a cautious approach and give the security community time to scrutinize the implications before we consider prototyping something here.

In the mean time, we think it would be premature to use this technology as a basis for other standards.

bokand commented 3 years ago

Thanks for the response @bholley. Are there any specific concerns you could share? I recognize this feature in particular comes with new risks, as well as benefits, but we've done a lot of work to try and mitigate the risk of privacy leaks; if there are ways we can improve the risk:benefit ratio we'd be happy to collaborate.

In the mean time, we think it would be premature to use this technology as a basis for other standards.

I believe this is in response to @littledan's question about using the fragment directive (i.e. :~:) in other contexts. The concerns around privacy as I understand them are entirely to do with the text locating and scrolling aspects of the proposal. Given that the fragment directive appears to be separate and uncontroversial (at least, I haven't seen any criticisms specific to it) perhaps that could be considered separately?

littledan commented 3 years ago

Thanks for clarifying so promptly, @bholley . FWIW it looks like the WICG history proposal doesn't currently use fragment directives.

tomayac commented 3 years ago

FWIW it looks like the WICG history proposal doesn't currently use fragment directives.

It was discussed in the context of https://github.com/WICG/app-history/issues/4 and also in https://github.com/slightlyoff/history_api/issues/28.

bholley commented 3 years ago

Thanks for the response @bholley. Are there any specific concerns you could share? I recognize this feature in particular comes with new risks, as well as benefits, but we've done a lot of work to try and mitigate the risk of privacy leaks; if there are ways we can improve the risk:benefit ratio we'd be happy to collaborate.

We're aware of and appreciate the work you and your team have done to mitigate the privacy risks. Our position isn't that there is any particular known showstopper with this feature today, but rather that the associated threat landscape is an area of active research and that we'd prefer to let it bake for a bit.

In the mean time, we think it would be premature to use this technology as a basis for other standards.

I believe this is in response to @littledan's question about using the fragment directive (i.e. :~:) in other contexts.

Correct.

The concerns around privacy as I understand them are entirely to do with the text locating and scrolling aspects of the proposal. Given that the fragment directive appears to be separate and uncontroversial (at least, I haven't seen any criticisms specific to it) perhaps that could be considered separately?

I think we view both parts of the syntax here — the fragment directive and the format of the text selector — as somewhat inelegant, but also without a clear alternative that meets all the same requirements (universal compat with non-cooperating pages, and the various proposed highlighting use-cases, respectively). If this proposal ends up working out, then it could make sense to leverage the same machinery for other use-cases. But if it doesn't, it's plausible we could find cleaner syntax for those other use-cases which might not be encumbered by the same requirements.

debanjum commented 3 years ago

the associated threat landscape is an area of active research and that we'd prefer to let it bake for a bit.

How long do we want this to bake before deciding to integrate it into Firefox? The positive use-cases for deep linking are pretty great. Having Firefox support this out of the box would greatly enhance the usability of these links, and would hopefully result in the standardization of its format!

FYI, folks can use the DeepDive extension until Firefox gets native Scroll To Text Fragment support. Disclaimer: I'm the author. The extension is pretty rudimentary currently but serves the basic use-case.

tomayac commented 3 years ago

The "official" extension is likewise available for Firefox.

debanjum commented 3 years ago

Yeah, I saw that. From what I understand, it only creates those links and doesn't actually open them in Firefox?


The "official" extension is likewise available for Firefox https://addons.mozilla.org/en-US/firefox/addon/link-to-text-fragment/.


tomayac commented 3 years ago

It creates such links and scrolls to them. There seems to be a change in Firefox 80, though, that broke the scrolling. We're tracking this as https://github.com/GoogleChromeLabs/text-fragments-polyfill/issues/60/.

debanjum commented 3 years ago

Oh neat, thanks for the clarification and the link to the official extension!


annevk commented 2 years ago

We've done some further analysis of the security properties as per @bholley's comments above and are planning to adjust our position accordingly: https://github.com/mozilla/standards-positions/pull/611.

zamicol commented 2 years ago

My concern with the scroll-to-text-fragment spec as written is that there is no ending delimiter defined for fragment directives.

This is a problem for composability with other schemes. Please see: https://github.com/WICG/scroll-to-text-fragment/issues/193#issuecomment-1219640246

Another significant concern with the spec as written is that Chrome breaks the existing behavior of window.location.href and window.location.hash. To get the guaranteed URL with no removals, a call to performance.getEntriesByType("navigation")[0].name is now required. Please see the Stack Overflow response: https://stackoverflow.com/a/73366996/1923095
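
As a rough illustration of the workaround mentioned above, the following sketch reads the navigation entry's name through a small wrapper. The names navigationUrl, PerformanceLike, and NavigationEntryLike are hypothetical and exist only so the snippet does not depend on DOM typings; in a browser the global performance object would be passed in. Whether the returned string still contains the fragment directive depends on the browser and, as noted later in the thread, this reportedly does not work for file:// URLs.

```typescript
// Minimal structural types (hypothetical names) so the sketch does not
// depend on DOM typings.
interface NavigationEntryLike {
  name: string;
}

interface PerformanceLike {
  getEntriesByType(type: string): NavigationEntryLike[];
}

// Returns the URL recorded for the current navigation, or undefined when
// no navigation entry is available.
function navigationUrl(perf: PerformanceLike): string | undefined {
  const [entry] = perf.getEntriesByType("navigation");
  return entry?.name;
}

// In a browser this would be called as: navigationUrl(performance)
```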

bokand commented 2 years ago

To get the guaranteed URL with no removals, a call to performance.getEntriesByType("navigation")[0].name is now required.

The inability to get the full URL via script is intentional - the performance API workaround is something of a bug that I'm hoping we can fix at some point.

There are definitely use cases where having the full URL would be useful, but there are also cases where a page shouldn't be able to access that information. My hope was we could balance these by providing an API; I'm curious whether you've seen or have thoughts on: https://github.com/WICG/scroll-to-text-fragment/blob/main/fragment-directive-api.md

zamicol commented 2 years ago

I wanted to write a little more about the "I must be last" rule and point out concerns:

The "I must be last" rule is undesirable, and from my understanding of the spec, unneeded. More permissive, interoperable, and composable behavior is preferred. I think a fix would be simple and straightforward.

Problems with the "I must be last" rule:

It's mutually exclusive with any other URL scheme sharing that rule. Note that this applies to any URL scheme, not just fragment schemes.
It gives arbitrary precedence to only one scheme. Why should the fragment directive scheme be given that rule?
A fix is simple. (See below for more suggestions)
An ending delimiter isn't hard to describe and is simple to implement.
There's no good reason preventing fragment directive from having an ending delimiter.
URL Component is the correct place for such a large rule, not a scheme inside of a URL component. If fragment directive really deserves such a rule, it needs to be promoted to a URL component.
If there are ever new URL components, the "I must be last" would likely need to be addressed. (See below)

The reason fragment's ending delimiter is the end of the URL is that the fragment behaves differently from the rest of the URL: it isn't sent to the server. A URL is split into two parts (sent to the server | not sent to the server), and this major behavioral break is deserving of such a rule. Fragment directives live in this existing fragment paradigm; there's no need to promote fragment directive behavior over fragment's existing behavior.

Concerning point 0 from the list above, it would be better, though still not ideal, if the "I must be last" rule were scoped to fragment instead of being scoped to URL. The easiest way to be permissively composable with existing fragment schemes and future URL components is to either 1. specify an ending delimiter, or 2. update the spec to state that "the end of the fragment directive is the end of the fragment" (not the end of the URL), and thereby inherit the existing, standard behavior of fragment. As said previously, URL component is the correct place for that rule, not a scheme inside of a URL component. For example, a hypothetical new URL component after fragment would require a spec change where fragment would need an ending delimiter (which would likely be the start of the new component). If fragment directive were scoped to fragment rather than to URL, as it is now, the hypothetical new component would not require fixing both fragment and fragment directive.

zamicol commented 2 years ago

@bokand, window.location.href, window.location.hash, and document.URL should work as expected.

I wouldn't advocate for it, but if there's new behavior needed for fragment directive, fragment directive needs to be promoted to the level of URL component, not a scheme inside of URL fragment. I admire the desire for backward compatibility, but fragment's behavior should not be adjusted. Removing fragment directive from fragment on these API calls redefines fragment.

I think the best option here is to leave document.URL, and others, alone. Fragment directives are useful and I'm looking forward to their adoption.

zamicol commented 2 years ago

I wanted to share a real world example of where these problems are relevant.

This link (or something like it) should work, but doesn't as the spec is written: https://cyphrme.github.io/URLFormJS/demo_simple.html#:~:text=Subscribe?first_name=Bob&last_name=Smith

Only this link works: https://cyphrme.github.io/URLFormJS/demo_simple.html#?first_name=Bob&last_name=Smith:~:text=Subscribe

We're asking fragment directives to play well with others.
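
To make the difference between the two links concrete, here is a minimal sketch of how a fragment splits when, as described above, the directive has no ending delimiter and therefore runs from the first ":~:" to the end of the URL. The function name splitFragment and the SplitFragment type are hypothetical, and this is an illustration of the behavior discussed in this thread, not the spec's algorithm verbatim.

```typescript
const DIRECTIVE_DELIMITER = ":~:";

interface SplitFragment {
  fragment: string; // what remains as the ordinary fragment
  directive: string | null; // everything after the first ":~:", if any
}

// Splits a raw fragment (with or without the leading "#") at the first
// occurrence of the fragment directive delimiter.
function splitFragment(rawFragment: string): SplitFragment {
  const raw = rawFragment.startsWith("#") ? rawFragment.slice(1) : rawFragment;
  const index = raw.indexOf(DIRECTIVE_DELIMITER);
  if (index === -1) {
    return { fragment: raw, directive: null };
  }
  return {
    fragment: raw.slice(0, index),
    directive: raw.slice(index + DIRECTIVE_DELIMITER.length),
  };
}
```

Under this assumption, the first link leaves an empty fragment and a directive that swallows the query-style parameters, while the second keeps "?first_name=Bob&last_name=Smith" as the fragment and "text=Subscribe" as the directive.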

zamicol commented 2 years ago

Today we ran into another issue.

There is no way to get the URL if the protocol is file://

performance.getEntriesByType("navigation")[0].name does not work with file://. (I'm not sure why, I'd love to know the reason).

This means:

There is no guaranteed way to get the full URL in Chrome when the protocol is file://, regardless of whether there's a fragment directive.
If you're interested in the fragment directive itself, there's no way to get it in Chrome when the protocol is file://.

Edit:

I'm assuming the only way to get the URL is via the proposed API, which is supposedly already enabled by default, but on version 104.0.5112.79 there's no such thing.

From that API document, document.fragmentDirective.items doesn't appear to be a thing.

Is there a way to get this to work?
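
One thing that can be checked from script is whether the browser exposes document.fragmentDirective at all. The sketch below wraps that feature detection in a helper; the name supportsFragmentDirective is hypothetical, and the helper takes the document as a parameter so it can be exercised outside a browser. It only reports that the property exists. Whether members such as items are present depends on the browser version, since that part of the API document is still a proposal.

```typescript
// Reports whether the given document-like object exposes a
// `fragmentDirective` property (feature detection only).
function supportsFragmentDirective(doc: object): boolean {
  return "fragmentDirective" in doc;
}

// In a browser this would be called as: supportsFragmentDirective(document)
```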

As a different matter, GitHub appears to have some sort of conflict with fragment directives.

Going to this URL: https://github.com/WICG/scroll-to-text-fragment/blob/main/fragment-directive-api.md#:~:text=the%20api%20described%20below%20

Strips out the fragment. I'm assuming they're doing some sort of sanitization.
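
For reference, a link like the one above can be assembled with ordinary percent-encoding. This is a minimal sketch with hypothetical helper names (encodeTextDirectiveValue, buildTextFragmentUrl); it covers only the simple text=... form, and it assumes that "-" must be encoded explicitly because encodeURIComponent leaves it untouched while the text directive syntax gives it meaning.

```typescript
// Percent-encodes a text snippet for use as a text directive value.
// encodeURIComponent already handles spaces, "&" and ","; "-" is encoded
// by hand because it has syntactic meaning inside a text directive.
function encodeTextDirectiveValue(text: string): string {
  return encodeURIComponent(text).replace(/-/g, "%2D");
}

// Builds a text fragment link for a page, replacing any existing fragment.
function buildTextFragmentUrl(pageUrl: string, text: string): string {
  const hashIndex = pageUrl.indexOf("#");
  const base = hashIndex === -1 ? pageUrl : pageUrl.slice(0, hashIndex);
  return `${base}#:~:text=${encodeTextDirectiveValue(text)}`;
}
```

Calling buildTextFragmentUrl with the fragment-directive-api.md page URL and the text "the api described below " (with its trailing space) reproduces the link quoted above.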

bokand commented 2 years ago

@zamicol, I've responded on https://github.com/WICG/scroll-to-text-fragment since I think that's a more appropriate venue for these issues.

zellyn commented 1 year ago

fwiw, this now works in both Safari and Chrome (and, naturally, Edge). caniuse.com link

RokeJulianLockhart commented 1 year ago

https://github.com/mozilla/standards-positions/issues/194#issue-474120850

https://github.com/WICG/scroll-to-text-fragment/issues/196 relates to this.

willfs84 commented 1 year ago

I really hope Firefox gets this because it's a genuinely useful feature. I'm sure there are ways of solving any concerns. It really is a great feature that saves time.

RokeJulianLockhart commented 1 year ago

https://github.com/mozilla/standards-positions/issues/194#issuecomment-1613939700

@willfs84, https://addons.mozilla.org/en-GB/firefox/addon/link-to-text-fragment/?utm_content=addons-manager-reviews-link&utm_medium=firefox-browser&utm_source=firefox-browser exists as a workaround.

Qhilm commented 1 year ago

@rokejulianlockhart, this add-on is broken, as mentioned in the comments on the page you linked.

I opened an issue for this, but I do not have high hopes; I don't think it's maintained anymore.

Unfortunately I have zero experience in writing add-ons.

ewen-lbh commented 5 months ago

I found another extension, https://addons.mozilla.org/en-US/firefox/addon/text-fragment/, that works really well (supports copying links from selected text and jumping to text from a URL with a text fragment).

It even works on Firefox for Android!

RokeJulianLockhart commented 5 months ago

https://github.com/mozilla/standards-positions/issues/194#issuecomment-1718868558

@Qhilm, it works for me, on multiple devices and OSes, on the latest stable version of Firefox. I would not have recommended it otherwise.

zcorpan commented 5 months ago

This is not the right place to discuss Firefox extensions. Thanks.

RokeJulianLockhart commented 5 months ago

https://github.com/mozilla/standards-positions/issues/194#event-2905769978

@dbaron, https://bugzilla.mozilla.org/show_bug.cgi?id=1867939#c17 would appear to indicate that this decision has been reversed.

zcorpan commented 5 months ago

Mozilla's position was updated in #611 (https://github.com/mozilla/standards-positions/issues/194#issuecomment-1031247833).
