Ethical, paradigmal, and UX concerns?

will-ca commented 4 years ago

Has there been any discussion or consensus on any of the following issues?

Is it a good idea to allow sources that link to other pages, such as search engines and blogs, to transparently impose visual modifications onto the appearance of third-party webpages by default?

- This breaks one of the fundamental implicit assumptions/precedents of the web, which is that the visual content presented on a webpage hosted at a given domain is ultimately determined wholly by the administrators of that domain. (Yes, third-party content has been around for a long time, but in all prior cases it's been things like ads and embeds that the web administrators have to explicitly set up, maintain, and allow. This, by contrast, seems to have been implemented as a blanket opt-out that automatically applies to all websites.)
- I could easily see a not-insignificant number of users being confused or taking offence at a webpage because of an unfortunate automatically-generated highlight, or a link shared by a contact, to give a trivial example.
- This also changes the role of search engines and link aggregators. By imposing their emphasis onto third-party webpages and making the browser autoscroll past all prior content, linking websites arguably gain the ability to editorialize, and to do so transparently, without either informing the user or gaining the consent of the linked-to webpages.

Is it a good idea to transparently and inconsistently automate on-page navigation without the involvement of either the user or the webmaster?

- I think there is a potential problem here in that the new behaviour both replaces and deviates from how both users and webmasters are used to websites working. For most of the history of the web, when clicking on a link, you could generally expect to be taken to that page and nothing more. The few hundred milliseconds of time between clicking a link and the page loading can give the user time to prepare to either (1) navigate the page if they're searching for a specific section or (2) start reading from the top.
- If this change becomes prevalent, then when clicking a link, there will be some chance that the browser will automatically scroll to a section in the middle of the page (if I'm interpreting the documentation correctly). There will be no reasonable way to tell before clicking whether that will happen either, so the behaviour will necessarily often be unexpected by the user. New features aren't necessarily bad, but breaking with established conventions and introducing unpredictable behaviour may be a good way to foster user frustration.
- (Arguably, id attributes already behave this way. But even in that case, web administrators have full control over the number and position of named anchors on their pages— meaning that they are free to ensure they all point to places that make sense on their own. Additionally, a lot of links to named anchors are intra-site or even intra-page, which further helps ensure consistent, useful, and predictable behaviour. Wikipedia's named sections are a good example of this.)

Is it a good idea to introduce another convention which breaks the paradigm that a single URL consistently points to a single corresponding resource presented in a single format— thereby further breaking bookmarks and complicating link shares?

- What happens when a user bookmarks a webpage that's already in their bookmarks, just with a different URL fragment? Will this feature lead to duplicate bookmarks?
- What happens when a user saves or shares a URL with a text fragment to a page that is subsequently edited so that it no longer contains that fragment? As I understand the documentation, this wouldn't have to require any major edits to the page, only a single character change (fixed typo or re-word) to the text string that is used to define the fragment. Unlike named anchors which can be moved by the webmaster on content changes, I don't think there's really any way to make these fragments forwards-compatible with future page content changes because the fragment definition relies on the content itself. Additionally, this behaviour seems like it'd be the most useful for the kinds of long reference pages that may frequently experience the types of minor changes which would break it (like on Wikipedia). Without cooperating with webmasters or control from the user, it seems like it'd be very hard to avoid this functionality being incredibly fragile.
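To make the fragility concern concrete, here is a minimal sketch in Python. It assumes a deliberately simplified matcher (a case-insensitive substring search), which is not the specification's actual matching algorithm; `fragment_matches` and the sample strings are hypothetical.

```python
def fragment_matches(page_text: str, fragment_text: str) -> bool:
    """Simplified stand-in for text-fragment matching (an assumption, not
    the specification's algorithm): case-insensitive substring search."""
    return fragment_text.casefold() in page_text.casefold()


# Hypothetical page content before and after a one-character typo fix.
before = "Fragments are resolved by the recieving browser."
after = "Fragments are resolved by the receiving browser."

# Text captured in a shared link while the typo was still on the page.
shared_fragment = "the recieving browser"

print(fragment_matches(before, shared_fragment))  # True
print(fragment_matches(after, shared_fragment))   # False
```

Under this simplification, the shared link stops matching as soon as the typo is fixed, even though the passage it pointed at is still there.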

bjonesneu commented 4 years ago

The most important aspect of this feature seems to be the ability to bring focus to a certain piece of content while not losing sight of the rest of the surrounding content.

I notice that many content sites allow their readers to highlight text and share that snippet and a link to the article. In the past couple of years there have been many examples where the recipients of those shares read only the snippet and not the rest of the article, understanding the snippet completely out of context.
This is no different from copy/pasting any text on the web or from printed material and sharing it in any form. The major difference now is how easily a user can share, how quickly a recipient can skim what they receive, and the speed and extent to which the information can spread.
It would seem that controlling the display of information from the hosted domain is already a huge challenge because of this. Are many pages never even loaded because people are only looking at the shared snippets?
I would be looking at this feature to help by providing a more standard approach which would hopefully facilitate better reading practices. If users are taken directly to a specific passage on a page, it seems they would be more likely to read the surrounding context than if they never opened the page and just read the snippet.

Visual highlighting seems to open a number of questions of readability, for example if a page is modified and the selector now highlights the wrong section or the colors make the content unreadable.

tilgovi commented 4 years ago

This feedback is all really valuable. I want to chime in to remind everyone that we should be careful to distinguish between specification feedback and Google product feedback. The specification considers highlighting, scrolling, etc to be non-normative. The focus is on specifying the fragment directive syntax and the matching algorithm, i.e. the parts that are required for interoperability.
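For readers unfamiliar with the syntax being referred to, a text directive has the general shape `#:~:text=[prefix-,]start[,end][,-suffix]`. The following is a rough Python sketch of pulling those parts out of a URL. `TextDirective` and `parse_text_directives` are hypothetical names, and the sketch ignores most of the edge cases a real parser would need to handle.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import unquote, urlsplit

DELIMITER = ":~:"  # separates the ordinary fragment from the fragment directive


@dataclass
class TextDirective:
    start: str
    end: Optional[str] = None
    prefix: Optional[str] = None
    suffix: Optional[str] = None


def parse_text_directives(url: str) -> List[TextDirective]:
    """Return the text directives in a URL's fragment (illustrative only)."""
    fragment = urlsplit(url).fragment
    if DELIMITER not in fragment:
        return []
    directive = fragment.split(DELIMITER, 1)[1]
    results = []
    for item in directive.split("&"):
        if not item.startswith("text="):
            continue
        parts = item[len("text="):].split(",")
        prefix = suffix = None
        # A leading part ending in "-" is the prefix;
        # a trailing part starting with "-" is the suffix.
        if len(parts) > 1 and parts[0].endswith("-"):
            prefix = unquote(parts.pop(0)[:-1])
        if len(parts) > 1 and parts[-1].startswith("-"):
            suffix = unquote(parts.pop()[1:])
        if not 1 <= len(parts) <= 2 or not parts[0]:
            continue  # malformed directive: skip it
        start = unquote(parts[0])
        end = unquote(parts[1]) if len(parts) == 2 else None
        results.append(TextDirective(start, end, prefix, suffix))
    return results


print(parse_text_directives("https://example.com/a#intro:~:text=solar%20energy"))
```

Percent-decoding happens only after splitting on commas, so an encoded comma or dash inside the quoted text is not mistaken for a separator.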

will-ca commented 4 years ago

@tilgovi In theory, sure.

But in practice, I think it's also worth considering that the definitions and "non-normative" suggestions in the specification itself will necessarily lend themselves to certain types of uses, and that the reference implementation in a market-dominant product will also inevitably guide how that specification is implemented and applied in other browsers.

bokand commented 4 years ago

I think many of the raised issues are philosophical in nature so I don't think there's an objective answer to these. I can only provide my opinions here.

> This breaks one of the fundamental implicit assumptions/precedents of the web, which is that the visual content presented on a webpage hosted at a given domain is ultimately determined wholly by the administrators of that domain.

I think this overstates the visual impact of the text fragment on the content. The highlight is transient (it's dismissed with a click) and thus isn't modifying the page content in any meaningful way. We've gotten feedback that the highlight doesn't visually work well on all pages and some users find it distracting, I agree we should provide a way for pages to style it and perhaps experiment with making it clearer that the highlight can be dismissed or making it more transient (e.g. fade out after a short time).

UAs already allow modifying content in non semantic ways when users request it or the UA believes it's in the user's interest, e.g. font boosting, find-in-page, etc.

> I could easily see a not-insignificant number of users being confused or taking offence at a webpage because of an unfortunate automatically-generated highlight, or a link shared by a contact, to give a trivial example.

Like any new feature, it can be used in good and bad ways. I don't think this is fundamentally worse than links in general. People are free to link to offensive content. We could certainly iterate on making the distinction clearer and make sure users understand what's happening. Expectations will also evolve over time as this becomes normalized.

> This also changes the role of search engines and link aggregators. By imposing their emphasis onto third-party webpages and making the browser autoscroll past all prior content, linking websites arguably gain the ability to editorialize, and to do so transparently, without either informing the user or gaining the consent of the linked-to webpages.

This ability already exists using regular fragments. To the broader point, the user has full context; they can see that the page is scrolled, they can read the rest of the page and full context if that's what they want. I agree this gives extra power to aggregators which could have subtle consequences. On balance though, I believe this will actually be better for users, information will be quicker to find and information connects on a more granular and semantic level. E.g. a citation to a long paper can be quickly attributed and verified.

On the other hand, from the user's point of view, why should a page be able to decide what part I start at and how it must be navigated? Should we disable find-in-page because the user can quickly skip over large sections of a page? The web already allows deep linking into the subresources of a site: individual pages, images, fragments, midway into a video; do you consider these problematic? When I'm reading a book, I'm free to start off at any page, I don't see why the web should be different.

> when clicking on a link, you could generally expect to be taken to that page and nothing more

My assumption is that when users click links it's because they're trying to find or do something. The user agent and surrounding ecosystem should do what it can to make the process of locating what they're interested in easier. Obviously, we should aim to make this predictable and avoid confusion; there's room for experimentation here with UI and how things are presented and I'm sure we won't get it perfect, especially at first. But I disagree that we should avoid adding fundamental capabilities because there's a possibility that expectations might be violated.

Expectations evolve. A web page today can determine my precise location - that's something that would have been unexpected at one point. It can certainly be abused; however, it's enabled many more positive developments as well.

> There will be no reasonable way to tell before clicking whether that will happen either, so the behaviour will necessarily often be unexpected by the user

I disagree. The text fragment is visible in the URL. Savvy users can tell this will happen. If this really is an issue, it's possible that UAs could provide UI to surface this information better to less savvy users. Given that the snippet is encoded directly in the URL this could actually be done in a very friendly way.

> But even in that case, web administrators have full control over the number and position of named anchors on their pages

It's true, this does shift a bit of control over initial presentation to referrers and users and away from the page, certainly there's a shifting of expectations. Personally, I don't really see the practical problem here but I think reasonable people can disagree on this point.

> What happens when a user bookmarks a webpage that's already in their bookmarks, just with a different URL fragment? Will this feature lead to duplicate bookmarks?

This is a technical issue that's easily solved by UAs (if it really is something that will be a problem for users in practice). Users are already free to make duplicate bookmarks based on differing protocol schemes, fragments, arguments, etc. In some cases this may be desirable, e.g. maybe I want to bookmark multiple passages of a long paper for future reference? This actually seems like a positive development of text fragments to me.
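As a rough illustration of how a UA could treat such bookmarks as duplicates if it chose to, here is a small Python sketch. `bookmark_key` is a hypothetical helper, and the policy itself (ignoring the fragment directive while keeping any ordinary fragment) is an assumption, not something the specification requires.

```python
from urllib.parse import urlsplit, urlunsplit


def bookmark_key(url: str) -> str:
    """Drop the fragment directive (everything from ':~:' onward) so that
    URLs differing only in their text fragment compare equal."""
    parts = urlsplit(url)
    plain_fragment = parts.fragment.split(":~:", 1)[0]
    return urlunsplit(parts._replace(fragment=plain_fragment))


a = "https://example.com/paper#:~:text=first%20passage"
b = "https://example.com/paper#:~:text=second%20passage"

print(bookmark_key(a))                      # https://example.com/paper
print(bookmark_key(a) == bookmark_key(b))   # True
```

A UA that wants to support bookmarking several passages of one document would simply compare the full URLs instead.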

> Without cooperating with webmasters or control from the user, it seems like it'd be very hard to avoid this functionality being incredibly fragile.

This is the link rot problem that's been discussed at length. Text fragments don't make this any worse. If the text on the page changes and the fragment no longer matches, the page simply loads at the top as it would have today.

On the contrary, having the contextual information about what content in the page the link was referencing actually improves the situation here. For one, the text that was being referenced appears directly in the URL so the content wasn't lost! Secondly, the user/UA now knows that relevant parts of the page have changed and the information may now be out of date. Currently we don't do anything here but I can imagine UI to surface to the user that the referenced text wasn't found. This is helpful and something you can't do today.

Let me give a concrete example: suppose I make an assertion (somewhere on the web) that "California is the world's leading producer of solar energy" and I provide a non-text-fragment citation [1]. Skeptical, you visit the page. There exists content there about energy but nothing to back up my claim. Did I misunderstand the cited content? Did the cited content change? Am I making it up? It's difficult to tell. But if you have a text fragment that tells you exactly which piece of text I'm referencing you can quickly determine this.
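The fallback and the imagined "text not found" UI described above can be sketched roughly as follows in Python. `Navigation` and `navigate` are hypothetical names, the notice wording is made up, and the exact substring search is a simplification of the real matching algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Navigation:
    scroll_offset: int      # character offset to scroll to; 0 means top of page
    notice: Optional[str]   # message a UA might show; None when the text was found


def navigate(page_text: str, quoted_text: str) -> Navigation:
    """Scroll to the quoted text if present; otherwise load at the top and
    report that the referenced text is missing (hypothetical UA behaviour)."""
    offset = page_text.find(quoted_text)
    if offset >= 0:
        return Navigation(offset, None)
    return Navigation(0, f'The linked text was not found on this page: "{quoted_text}"')


page = "Energy overview. California is a major producer of solar energy."
claim = "California is the world's leading producer of solar energy"

result = navigate(page, claim)
print(result.scroll_offset)  # 0
print(result.notice)
```

Because the quoted text travels in the URL, the reader can at least see what the citation claimed the page said, even when the page no longer says it.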

bokand commented 4 years ago

> Visual highlighting seems to open a number of questions of readability, for example if a page is modified and the selector now highlights the wrong section or the colors make the content unreadable.

I think there's room for improvement and iteration in this regard. Certainly allowing better integration with page colour is at the top of the list.

tezlm commented 4 years ago

I feel like the scroll/highlight text feature should be off by default. If websites want to let users share a text snippet, then they should include the scroll-to-text-fragment part of the URL themselves. At the very least, include a setting to disable this.

will-ca commented 4 years ago

> I think this overstates the visual impact of the text fragment on the content. The highlight is transient (it's dismissed with a click) and thus isn't modifying the page content in any meaningful way. We've gotten feedback that the highlight doesn't visually work well on all pages and some users find it distracting, I agree we should provide a way for pages to style it and perhaps experiment with making it clearer that the highlight can be dismissed or making it more transient (e.g. fade out after a short time).

Personally, I find it appreciably harder to focus on the rest of the information on the page when there's a bright yellow highlight, and my initial impression of the page has already formed to some degree by the time I dismiss the highlight. But I do think making it styleable, better-communicated, and maybe temporary would go some way towards alleviating that.

> UAs already allow modifying content in non semantic ways when users request it or the UA believes it's in the user's interest, e.g. font boosting, find-in-page, etc.

Yes, but I think the important distinction there is that it's non-semantic.

I'd argue that performing navigation and (especially) highlighting without it being explicitly requested by the user crosses over into semantic changes. The highlight looks pretty close to something that a news website could have as deliberate emphasis in their article, for example.

> Like any new feature, it can be used in good and bad ways. I don't think this is fundamentally worse than links in general. People are free to link to offensive content. We could certainly iterate on making the distinction clearer and make sure users understand what's happening. Expectations will also evolve over time as this becomes normalized.

The specific problem that I'm worried about isn't people linking to offensive content; it's linking to unoffensive content but using the decontextualization provided by the snippet to incite offence— all while blaming the original page author for it, due to the established expectation that links can't change page contents.

While any feature can in theory be used in good and bad ways, I am worried that creating a way for people to impose visual highlights onto specific snippets in third-party pages will lend itself to being used in too many bad ways to be worth the good ones.

> This ability already exists using regular fragments. To the broader point, the user has full context; they can see that the page is scrolled, they can read the rest of the page and full context if that's what they want. I agree this gives extra power to aggregators which could have subtle consequences. On balance though, I believe this will actually be better for users, information will be quicker to find and information connects on a more granular and semantic level. E.g. a citation to a long paper can be quickly attributed and verified.

I think citations in papers are probably one case where this feature could actually be really nice, but I'm concerned about how most other long forms of writing— news articles, thinkpieces, stories, etc— would be impacted by letting aggregators choose where the reader starts.

> On the other hand, from the user's point of view, why should a page be able to decide what part I start at and how it must be navigated? Should we disable find-in-page because the user can quickly skip over large sections of a page? The web already allows deep linking into the subresources of a site: individual pages, images, fragments, midway into a video; do you consider these problematic? When I'm reading a book, I'm free to start off at any page, I don't see why the web should be different.

I think the book is a good example, because the web arguably already gives you the freedom to start at any point in the same way. In fact, between the scroll bar, links, and the "Find" feature in all modern browsers, I'd argue that the web makes jumping to a random point much easier than a book does.

But the automatic navigation occurs when the link is followed, not before it's created, so I'd argue that this feature is more like someone handing you a book and forcing you to start from a specific page, regardless of whether you actually want to. And because it's the referrer, rather than the reader, that chooses where the book starts, the reader loses a lot of initial context and the author loses control of how their work is presented. (E.g., with a book, you'd still probably look at the table of contents or flip through quickly to get a loose idea of what it broadly says before jumping to one page in particular, and you'd therefore have context other than what the referrer wants you to have.)

> Expectations evolve. A web page today can determine my precise location - that's something that would have been unexpected at one point. It can certainly be abused; however, it's enabled many more positive developments as well.

That requires very explicit user consent, though (beyond approximate Geo-IP stuff, AFAIK).

I know a highlight isn't exactly as immediately dangerous as your physical location, but given how it can affect the information presented on the page, I believe it still shouldn't be a blanket opt-out.

> I disagree. The text fragment is visible in the URL. Savvy users can tell this will happen. If this really is an issue, it's possible that UAs could provide UI to surface this information better to less savvy users. Given that the snippet is encoded directly in the URL this could actually be done in a very friendly way.

I think saying "savvy users can tell this will happen" implicitly presumes and accepts that "non-savvy" users, i.e. probably most users, won't be able to tell it will happen (at least in the current/presumptive implementation).

> On the contrary, having the contextual information about what content in the page the link was referencing actually improves the situation here. For one, the text that was being referenced appears directly in the URL so the content wasn't lost! Secondly, the user/UA now knows that relevant parts of the page have changed and the information may now be out of date. Currently we don't do anything here but I can imagine UI to surface to the user that the referenced text wasn't found. This is helpful and something you can't do today.

> Let me give a concrete example: suppose I make an assertion (somewhere on the web) that "California is the world's leading producer of solar energy" and I provide a non-text-fragment citation [1]. Skeptical, you visit the page. There exists content there about energy but nothing to back up my claim. Did I misunderstand the cited content? Did the cited content change? Am I making it up? It's difficult to tell. But if you have a text fragment that tells you exactly which piece of text I'm referencing you can quickly determine this.

If the page content changes, I don't think it would be easy to tell from a text fragment alone whether the citation was ever valid, given that it would also be easy to lie by doctoring the text fragment.

It'd be much easier to doctor the fragment and gain the semblance of legitimacy it'd confer, in fact, than to check whether it was ever valid.

I'm not saying that a lot of people necessarily would do that, but I think the fact that it's possible makes it a potential net loss for verifiability.

I obviously think these concerns are potentially applicable in a wide range of situations, but as an example, I'm terrified of what the U.S. political parties (and foreign adversaries) could do by linking to specific, decontextualized snippets from their opponents to sow discord.

jidanni commented 4 years ago

> I think this overstates the visual impact of the text fragment on the content. The highlight is transient (it's dismissed with a click) and thus isn't modifying the page content in any meaningful way. We've gotten feedback that the highlight doesn't visually work well on all pages and some users find it distracting, I agree we should provide a way for pages to style it and perhaps experiment with making it clearer that the highlight can be dismissed or making it more transient (e.g. fade out after a short time).

Well in #145 I can already see the case coming: "Mom, didn't you see the yellow?" Even if they have the latest browser version.

I'm saying that whatever you do, make sure all users will see it. It shouldn't be gone for half of them: those for whom it has already faded by the time they switch to that window or finish talking on the phone, those who have nudged the page one millimetre, etc.

bokand commented 11 months ago

Closing out old issues - I think there isn't anything actionable here but feel free to file a new issue.

WICG / scroll-to-text-fragment

Ethical, paradigmal, and UX concerns? #109
