WICG / scroll-to-text-fragment

Proposal to allow specifying a text snippet in a URL fragment

Expose text-fragment to page #128

Closed bokand closed 10 months ago

bokand commented 4 years ago

For compatibility reasons, we strip the fragment directive from the URL before page script has a chance to run. Even if it were available to the page, it would be difficult for the page to know in all cases which piece of text the fragment matches, if any, since that would effectively require re-implementing the browser's matching logic and running it on the page.

As mentioned in Mozilla's position on this feature, there are interesting use cases for knowing the text fragment, such as marginalia. We should expose this information so authors can make use of it in such cases.

We have a window.location.fragmentDirective object used for feature detection (may move to document soon, see https://crbug.com/1057795). This would probably be a good place to add this information. I don't have any concrete API shape in mind yet; as a starting point, how about:

location.fragmentDirective ::= [ Directive* ]

Directive ::= {
  name: 'text',
  data: 'foo-,quote,-bar',
  range: Range
}

Where fragmentDirective is an array of Directive objects (text or otherwise). Each Directive would contain the type (assuming others may be added in the future), raw text data, and type-specific data. In our case, a Range object pointing to the located text, or null if it wasn't found.
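To make the proposed shape concrete, here is a minimal sketch of how page script might consume it. It assumes the array-of-Directive shape described above, which is only a starting-point proposal and not a shipped API; `describeDirectives` is a hypothetical helper name.

```javascript
// Hypothetical consumer of the PROPOSED shape: an array of
// { name, data, range } objects, where `range` is a DOM Range
// or null when the text was not found. No browser ships this.
const describeDirectives = (directives) =>
  directives
    // Other directive types may be added later; keep only text ones.
    .filter(({ name }) => name === 'text')
    .map(({ data, range }) => ({
      data,
      found: range !== null,
      // Range.prototype.toString() returns the text the range covers.
      matchedText: range === null ? null : range.toString(),
    }));
```

Under the proposal, a page would call something like `describeDirectives(location.fragmentDirective)` and could use the entries with `found: true` to anchor marginalia.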

Curious if anyone else has thoughts on how this should look. To start, we should remove the parts of the spec and explainer that forbid UAs from exposing this information.

eligrey commented 3 years ago

Something like location.fragmentDirectives structured as URLSearchParams, perhaps with a corresponding URLFragmentDirectives constructor that can also take full hashes as input, would be ideal imo.

Currently if you want to interoperate with fragment directives, you'll need all of these helpers for cross-browser support:

// Get full 'navigation' URL including fragment directives
const getNavigationURL = (realm = globalThis) =>
  realm.performance
    .getEntries()
    .find(({ entryType }) => 
      entryType === 'navigation'
    )?.name
  || realm.location.href;

// Get URL fragment directives as URLSearchParams
const getFragmentDirectives = (url = getNavigationURL()) => {
  const fragment = url.indexOf('#');
  const directives = fragment !== -1 ? url.slice(fragment).indexOf(':~:') : -1;
  return new URLSearchParams(
    directives !== -1 ? url.slice(fragment + directives + 3) : '',
  );
};

// Check if browser hides fragment directives from navigated URLs
const doesBrowserHideFragmentDirectives = () => {
  const frame =
    document.createElementNS('http://www.w3.org/1999/xhtml', 'iframe');
  // prevent reflow from insertion
  frame.style.display = 'none';
  frame.width = frame.height = '1';
  // test URL that supports most Content-Security-Policy configs
  frame.src = 'about:blank#:~:';
  document.documentElement.append(frame);
  const isHidden = !frame.contentWindow.location.hash;
  frame.remove();
  return isHidden;
};

// Clear fragment directives from current page location
const clearLocalFragmentDirectives = () => {
  // location.hash = '' does not clear directives when they are hidden
  if (doesBrowserHideFragmentDirectives()) {
    history.replaceState(null, null, '');
  } else {
    location.hash = '';
  }
};

eligrey commented 3 years ago

Also nit: the current location.fragmentDirective stub is a misnomer. These should be directives (plural) imo even though scroll-to-text is currently the only directive natively supported by browsers.

tomayac commented 3 years ago

> Also nit: the current location.fragmentDirective stub is a misnomer. These should be directives (plural) imo even though scroll-to-text is currently the only directive natively supported by browsers.

+1. And multiple text directives in a single URL are supported today, e.g. http://example.com/#:~:text=for&text=asking-,for
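As an illustration of why a URLSearchParams-like structure fits the multi-directive case, here is a small sketch that reads every `text` value from the directive part of the example URL. `parseTextDirectives` is a hypothetical name, and URLSearchParams applies form-encoding rules (for instance, `+` becomes a space), which may not match how browsers parse directives.

```javascript
// Sketch: treat the part after ':~:' as a query-string-like list.
// Assumption: form-style parsing is close enough for illustration;
// it is not claimed to match the spec's directive parsing exactly.
const parseTextDirectives = (directive) =>
  new URLSearchParams(directive).getAll('text');

// The directive portion of the example URL above.
const exampleTextValues = parseTextDirectives('text=for&text=asking-,for');
console.log(exampleTextValues); // two entries: 'for' and 'asking-,for'
```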

bokand commented 3 years ago

Thanks, that's all very useful feedback.

> Also nit: the current location.fragmentDirective stub is a misnomer

The nomenclature being used (at least by me) is that the fragment directive is the entire part of the fragment after :~: regardless of how many individual directives it may contain.

> const doesBrowserHideFragmentDirectives = () => {

FYI: you can feature detect this based on the presence of document.fragmentDirective.
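A minimal sketch of that check, written so it can also run outside a browser. `supportsFragmentDirective` is a hypothetical helper name; the only API it relies on is the presence of the `fragmentDirective` property on the document.

```javascript
// Feature-detect by property presence rather than by probing behavior.
// `doc` defaults to the global document, which is undefined outside
// a browser, so the check returns false there instead of throwing.
const supportsFragmentDirective = (doc = globalThis.document) =>
  doc != null && 'fragmentDirective' in doc;
```

This is cheaper than inserting a test iframe, at the cost of depending on where the stub object lives.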

> const fragment = url.indexOf('#:~:');

FYI: The delimiter (i.e. :~:) need not immediately follow the #. E.g. example.com#elementIdOrRoutingState:~:text=FooBar
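A short sketch of parsing that handles this case by searching for the delimiter anywhere inside the fragment. `splitFragment` is a hypothetical helper, and it works on plain strings so it does not depend on what the browser exposes.

```javascript
// Split a URL string into the ordinary fragment and the fragment
// directive. The ':~:' delimiter may appear anywhere after '#'.
const splitFragment = (url) => {
  const hashIndex = url.indexOf('#');
  if (hashIndex === -1) return { fragment: '', directive: null };
  const hash = url.slice(hashIndex + 1);
  const delimiter = hash.indexOf(':~:');
  return delimiter === -1
    ? { fragment: hash, directive: null }
    : {
        fragment: hash.slice(0, delimiter),
        directive: hash.slice(delimiter + 3), // skip the 3-char ':~:'
      };
};
```

For the example above, `splitFragment('example.com#elementIdOrRoutingState:~:text=FooBar')` gives the fragment `elementIdOrRoutingState` and the directive `text=FooBar`.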

flackr commented 3 years ago

Do developers need some way to be notified when the fragment changes? e.g. similar to the hashchange event.

bokand commented 3 years ago

It's possible we could have something specific to the fragment directive changing, but it would be just a convenience. Today, the hashchange event is fired when the fragment directive changes (even though script can't see a change to location.hash) so one could already do some work to make it specific to the directive.
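One way to do that work, as a rough sketch: if a hashchange fires but the script-visible part of the fragment is the same before and after, the change was presumably in the directive. The helper names are hypothetical, and the sketch does not assume whether the event's `oldURL`/`newURL` have the directive stripped; it ignores anything from `:~:` onward either way.

```javascript
// Return the part of the fragment that page script can see,
// i.e. everything before ':~:' (and without the leading '#').
const visibleFragment = (url) => {
  const fragment = new URL(url).hash.slice(1);
  const delimiter = fragment.indexOf(':~:');
  return delimiter === -1 ? fragment : fragment.slice(0, delimiter);
};

// True when the visible fragment did not change across the event,
// which suggests only the fragment directive changed.
const isDirectiveOnlyChange = ({ oldURL, newURL }) =>
  visibleFragment(oldURL) === visibleFragment(newURL);

// Browser-only wiring; skipped where there is no window object.
if (typeof window !== 'undefined' && typeof window.addEventListener === 'function') {
  window.addEventListener('hashchange', (event) => {
    if (isDirectiveOnlyChange(event)) {
      console.log('fragment directive probably changed');
    }
  });
}
```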

See the "processing the fragment directive" section of the spec for why this is.

eligrey commented 3 years ago

@bokand Thanks for pointing that out! I've updated my getFragmentDirectives code snippet to account for this.

I've left the doesBrowserHideFragmentDirectives check as-is, since testing behavior is much more robust than ossifying around an API that I expect to change.

tilgovi commented 2 years ago

There is something nice about the highlight mechanism being unobservable, even if scrolling is not. I worry that making the directive observable undermines an interesting privacy benefit of stripping the fragment directive from the URL, which is that the fragment directive stays known only to the user and the user agent, but not the page.

I don't know how beneficial it is to hide the fragment directive from the page. For example, I don't think users would expect that the page doesn't know the fragment directive, if they think about it at all. Nevertheless, I wanted to record the implication.

zcorpan commented 1 year ago

As mentioned in https://github.com/WICG/scroll-to-text-fragment/issues/223#issuecomment-1489075160 and https://github.com/mozilla/standards-positions/issues/194#issuecomment-566671352, exposing the fragment directive to the page would roughly expose search terms, which is not great for privacy.

The common ancestor will also match :target (exposed to JS via document.querySelector()), but maybe the fidelity of that is similar to that of checking the scrolling position?

eligrey commented 1 year ago

I proposed a fragment directives access spec that attempts to address the privacy concerns with sensitive directives: https://github.com/eligrey/fragment-directives

bokand commented 10 months ago

I'm going to close this since it didn't make it into the initial incubation and I'm now trying to close out this repo in favor of moving the spec into HTML. I think further enhancements and proposals would be more productively debated in the HTML standard.