WICG / scroll-to-text-fragment

Proposal to allow specifying a text snippet in a URL fragment

Add an ending fragment directive delimiter #193

Open zamicol opened 1 year ago

zamicol commented 1 year ago

The rule stripping everything after and including the :~: delimiter from the fragment will cause issues with other existing and proposed schemes.

If another fragment scheme shared the same "I must be last" rule, it would be mutually exclusive with fragment directives. This incompatibility is undesirable, and the lack of composability breaks the spirit of the URL standard. To be permissive and play well with other URL fragment schemes, a minor change is needed.

To fix this, I suggest the addition of an ending delimiter, :~:. This allows the fragment directive to be located anywhere in the fragment, increases compatibility with existing and future schemes, and avoids accidentally removing other schemes from the URL when they are preceded by a fragment directive.

To adapt an example from the README, these links should work:

https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/OOZIrtSPLeM:~:text=test
https://groups.google.com/a/chromium.org/forum/#:~:text=test:~:!topic/blink-dev/OOZIrtSPLeM

Other content in the fragment should be permitted, and it should be easy to hand write.
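
To make the proposal concrete, here is a minimal Go sketch of how a parser could treat :~: as both the starting and the ending delimiter. The helper name extractDelimitedDirective is hypothetical, and falling back to "directive runs to the end of the fragment" when no closing delimiter is present is an assumption, not something the spec or this proposal defines.

```go
package main

import (
	"fmt"
	"strings"
)

// delim is the fragment directive delimiter discussed in this issue.
const delim = ":~:"

// extractDelimitedDirective is a hypothetical helper. It treats the first
// ":~:" as the start of the directive and the next ":~:" (if any) as its end.
// Everything outside the directive is returned as rest.
func extractDelimitedDirective(fragment string) (rest, directive string) {
	before, after, found := strings.Cut(fragment, delim)
	if !found {
		// No directive at all: the whole fragment is preserved.
		return fragment, ""
	}
	inner, tail, closed := strings.Cut(after, delim)
	if !closed {
		// Assumption: without a closing delimiter the directive runs to
		// the end of the fragment, as it does today.
		return before, after
	}
	// The directive sat in the middle (or at the start); keep both sides.
	return before + tail, inner
}

func main() {
	for _, f := range []string{
		"!topic/blink-dev/OOZIrtSPLeM:~:text=test",
		":~:text=test:~:!topic/blink-dev/OOZIrtSPLeM",
	} {
		rest, directive := extractDelimitedDirective(f)
		fmt.Printf("rest=%q directive=%q\n", rest, directive)
	}
}
```

Under these assumptions both example fragments yield the same directive (text=test) and the same remaining fragment (!topic/blink-dev/OOZIrtSPLeM).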

zamicol commented 1 year ago

There are two more alternatives I can think of. I'll append the alternatives to this list of three:

  1. Each fragment scheme needs to define its own starting and ending delimiter. For example, the fragment scheme "fragment directive" would need to define a starting delimiter of :~: and an ending delimiter of :~:.
    • Alternatively, define a fragment scheme ending delimiter universally, such as the ? character.
  2. All fragment schemes need to be aware of other fragment scheme delimiters, and their scope of concern stops at the starting delimiter of another scheme.
  3. The ending delimiter for all fragment schemes is any uninterpreted/invalid unescaped character in the fragment. (One of the 20 characters not used by the scheme itself.)

Personally, I think option 2 would work better than defining ending delimiters. This in theory should work:

https://cyphrme.github.io/URLFormJS/demo_simple.html#:~:text=Bob?first_name=Bob&last_name=Smith&email_address=bob@something.com&phone_number=1234567890&subscribe_latest_news=true

Since "?" is not defined by fragment directive, the fragment directive scheme stops at ? and preserves the rest of the URL for another fragment scheme.
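
As an illustration of the "?" idea, the following Go sketch splits a fragment into the part before the directive, the directive itself, and a trailing fragment query. The helper name splitDirectiveAndQuery is hypothetical, and the sketch assumes that a literal ? inside the directive would be percent-encoded; neither is defined by the current spec.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDirectiveAndQuery is a hypothetical helper. The directive starts at
// ":~:" and, under the proposal above, stops at the first unescaped "?".
// Assumption: a literal "?" inside the directive is percent-encoded.
func splitDirectiveAndQuery(fragment string) (before, directive, query string) {
	before, after, found := strings.Cut(fragment, ":~:")
	if !found {
		// No directive present; nothing to split.
		return fragment, "", ""
	}
	// If no "?" follows, the directive runs to the end and query is empty.
	directive, query, _ = strings.Cut(after, "?")
	return before, directive, query
}

func main() {
	before, directive, query := splitDirectiveAndQuery(
		":~:text=Bob?first_name=Bob&last_name=Smith")
	fmt.Printf("before=%q directive=%q query=%q\n", before, directive, query)
}
```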

Another option is to employ some combination of both 0 and 2. An explicit ending delimiter may be defined, while the implicit starting delimiter would be "a character not included in the scheme".

Personally, I think ? serving as the universal fragment scheme ending delimiter is ideal.


Applied background:

Fragment queries are like queries except they appear in the fragment and are not sent to the server by the browser. Since they are not sent to the server, fragment queries are more useful for sensitive information.
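
For readers unfamiliar with fragment queries, here is a minimal Go sketch of how one could be read outside the browser using the standard net/url package. The helper name parseFragmentQuery is hypothetical, and treating a leading ? in the fragment as the marker of a fragment query is the convention described in this thread, not part of any URL standard.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseFragmentQuery is a hypothetical helper that reads key/value pairs
// from a fragment of the form "#?key=value&key2=value2".
func parseFragmentQuery(rawURL string) (url.Values, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	// Use the escaped fragment so that ParseQuery performs the only
	// round of percent-decoding.
	frag := strings.TrimPrefix(u.EscapedFragment(), "?")
	return url.ParseQuery(frag)
}

func main() {
	vals, err := parseFragmentQuery(
		"https://cyphrme.github.io/URLFormJS/demo_simple.html#?first_name=Bob&last_name=Smith")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(vals.Get("first_name"), vals.Get("last_name"))
}
```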

Fragment queries should be allowed to appear after a fragment directive. However, as the spec is written, this is not permitted:

https://cyphrme.github.io/URLFormJS/demo_simple.html#:~:text=Bob?first_name=Bob&last_name=Smith&email_address=bob@something.com&phone_number=1234567890&subscribe_latest_news=true

Only this form is permitted:

https://cyphrme.github.io/URLFormJS/demo_simple.html#?first_name=Bob&last_name=Smith&email_address=bob@something.com&phone_number=1234567890&subscribe_latest_news=true:~:text=Bob
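
The difference between the two forms can be sketched in Go as follows. This is a simplified model of the "strip everything after and including :~:" rule described at the top of this issue, not the browser's actual implementation, and the helper name stripFragmentDirective is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// stripFragmentDirective is a hypothetical helper modelling the current
// rule: everything from the first ":~:" onward is the directive, and only
// the part before it remains visible as the fragment.
func stripFragmentDirective(fragment string) (visible, directive string) {
	before, after, found := strings.Cut(fragment, ":~:")
	if !found {
		return fragment, ""
	}
	return before, after
}

func main() {
	// Query first, directive last: the query survives.
	v, d := stripFragmentDirective("?first_name=Bob&last_name=Smith:~:text=Bob")
	fmt.Printf("visible=%q directive=%q\n", v, d)

	// Directive first: the query is swallowed into the directive.
	v, d = stripFragmentDirective(":~:text=Bob?first_name=Bob&last_name=Smith")
	fmt.Printf("visible=%q directive=%q\n", v, d)
}
```

In the second call the visible fragment is empty, which is the loss of the fragment query that this issue is about.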

bokand commented 1 year ago

Only had a chance to skim quickly, and I'm about to go on vacation for a few days, so I'll take a closer look when I get back, but this seems reasonable at first glance.

Fragment queries should be allowed to appear after a fragment directive.

Why? Is there a reason the fragment query couldn't always be placed before the fragment directive? I can see the convenience argument, it's more ergonomic to simply append it...but this could be solved with better APIs/tooling (e.g. URL.fragmentDirective?) Just wondering if I'm missing something.

zamicol commented 1 year ago

The "I must be last" rule requires all other schemes to give fragment directive positional privilege. With a delimiter, all schemes are egalitarian. As a result, composability is simple.

I don't see good reason for fragment directive to not have some sort of ending delimiter and to expect positional privilege.

Please see my comment here:

https://github.com/mozilla/standards-positions/issues/194#issuecomment-1220905982

bokand commented 1 year ago

I'm having a hard time seeing the issue since there isn't really a concept of a "fragment scheme" (or query, or path, etc.).

I don't see good reason for fragment directive to not have some sort of ending delimiter and to expect positional privilege.

Adding it would significantly increase complexity and edge cases. For example, what do we do if delimiters aren't matched? When are the different "schemes" evaluated (e.g. can I do :~:text=startText:~:some-other-fragment:~:,endText:~:, will that parse to a successful text fragment)? It's certainly possible to create well-defined behavior but this does add to complexity so it has to be balanced against a strong benefit.

Responding to some of the points brought up in https://github.com/mozilla/standards-positions/issues/194#issuecomment-1220905982:
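
The edge cases mentioned above can be made visible with a small Go sketch. It splits a fragment on :~: and alternately labels the pieces as inside or outside a directive. The helper name pairDelimiters and the strict open/close alternation are assumptions made for illustration; how the inside pieces should be combined, if at all, is exactly the open question.

```go
package main

import (
	"fmt"
	"strings"
)

// pairDelimiters is a hypothetical helper. It assumes ":~:" strictly
// alternates between opening and closing a directive, so pieces at odd
// positions are inside a directive and pieces at even positions are not.
func pairDelimiters(fragment string) (inside, outside []string, balanced bool) {
	parts := strings.Split(fragment, ":~:")
	for i, p := range parts {
		if i%2 == 1 {
			inside = append(inside, p)
		} else {
			outside = append(outside, p)
		}
	}
	// len(parts)-1 is the number of delimiters; an odd count means the
	// last directive was opened but never closed.
	balanced = (len(parts)-1)%2 == 0
	return inside, outside, balanced
}

func main() {
	in, out, ok := pairDelimiters(
		":~:text=startText:~:some-other-fragment:~:,endText:~:")
	fmt.Printf("inside=%q outside=%q balanced=%v\n", in, out, ok)

	in, out, ok = pairDelimiters(":~:text=test")
	fmt.Printf("inside=%q outside=%q balanced=%v\n", in, out, ok)
}
```

For the first fragment the directive content arrives in two separate pieces (text=startText and ,endText), so a parser would have to decide whether to join them. The second fragment, which is valid today, has an unmatched delimiter under this model.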

The reason why fragment's ending delimiter is the end of the URL is because the fragment has different behavior from the rest of the URL in that it isn't sent to the server

There's a very close analogue here in that the fragment is made available to page script whereas the fragment directive is not.

If there are ever new URL components, the "I must be last" would likely need to be addressed.

You'd have to address this rule for fragments themselves. Fragment directive is just a special parsing (in HTML mime type) of the fragment, it isn't specified as part of a URL at all.

IMHO, URLs are by design very positional, there's no precedent for delineating e.g. the path or the query, they're defined by their order in the HTTP scheme.

This link (or something like it) should work, but doesn't as the spec is written: https://cyphrme.github.io/URLFormJS/#:~:text=Subscribe%20to%20the%20latest%20news?first_name=Bob&last_name=Smith

I'm struggling to see how this link is generated, given that text fragments aren't made (intentionally) available to the page. Normally a page would just write to the fragment which wouldn't include the text directive. If both the fragment queries and fragment directive are being generated at the same time, why can't the creator just order them with the directive at the end (similarly to how existing URLs have to order regular queries and fragments?).

I think it's fair to say that we lack convenient new APIs and that interaction with existing APIs (e.g. https://github.com/WICG/scroll-to-text-fragment/issues/74) could be improved. I'm not sure adding complexity to fragment parsing is worth it though.

zamicol commented 1 year ago

Fragment query, fragment directive, and others (like those listed by Wikipedia) are examples of fragment schemes. It would be ideal if these schemes are composable and compatible. Imho, composability should take precedence over affording a single scheme positional privilege.

Fragment directive itself can propose a fragment scheme delimiter, offering it as a means of composability.

From before:

I think ? serving as the universal fragment scheme ending delimiter is ideal.

Especially since there's precedent for using ? as a delimiter.

As far as ending delimiters and complexity, ending delimiters are well established in HTTP. For example, here's a simple section of Go code exploding cookies using cookie's two delimiters, ; and =.

import "strings"

// getCookies explodes the given cookies string as a key:value map.
// Pairs are separated by ";" and each key is separated from its value by the first "=".
func getCookies(cookieString string) (cookies map[string]string) {
    cookies = map[string]string{}
    for _, s := range strings.Split(cookieString, ";") {
        ss := strings.SplitN(strings.TrimSpace(s), "=", 2)
        if len(ss) != 2 {
            // Cookie is in an invalid format; skip it.
            continue
        }
        cookies[ss[0]] = ss[1]
    }
    return cookies
}

zamicol commented 1 year ago

On the latest version of Chrome, this is the current behavior for this URL: https://cyphrme.github.io/URLFormJS/demo_simple.html#:~:text=Subscribe?first_name=Bob&last_name=Smith

image

bokand commented 7 months ago

FYI - I'm closing out any remaining non-spec issues as I'm trying to migrate this repo to HTML. At this point this repo is no longer a good place to debate enhancements or proposals. Future work should probably be taken up in the HTML standards venue.

RokeJulianLockhart commented 7 months ago

https://github.com/WICG/scroll-to-text-fragment/issues/193#issuecomment-1854636533

@bokand, then you should be closing them as unplanned rather than completed, else users shall read their e-mails believing that the proposals have been implemented. Additionally, if you have control of the other repository, transfer the issues there. Regardless, specifically where do you propose these be reopened? I don't know what venue you refer to.

bokand commented 7 months ago

Apologies - didn't notice there was a distinction

bokand commented 7 months ago

The eventual destination of this spec will be https://github.com/whatwg/html (edit: fixed link to point to github repo)

RokeJulianLockhart commented 7 months ago

https://github.com/WICG/scroll-to-text-fragment/issues/193#issuecomment-1854668938

@bokand, would it be okay for me to file a new issue for this at https://github.com/whatwg/html/issues/new?assignees=&labels=&projects=&template=0-new-issue.yml pre-emptively?

bokand commented 7 months ago

I don't have any special insight there but it couldn't hurt - worst case is folks there disagree and we can wait or reopen it here...

You might want to reference https://github.com/whatwg/html/issues/8282 which tracks upstreaming this spec to HTML

RokeJulianLockhart commented 7 months ago

https://github.com/WICG/scroll-to-text-fragment/issues/193#issuecomment-1854752228

@bokand, based upon how https://github.com/whatwg/html/issues/8282#issuecomment-1830061794 remains unanswered and that the issue you reference is not yet complete, closure due to https://github.com/WICG/scroll-to-text-fragment/issues/193#issuecomment-1854636533 appears too pre-emptive in this case. I certainly don't believe I can upstream an issue about a specification not yet implemented, as it would contribute to a precedent which might not be accepted by them.

bokand commented 7 months ago

Ok - I'll reopen this so we don't lose it.