WICG / scroll-to-text-fragment

Proposal to allow specifying a text snippet in a URL fragment

Cannot link to text snippet starting with hyphens (ASCII 0x2D) as common in CLI arguments #228

Closed porg closed 1 year ago

porg commented 1 year ago

User Goal

I want to link to a text snippet starting with 1 or 2 hyphens (ASCII 0x2D), as common for Unix command line arguments.

Problem

Linking to an online manpage and to a text fragment starting with 1+ hyphen(s), e.g. to -o or --output, does not jump to that target although that text snippet exists on the linked webpage.

Environment

Examples

https://x265.readthedocs.io/en/master/cli.html#:~:text=--zones

https://x265.readthedocs.io/en/master/cli.html#quality-rate-control-and-rate-distortion-options:~:text=--zones

https://www.ffmpeg.org/ffmpeg-all.html#:~:text=crf

https://www.ffmpeg.org/ffmpeg-all.html#:~:text=-crf

Research

Suspicions

Followup

bokand commented 1 year ago

This is supported, but the hyphen (-) must be percent-encoded, since it is used as part of the text fragment syntax. E.g. https://x265.readthedocs.io/en/master/cli.html#:~:text=%2d%2dzones

"The minus character must not be escaped according to https://github.com/WICG/scroll-to-text-fragment/issues/194#issuecomment-1219951592"

Perhaps you misread the comment? (I could have phrased it more clearly...):

"The set of URL code points, less &, -, ,, can be used without percent encoding. All other code points must be percent encoded."

This means you can use any URL code point other than &, -, and , without percent encoding, but &, -, and , must be percent-encoded.

I'm going to close as I think this is working-as-intended but let me know if I missed something.
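As an illustration of the rule above, here is a minimal Python sketch that builds a text fragment link for a term beginning with hyphens. The helper names (encode_text_term, text_fragment_url) are hypothetical and not part of any specification or library. Note that urllib.parse.quote never encodes "-", so the hyphen is replaced separately; the sketch also encodes more characters than strictly required, which is harmless.

```python
from urllib.parse import quote


def encode_text_term(term: str) -> str:
    """Percent-encode a term for use in a text= directive (hypothetical helper)."""
    # quote() with safe="" encodes "&" and "," (among others) but always
    # leaves "-" untouched, so the hyphen has to be encoded by hand.
    return quote(term, safe="").replace("-", "%2D")


def text_fragment_url(page_url: str, term: str) -> str:
    """Append a single text directive to a page URL that has no fragment yet."""
    return f"{page_url}#:~:text={encode_text_term(term)}"


print(text_fragment_url("https://x265.readthedocs.io/en/master/cli.html", "--zones"))
# prints https://x265.readthedocs.io/en/master/cli.html#:~:text=%2D%2Dzones
```

The hex digits in a percent-encoded byte are case-insensitive, so %2D here is equivalent to the %2d used in the link above.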

porg commented 1 year ago

Thanks for looking into this!

URL code points, less &, -, ,

Safari does string escaping of the text fragment in the address bar partially wrong

I reported this as a bug to Apple: Safari performs string escaping of text fragment in address bar partially wrong: Hyphen and Ampersand MUST be escaped but are NOT (FB12522049)

When I paste different strings into the text fragment of the URL in the address bar and press ENTER, Safari percent-encodes them, but not always correctly:

Screen Recordings

https://github.com/WICG/scroll-to-text-fragment/assets/737143/db1571dd-e7dc-43f7-ac80-4a9626d84d2d

https://github.com/WICG/scroll-to-text-fragment/assets/737143/a2110251-d0e9-44ad-8c3e-07a66617c2a8

https://github.com/WICG/scroll-to-text-fragment/assets/737143/d174181e-b92e-4229-8f31-fb2070236b3a

bokand commented 1 year ago

Browser UI isn't standardized behavior, so Safari may or may not choose to make changes there. That said, I'm not sure their encoding behavior is wrong (I think Chrome behaves the same way?); it depends on context. E.g. if you paste:

example.com#:~:text=foo&text=bar

If the ampersand (&) is escaped, this is a single text directive to foo&text=bar, whereas if it's not escaped, it's two text directives, the first to foo and the second to bar.
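The difference can be sketched in Python. This is a simplified illustration, not how any browser parses the fragment: the function name text_directives is hypothetical, and the sketch treats each text= value as one string, ignoring the comma-separated parts and the prefix/suffix syntax.

```python
from typing import List
from urllib.parse import unquote


def text_directives(url: str) -> List[str]:
    """Return the decoded value of each text= directive in a URL (simplified)."""
    # Everything after the ":~:" delimiter is the fragment directive.
    _, delimiter, directive = url.partition(":~:")
    if not delimiter:
        return []
    terms = []
    # An unescaped "&" separates directives; an escaped one (%26) does not.
    for part in directive.split("&"):
        name, equals, value = part.partition("=")
        if name == "text" and equals:
            terms.append(unquote(value))
    return terms


print(text_directives("example.com#:~:text=foo&text=bar"))    # ['foo', 'bar']
print(text_directives("example.com#:~:text=foo%26text=bar"))  # ['foo&text=bar']
```

Because both readings are valid, the address bar cannot know which one the user meant when an unescaped & is pasted.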