Closed gibson042 closed 2 years ago
Originally reported at https://github.com/keybase/client/issues/22453 as a failure of the Keybase client to linkify https://en.wikipedia.org/wiki/Dunning–Kruger_effect .
The issue seems to stem from pathCont being too narrowly defined; it does not include the full range specified in RFC 3987:
ipchar      = iunreserved / pct-encoded / sub-delims / ":" / "@"
…
iunreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" / ucschar
ucschar     = %xA0-D7FF / %xF900-FDCF / %xFDF0-FFEF
            / %x10000-1FFFD / %x20000-2FFFD / %x30000-3FFFD
            / %x40000-4FFFD / %x50000-5FFFD / %x60000-6FFFD
            / %x70000-7FFFD / %x80000-8FFFD / %x90000-9FFFD
            / %xA0000-AFFFD / %xB0000-BFFFD / %xC0000-CFFFD
            / %xD0000-DFFFD / %xE1000-EFFFD
"https://en.wikipedia.org/wiki/Dunning–Kruger_effect" contains U+2013 EN DASH (–), which falls in the %xA0-D7FF range of ucschar. Its General_Category is Dash_Punctuation (Pd), a category that xurls.go erroneously leaves out of midChar, endChar, etc.
Thanks for reporting, and for the detailed investigation! Would you like to send a PR?