twitter / twitter-text

Twitter Text Libraries. This code is used at Twitter to tokenize and parse text to meet the expectations for what can be used on the platform.
https://developer.twitter.com/en/docs/counting-characters
Apache License 2.0

Links with Text Fragments risk being parsed incorrectly #327

Open tomayac opened 4 years ago

tomayac commented 4 years ago


Links with Text Fragments risk being parsed incorrectly.

Steps to reproduce

Try to share:

https://developer.mozilla.org/en-US/docs/Web/API/ServiceWorkerGlobalScope/skipWaiting#wikiArticle:~:text=Use%20this%20method%20with%20Clients.claim()%20to,client%20and%20all%20other%20active%20clients.

Actual behavior

This gets parsed as:

https://developer.mozilla.org/en-US/docs/Web/API/ServiceWorkerGlobalScope/skipWaiting#wikiArticle:~:text=Use%20this%20method%20with%20Clients.claim (the link ends just before the opening parenthesis)

[Screenshot, 2020-06-19 14:43:00]

(Source tweet)

Note that parentheses are valid in URLs; encodeURIComponent leaves them unescaped:

encodeURIComponent('(')
// '('
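
As a further check, here is a minimal sketch that runs the URL from this report through the standard URL parser available in browsers and Node.js. The variable names are only illustrative. It shows what a spec-compliant URL parser does with the fragment, not what twitter-text does internally.

```javascript
// The URL from this report, with parentheses inside the fragment.
const link =
  'https://developer.mozilla.org/en-US/docs/Web/API/ServiceWorkerGlobalScope/skipWaiting' +
  '#wikiArticle:~:text=Use%20this%20method%20with%20Clients.claim()%20to,' +
  'client%20and%20all%20other%20active%20clients.';

const parsed = new URL(link);

// The fragment keeps "()" and the trailing "." after parsing.
console.log(parsed.hash.includes('Clients.claim()'));
console.log(parsed.hash.endsWith('clients.'));

// Serializing the parsed URL gives back the same string.
console.log(parsed.href === link);
```

All three checks should log true, so the parentheses themselves are not what makes the URL invalid.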

Expected behavior

The link should be parsed in full. A pattern to look for in URLs would be :~:text= (the marker that introduces a Text Fragment).
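
A rough sketch of what looking for that pattern could mean in practice. The helper name findTextDirectives is hypothetical and not part of twitter-text. It assumes the fragment directive starts at the first :~: in the fragment and that multiple directives are separated by &.

```javascript
// Delimiter that separates the regular fragment from fragment directives.
const FRAGMENT_DIRECTIVE_DELIMITER = ':~:';

// Hypothetical helper: returns the raw values of all text directives in a URL,
// or an empty array when the URL has none.
function findTextDirectives(url) {
  const hashIndex = url.indexOf('#');
  if (hashIndex === -1) return [];

  const fragment = url.slice(hashIndex + 1);
  const delimiterIndex = fragment.indexOf(FRAGMENT_DIRECTIVE_DELIMITER);
  if (delimiterIndex === -1) return [];

  // Everything after ":~:" is the directive part, e.g. "text=foo&text=bar".
  const directives = fragment.slice(
    delimiterIndex + FRAGMENT_DIRECTIVE_DELIMITER.length
  );

  return directives
    .split('&')
    .filter((directive) => directive.startsWith('text='))
    .map((directive) => directive.slice('text='.length));
}

console.log(findTextDirectives('https://example.com/page#intro:~:text=foo()%20bar,baz.'));
console.log(findTextDirectives('https://example.com/page#intro'));
```

A parser that detects a text directive this way could then treat the rest of the fragment as part of the link instead of stopping at the first parenthesis or at trailing punctuation.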

tomayac commented 4 years ago

When testing Text Fragments URLs, recall that t.co links don't work with text fragments due to crbug.com/1055455.

comp615 commented 4 years ago

Spec definition: https://wicg.github.io/scroll-to-text-fragment/#syntax
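
For reference, a text directive value has the general shape [prefix-,]start[,end][,-suffix]. Below is a minimal sketch of splitting one value into those parts. The helper name parseTextDirective is hypothetical, and the code assumes that literal commas and dashes inside the text are percent-encoded, so splitting on "," is safe. It does not implement the full grammar.

```javascript
// Hypothetical helper: splits one text directive value (the part after
// "text=") into its components. Returns null if the shape is not recognized.
function parseTextDirective(value) {
  const parts = value.split(',');
  const result = {};

  // A leading part that ends with "-" is the prefix.
  if (parts.length > 1 && parts[0].endsWith('-')) {
    result.prefix = decodeURIComponent(parts.shift().slice(0, -1));
  }

  // A trailing part that starts with "-" is the suffix.
  if (parts.length > 1 && parts[parts.length - 1].startsWith('-')) {
    result.suffix = decodeURIComponent(parts.pop().slice(1));
  }

  // What remains must be "start" or "start,end".
  if (parts.length < 1 || parts.length > 2 || parts[0] === '') return null;

  result.start = decodeURIComponent(parts[0]);
  if (parts.length === 2) result.end = decodeURIComponent(parts[1]);
  return result;
}

// The directive from the URL in this issue.
console.log(
  parseTextDirective(
    'Use%20this%20method%20with%20Clients.claim()%20to,' +
      'client%20and%20all%20other%20active%20clients.'
  )
);
```

For the URL in this issue, the start text ends in "Clients.claim() to" and the end text ends in "clients.", so both the parentheses and the final period belong to the directive.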