Open kadamwhite opened 1 year ago
I can replicate this issue
WordPress 6.1.1, Gutenberg 14.6.1, TT3 theme
I think protocols are not limited to http or https. Therefore, I think it is correct behavior that isURL() determines movie: as a URL with a custom protocol.
Also, the code you suggested determines that URLs with protocols such as mailto: and tel: are not URLs. My sense is that many users expect these URLs to be pasted as links.
However, since I think few users will paste URLs with irregular custom protocols, it might be a good idea to treat only the major protocols as valid URLs.
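A minimal sketch of that idea, assuming "major protocols" means http, https, mailto and tel. The helper name hasMajorProtocol and the allowlist contents are hypothetical and are not taken from Gutenberg's code:

```javascript
// Hypothetical allowlist; which protocols count as "major" is an assumption.
const MAJOR_PROTOCOLS = new Set( [ 'http:', 'https:', 'mailto:', 'tel:' ] );

// Returns true only if the text parses as a URL whose protocol is allowlisted.
function hasMajorProtocol( text ) {
	try {
		// The URL constructor lowercases the scheme, so "HTTPS:" matches "https:".
		return MAJOR_PROTOCOLS.has( new URL( text ).protocol );
	} catch {
		// The constructor throws a TypeError for strings it cannot parse.
		return false;
	}
}
```

With this check, Movie: The Revenge of Sequels II is rejected because movie: is not in the list, while mailto: and tel: links are still accepted.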
I noticed that the same issue was discussed in #24895. And PR #28534 is the solution to this problem.
Also, as noted in this comment, an approach using the isValidHref function seems to have been attempted.
@t-hamano Thank you for finding the older issue and PR, I had not been able to locate them in my own searching!
I do believe that it's unusual for an href to have a space in it. Apple's documentation around tel: links, for example, gives all examples in the format tel:1-.... where there is no space after the protocol. The linked PR by @mrclay checks /^\s/.test( parsed.pathname ) and assumes any pathname starting with whitespace is not valid, which matches my own assumptions.
This issue definitely describes the same problem as #24895.
I believe this was fixed via #53000. The link pasting feature only allows http(s) protocols.
The focus of whether this issue can be closed might be whether protocols other than http(s) should also be considered URLs.
According to this comment, protocols such as tel: and mailto: seem to be considered rare. What do you think?
We could update the regex and add those two, but sites usually use contact forms to avoid spam, and I've not seen the tel: protocol in a while.
@kadamwhite, what do you think?
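To make the trade-off concrete, here is a rough sketch of what an http(s)-only pattern and an extended variant could look like. Both patterns are illustrative assumptions and are not the regex that #53000 actually introduced:

```javascript
// Hypothetical patterns for illustration only.
// Accepts only http:// or https:// followed by non-whitespace characters.
const HTTP_ONLY = /^https?:\/\/\S+$/i;

// Same idea, extended to also accept mailto: and tel: links.
const WITH_CONTACT_SCHEMES = /^(?:https?:\/\/|mailto:|tel:)\S+$/i;

// Small helper so both patterns can be compared on the same input.
function matchesPattern( pattern, text ) {
	return pattern.test( text );
}
```

Either pattern rejects Movie: The Revenge of Sequels II. Only the extended one accepts tel:1-408-555-5555 or mailto:someone@example.com.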
Description
When selecting some content and trying to paste other content into its place, certain strings are interpreted as links which should not be. This causes the selected text to be converted to an invalid link, instead of replacing the selected text with the clipboard contents.
This occurs because the check for whether a text isURL is to run it through the browser's URL constructor, and any non-error result is treated as proof the string is a URL. Passing a string formatted Word: Other Words into this constructor results in a valid URL object, with not-really-URL contents. For example, the string Movie: The Revenge of Sequels II in the example below gets parsed as such: movie: is interpreted as a protocol, and " The Revenge of Sequels II" is treated as the pathname. (Note the leading space in that string.) This is valid on the part of URL, but the likelihood of a user consciously pasting a valid URL with a leading space in the pathname seems excruciatingly low.
I propose that we should also check the first character of the pathname on the generated URL object, and assume the content is not a URL if that character is whitespace.
It's possible I am overlooking some category of URLs where a leading space in the pathname would still be valid, but this heuristic works with every link type I can think of.
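The heuristic described above could be sketched roughly as follows. The function name looksLikePastedURL is hypothetical and this is not the actual isURL() implementation. It only assumes the standard URL constructor, which throws on unparseable input:

```javascript
// Hypothetical helper illustrating the proposed check; not Gutenberg's isURL().
function looksLikePastedURL( text ) {
	let parsed;
	try {
		parsed = new URL( text );
	} catch {
		// Strings the URL constructor cannot parse are not URLs.
		return false;
	}
	// "Movie: The Revenge of Sequels II" parses with protocol "movie:" and a
	// pathname that starts with a space, so it is rejected here.
	return ! /^\s/.test( parsed.pathname );
}
```

Under this sketch, https://wordpress.org/, mailto:someone@example.com and tel:1-408-555-5555 still count as URLs, because none of their pathnames begin with whitespace.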
Step-by-step reproduction instructions
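1. Copy the text Movie: The Revenge of Sequels II to the clipboard
2. Paste the clipboard contents (Movie: The Revenge of Sequels II) over the text "Replace me"
3. Observe that "Replace me" is converted to a link instead of being replaced with Movie: The Revenge of Sequels II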
Screenshots, screen recording, code snippet
Environment info
Please confirm that you have searched existing issues in the repo.
Yes
Please confirm that you have tested with all plugins deactivated except Gutenberg.
Yes