Closed LinuxMercedes closed 5 years ago
This problem is not specific to wiki, but rather to the onUrl module.
The code in question is this:
The \\pP in the trim-start and trim-end calls trims anything in the Unicode punctuation class. This is because people sometimes write things like: Blah blah (see http://example.com).
One fix is to always percent-encode parentheses when pasting. That doesn't really seem to solve exactly this problem, though.
A more proper fix might be a slightly better heuristic: assume that well-formed URLs have balanced parentheses.
That is, we could change our URL matching to treat both Foo (see http://url?with(parenthesis)) and Foo http://url?with(parenthesis) as matching http://url?with(parenthesis), by recognizing that only the last punctuation mark in the first case is actually trailing. This could still be wrong (what about URLs whose parentheses aren't balanced?), but it will probably be more accurate than the current behavior.
Unfortunately, that only works for punctuation that can be balanced. For . the problem will remain, and I can't think of any better heuristic than always including it or always omitting it.
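To make the balanced-parentheses idea concrete, here is a rough sketch in Python. It is not the onUrl module's code: the function name trim_url, the PAIRS table, and the assumption that the candidate is a single whitespace-delimited token are all made up for illustration. It strips leading punctuation as before, but only strips a trailing closing bracket when there is no matching opener left inside the URL. Other trailing punctuation, including ., is always omitted.

```python
import unicodedata

# Hypothetical table of closing brackets and their openers (an assumption,
# not taken from the onUrl module).
PAIRS = {")": "(", "]": "[", "}": "{"}


def is_punct(ch: str) -> bool:
    """True if ch is in a Unicode punctuation category (roughly \\pP)."""
    return unicodedata.category(ch).startswith("P")


def trim_url(candidate: str) -> str:
    """Trim surrounding punctuation from a whitespace-delimited URL token,
    keeping trailing closers that are balanced by an opener in the URL."""
    # Leading punctuation is never part of the URL, so strip all of it.
    start = 0
    while start < len(candidate) and is_punct(candidate[start]):
        start += 1
    url = candidate[start:]

    # Strip trailing punctuation one character at a time.
    while url and is_punct(url[-1]):
        last = url[-1]
        opener = PAIRS.get(last)
        # A closer with at least as many openers in the URL is balanced,
        # so it is assumed to belong to the URL and trimming stops here.
        if opener is not None and url.count(opener) >= url.count(last):
            break
        url = url[:-1]
    return url


if __name__ == "__main__":
    for token in ["http://url?with(parenthesis))",
                  "http://url?with(parenthesis)",
                  "http://example.com)."]:
        print(token, "->", trim_url(token))
```

Counting openers and closers is cruder than tracking nesting, but it is enough for the two examples above. A URL that really does end in an unbalanced closer or in . will still be trimmed incorrectly.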
Seems there's a problem with parentheses in URLs?