Closed pintassilgo closed 1 month ago
So you fully considered needed chars by "links_regex"? No manual work needed?
Sorry, I didn't understand your comment.
I'm proposing to update the default value of links_regex, adding at least ),.`; as chars that must not be included in the URL when they appear at its right edge.
Maybe some other chars too, like “”’‘?! I would include these too, but I understand if you disagree.
Actually, I thought a little more about it and I believe ? and ! are also a must.
So my proposed updated version for default.json:
"links_regex": "\\b(mailto:)?\\w[\\w\\-\\+\\.]*@\\w[\\w\\-\\.]*\\.\\w{2,}\\b|\\b(https?://|ftp://)\\w[\\w\\-\\.@]*(:\\d+)?(/([~\\w\\.\\-\\+\\/%@!%]|\\(.*?\\))*)?(\\?[^<>'\"),.`;!?\\s]+)?(\\#[\\w\\-\\./%:!]*)?",
The added part, compared to the current release, is: ),.`;!?
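To see what the proposed value does at the right edge of a query string, here is a minimal Python sketch. It assumes Python's `re` module treats this pattern the same way CudaText's regex engine does, which is not verified here; `CONFIG_LINE` and `first_link` are hypothetical names used only for this illustration.

```python
import json
import re

# The proposed "links_regex" value from this comment, kept JSON-escaped and
# wrapped in braces so json.loads can decode it (split only for line length).
CONFIG_LINE = (
    r'{"links_regex": "\\b(mailto:)?\\w[\\w\\-\\+\\.]*@\\w[\\w\\-\\.]*\\.\\w{2,}\\b'
    r'|\\b(https?://|ftp://)\\w[\\w\\-\\.@]*(:\\d+)?'
    r'(/([~\\w\\.\\-\\+\\/%@!%]|\\(.*?\\))*)?'
    r"""(\\?[^<>'\"),.`;!?\\s]+)?"""
    r'(\\#[\\w\\-\\./%:!]*)?"}'
)

# Assumption: Python's re is close enough to CudaText's engine for this test.
LINKS_RE = re.compile(json.loads(CONFIG_LINE)["links_regex"])


def first_link(text):
    """Return the first link found in text, or None (hypothetical helper)."""
    match = LINKS_RE.search(text)
    return match.group(0) if match else None


print(first_link("See https://example.com/?q=1, then"))  # https://example.com/?q=1
print(first_link("Is it https://example.com/?ok=yes?"))  # https://example.com/?ok=yes
```

Because the excluded chars sit inside the negated class of the query part, the query stops at the first of them wherever it appears, not only at the right edge.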
If you agree to also add “”’‘, great, but I'm fine if you reject. These ones aren't important.
Applied your fix, thanks.
About “”’‘: they are not ASCII, so they are hard to include in ASCII Pascal code. Or maybe not. Anyway, I skipped them.
Thanks. You can close this whenever you want, but I believe you also intend to update default.json to reflect the change you made.
updated default.json too. closing.
I just noticed a new issue on this topic (suggest to reopen).
Sometimes a URL contains numbers separated by dots, usually an IP address. Cuda breaks those links at the dot. The same applies to the comma and other chars, but . and , are the most affected.
The code should be improved to stop the link only when these chars are followed by \s.
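The rule above (stop only when these chars are followed by whitespace) can be sketched with a lookahead. This is a simplified illustration in Python, not CudaText's actual links_regex; `TRAILING`, `LINK_RE` and `find_links` are hypothetical names, and the sketch also treats the end of the text like whitespace.

```python
import re

# Chars from the earlier proposal that should be dropped at the right edge.
TRAILING = ").,`;!?"

# Hypothetical simplified pattern: take non-space chars lazily and stop as
# soon as everything left before the next whitespace (or the end of the
# text) is trailing punctuation. The same chars in the middle are kept.
LINK_RE = re.compile(
    r"https?://\S+?(?=[" + re.escape(TRAILING) + r"]*(?:\s|$))"
)


def find_links(text):
    """Return all link candidates found in text."""
    return [m.group(0) for m in LINK_RE.finditer(text)]


print(find_links("https://example.com/?180.200.208.36"))
# ['https://example.com/?180.200.208.36']
print(find_links("See https://example.com/?15,50, or https://example.com/?a;a."))
# ['https://example.com/?15,50', 'https://example.com/?a;a']
```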
Try pasting this in Cuda:
https://example.com/?180.200.208.36
https://example.com/?15,50
https://example.com/?a)a
https://example.com/?a]a
https://example.com/?a>a
https://example.com/?a'a
https://example.com/?a"a
https://example.com/?a`a
https://example.com/?a;a
Then you can think about where the link should break for each one.
Current results:
The dot (.), comma (,) and semicolon (;) surely need to be fixed. Others I'm not sure.
> and `. In all other cases, the link goes till the end of the line.
.,; are allowed in the middle of links, other chars aren't.
Let's also see what the Markdown behavior is:
https://example.com/?180.200.208.36
https://example.com/?15,50
https://example.com/?a)a
https://example.com/?a]a
https://example.com/?a>a
https://example.com/?a'a
https://example.com/?a"a
https://example.com/?a`a
https://example.com/?a;a
So markdown follows all links to the end...
Edit: back to the initial report, I believe : should also be removed from the URL when it appears at the end.
Made the fix. now it's better? http://uvviewsoft.com/c/
Yes, fixed the cases from my previous comment and also escaped almost all the remaining cases from initial report, becoming more similar to Sublime. Thanks.
- and + should be included in the URL when they are the last char, what do you think? They are commonly used in some encodings. By fixing that, I guess we're done.
Should link the entire line: https://example.com/?ok+ Also: https://example.com/?ok-
Fixed more, for plus/minus chars.
More: http://example.com/A&E
This is a complete link for Markdown, VSCode, Sublime... but not for Cuda, in which the link currently ends before the &.
Edit: more examples.
Full link in all three (Markdown, VSCode and Sublime):
http://example.com/A*e
http://example.com/A=e
http://example.com/A{e
http://example.com/A[e
http://example.com/A$e
http://example.com/A(e
http://example.com/A|e
http://example.com/A;e
http://example.com/A,e
Full link in Markdown and Sublime, but not in VSCode (at least the closing chars } ] ) should be fixed, because the opening chars { [ ( are parsed):
http://example.com/A}e
http://example.com/A]e
http://example.com/A)e
http://example.com/A'e
http://example.com/A"e
http://example.com/A`e
http://example.com/A"e
Fixed, thanks.
Last fix broke parsing links in markdown format, example:
[kdotool](https://github.com/jinliu/kdotool/releases/latest/).
Link should end at the last /, but Cuda is also including the closing paren and the dot that follow it.
Some closing chars such as ]})"' and also ` should not be included in the link when the following char is a word delimiter such as space, dot, comma, semicolon, colon or line break (.,;:\n).
Edit: other example of this issue:
httpChannel.setRequestHeader('Referer', 'https://www.google.com.br/', false);
In Cuda, the link includes the trailing quote and comma (',) instead of ending at the /.
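One way to picture the behavior requested in this comment is a post-processing step: take the link candidate up to the next whitespace, then strip closing chars and word delimiters from its right edge. This is only a sketch under that assumption, not how CudaText implements it; `CLOSERS`, `DELIMS` and `link_end_trimmed` are hypothetical names.

```python
import re

# Closing chars and word delimiters named in this comment.
CLOSERS = "]})\"'`"
DELIMS = ".,;:"


def link_end_trimmed(text):
    """Find the first http(s) candidate and trim its right edge.

    Hypothetical helper: returns None when the text contains no candidate.
    """
    match = re.search(r"https?://\S+", text)
    if match is None:
        return None
    # rstrip removes any run of the listed chars from the right edge only,
    # so the same chars in the middle of the link are left alone.
    return match.group(0).rstrip(CLOSERS + DELIMS)


markdown_line = "[kdotool](https://github.com/jinliu/kdotool/releases/latest/)."
code_line = "httpChannel.setRequestHeader('Referer', 'https://www.google.com.br/', false);"

print(link_end_trimmed(markdown_line))  # https://github.com/jinliu/kdotool/releases/latest/
print(link_end_trimmed(code_line))      # https://www.google.com.br/
```

A known limitation of this simple trim is that it also removes a closing paren that belongs to the URL, which the existing \(.*?\) part of links_regex is meant to keep.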
Thanks for notice, will see how to fix the regex.
Fixed. will change in default.json soon.
I guess there's no official standard, any char can be part of the URL. But some, when appearing at the end, are usually treated as boundary and not included in links.
Above you see markdown rules, used by GitHub and many popular webpages and webapps. I just typed the URLs as plain text and they were automatically parsed to generate links when I submitted this comment.
You can choose which set of chars to escape, but some important ones are universally treated as not part of the URL, and Cuda must remove them from the URL:
From what I see, currently only the last three are escaped in Cuda.
If it were up to me, I'd follow Sublime's rules, so these would be exceptions too:
But I understand if you disagree.
My main request here is to add ) , . ` and ; as exceptions by default, just like ' and " already are.
user.json line updated with the main request chars added:
The added part was ),.`;