Closed digitalmoksha closed 5 months ago
Would an additional option be necessary? Would the user of just —unsafe
not expect these restrictions removed? Or does that go against the spec? Just trying to figure out whether another option is necessary.
Hmm, I was worried about exposing something that would make it "more unsafe", but if you're already using --unsafe
, then you're already getting normal markdown links with those types of urls, and the <smb:///Volumes/shared/foo.pdf>
syntax works as well.
Their spec does say
An extended url autolink will be recognised when one of the schemes
http://
, orhttps://
, followed by a valid domain, then zero or more non-space non-< characters according to extended autolink path validation
and
An extended protocol autolink will be recognised when a protocol is recognised within any text node. Valid protocols are:
- mailto:
- xmpp:
So obviously it breaks that. And cmark-gfm -e autolink --unsafe
with smb:///Volumes/shared/foo.pdf
as input doesn't recognize it as a link.
The spec doesn't say anything about unsafe
, but the help says Render raw HTML and dangerous URLs
.
So I think you're right, we probably don't need the extra option.
Well, indeed; we might want an option to extend for which schemes autolinks will be made. The current code is uh, a bit weird — likely unnecessarily so — as a result of its direct port from C.
There's also of course a lot of battle-testing included in that weirdness, owing for the differences in safely (as in not-over-eagerly) recognising www.…
links, http://
or https://
ones, as well as mailto:
or xmpp:
ones — what follows each set differs slightly, and trying to apply the same rules to all doesn't work. smb://
should work reasonably the same as http://
-type ones, though, so that shouldn't be too much of an issue.
All the checks are good, just thinking about relaxing the scheme check.
You know, we've already got the relaxed-autolinks
option, that allows detecting in brackets. And we have that option being passed into process_autolinks
What about using that to bypass the SCHEME check in url_match
. Changing this
let cond = |s: &&[u8]| size - i + rewind >= s.len() && &&contents[i - rewind..i] == s;
if !SCHEMES.iter().any(cond) {
return None;
}
into
if !relaxed_autolinks {
let cond = |s: &&[u8]| size - i + rewind >= s.len() && &&contents[i - rewind..i] == s;
if !SCHEMES.iter().any(cond) {
return None;
}
}
My testing shows that works well. And then --unsafe
will still do it's normal thing, allowing vbscript
, etc through.
I went ahead and created https://github.com/kivikakk/comrak/pull/380 using the relaxed-autolinks
Of course we can change it if we need to.
Currently when using the
--unsafe
option withcomrak
, normal markdown links allow all schemes except forjavascript:
,vbscript:
,file:
, anddata:
. So[smb](smb:///Volumes/shared/foo.pdf)
generates<p><a href="smb:///Volumes/shared/foo.pdf">smb</a></p>
I would like autolinks to be able to allow the same schemes, using a new option
--unsafe-autolinks
An example usage is in GitLab, where we filter to allow autolinking
smb:
links. Sosmb:///Volumes/shared/foo.pdf
would autolink.An
--unsafe-autolinks
option would allow thecomrak
user to filter all links based on the schemes they support.