Closed jugglingcats closed 8 years ago
https://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names
Such hostnames are invalid according to the RFC.
I see no practical reason to add such a case to the linkifier defaults.
Thanks for the quick response! I don't see anything in the RFC restrictions that prohibits a name like "test1" as a hostname. It says a hostname must consist of letters and digits, and talks about dot delimiters, but gives examples of "unqualified hostname such as csail or wikipedia".
It is quite common to use unqualified hostnames within internal networks and therefore within internal documentation (intranet), and often these hosts will have digits, eg. nexus3, intranet2, etc.
Is there any way to override/customise linkify-it to handle this case (in the context of vanilla markdown-it -- bower version)?
Many thanks for a great library.
Sure, you can modify anything you wish. All rules and partials are exposed on the class instance, and you can override them with your own. It's best to dig into the code for details.
Any tips on where to look? Can it be done on the linkify-it object, or do I need to implement completely custom linkification? The code is well structured but complex, so the best place to intercept/override is not always clear. I have only been using markdown-it for 24 hours... ;)
You are right. It seems I misunderstood the hostname requirements, and this bug should be fixed.
Do I understand correctly that even pure digits without letters are OK? (http://123/foo, http://123.local/foo, http://123.example.com)
Any tips on where to look?
If you ever decide to customize rules, look at the existing ones: https://github.com/markdown-it/linkify-it/blob/1.2.1/index.js#L50.
Regex stubs are available in the .re property of the linkifier instance. See what happens in .compile().
I don't know if just '123' as a hostname is valid according to the RFC, but assume someone could put it in their private DNS or even host file entry.
There are definitely pure 'digit' hostnames out on the web: 999.com is resolvable.
And hostnames starting with digits are quite common, eg: https://123-reg.co.uk.
Thanks for the code pointers. Will take a look...
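For reference, the hostname syntax being discussed can be sketched as a small check. This is only an illustration of the commonly cited RFC 952 / RFC 1123 label rules (letters, digits and hyphens, 1 to 63 characters per label, no hyphen at either end, at most 253 characters overall); `isValidHostname` and `LABEL_RE` are hypothetical names and not part of linkify-it. It checks syntax only and does not settle whether an all-digit name such as 123 should be linkified.

```javascript
// Sketch of the RFC 952 / RFC 1123 label syntax rules (not linkify-it code).
// Each dot-separated label: 1-63 characters, letters, digits or hyphens,
// with no hyphen at either end. Digits are allowed in any position.
const LABEL_RE = /^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/i;

function isValidHostname(name) {
  // The overall length limit is commonly given as 253 characters.
  if (name.length === 0 || name.length > 253) return false;
  // Every label must pass; an empty label (e.g. "a..b") fails the regex.
  return name.split(".").every((label) => LABEL_RE.test(label));
}

console.log(isValidHostname("test1"));         // true
console.log(isValidHostname("nexus3"));        // true
console.log(isValidHostname("999.com"));       // true
console.log(isValidHostname("123-reg.co.uk")); // true
console.log(isValidHostname("-bad.example"));  // false
```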
As far as I understand, the pending changes are:
http://, https://, ...):
http://123.com
//
Thoughts? Can this give false positives?
Your call, but I'm not sure why you wouldn't treat http://999/index.html as a link, or even just http://999 for that matter. They look like clear links to me...
As for //, I wouldn't use it myself, especially if there are different linkify rules for it.
Ok, reasonable.
http:// -> allow digits everywhere & don't check for TLDs at all (will match local domains, public domains & IPs)
// -> drop locals, require allowed TLDs, allow digits before TLDs (OR allow IP address).
google.com -> TLDs required, prohibit locals & IPs
me@gmail.com -> the same as above, but maybe allow IPs.
Looks good to me!
It should work for all allowed protocols (for example, we are using linkify.add("rest:") to support custom link types).
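The four rules proposed above can be sketched as a small decision function. This is a hypothetical illustration, not linkify-it internals: `hostAllowed`, `hasKnownTld`, the `kind` labels and the sample TLD list are all assumptions made for the example, and the two "maybe / OR allow IP" cases are read here as "allowed".

```javascript
// Hypothetical sketch of the four proposed rules; names and the TLD list
// are illustrative assumptions, not linkify-it internals.
const KNOWN_TLDS = new Set(["com", "org", "net", "ru", "uk"]); // sample only
const OCTET = "(?:25[0-5]|2[0-4]\\d|1?\\d?\\d)";
const IPV4_RE = new RegExp(`^(?:${OCTET}\\.){3}${OCTET}$`);

function hasKnownTld(host) {
  const labels = host.toLowerCase().split(".");
  // A single label ("test1", "999") is a local name, not a TLD match.
  return labels.length > 1 && KNOWN_TLDS.has(labels[labels.length - 1]);
}

// kind: "schema"            (http://, https://, custom protocols)
//       "protocol-relative" (//)
//       "fuzzy"             (bare google.com)
//       "email"             (me@gmail.com)
function hostAllowed(kind, host) {
  const isIp = IPV4_RE.test(host);
  switch (kind) {
    case "schema":
      return host.length > 0;           // digits anywhere, no TLD check
    case "protocol-relative":
      return isIp || hasKnownTld(host); // drop locals; IPs assumed allowed
    case "fuzzy":
      return !isIp && hasKnownTld(host); // TLD required, no locals or IPs
    case "email":
      return isIp || hasKnownTld(host); // as fuzzy, IPs assumed allowed
    default:
      return false;
  }
}

console.log(hostAllowed("schema", "test1"));          // true
console.log(hostAllowed("protocol-relative", "999")); // false
console.log(hostAllowed("fuzzy", "google.com"));      // true
console.log(hostAllowed("fuzzy", "127.0.0.1"));       // false
```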
If you have time, please post here any links that are not working now and should be fixed.
So far the only one I found is a trailing digit in the hostname. The linkifier even works with REST paths with variables, such as http://markdown-it.github.io/linkify-it/#t1=http%3A%2F%2Ftest%2Ftest%2F%7Bname%7D.
But I will post any more I find. Thank you.
There is a problem with links like this one if we allow domain parts to be all digits:
https://www.google.ru/maps/@59.9393895,30.3165389,15z?hl=ru
The part before @ is treated as the username and 59.9393895 as the domain. Then , terminates the scan because the rest is an invalid path.
We need to handle the @ case somehow.
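The failure mode can be reproduced with a simplified pattern. The two regexes below are assumptions written for this illustration and are not linkify-it's actual expressions or its eventual fix. The first lets any non-space run before @ count as user info; the second excludes / from user info, in line with the URI syntax where the authority part ends at the first slash.

```javascript
const url = "https://www.google.ru/maps/@59.9393895,30.3165389,15z?hl=ru";

// Naive: anything non-space before "@" counts as user info.
const NAIVE_RE = /^https?:\/\/(?:(\S+)@)?([a-z0-9.-]+)/i;
// Stricter: user info may not contain "/", so an "@" in the path is ignored.
const STRICT_RE = /^https?:\/\/(?:([^\s\/@]+)@)?([a-z0-9.-]+)/i;

const naive = url.match(NAIVE_RE);
const strict = url.match(STRICT_RE);

// Group 1 is the detected user info, group 2 the detected host.
console.log(naive[1], naive[2]);   // www.google.ru/maps/ 59.9393895
console.log(strict[1], strict[2]); // undefined www.google.ru
```

With the naive pattern the host stops at the comma, which matches the behaviour described above once all-digit domain parts are accepted.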
http://markdown-it.github.io/linkify-it/#t1=http%3A%2F%2Ftest1%2Ftest.pdf