markdown-it / linkify-it

Links recognition library with full unicode support
http://markdown-it.github.io/linkify-it/
MIT License

Domain with two dashes not being detected #63

Closed xPaw closed 5 years ago

xPaw commented 6 years ago

https://5b0ee223b312746c1659db3f--thelounge-chat.netlify.com/docs/ is a valid subdomain, but it's not being linkified.

From the looks of it, hyphens are allowed except for certain restrictions: https://tools.ietf.org/html/rfc5891#section-4.2.3.1

puzrin commented 6 years ago

Thanks for the report and the clear sample.

See https://github.com/markdown-it/linkify-it/blob/cbc0833d3355dc04122c7a666122a61e58392555/lib/re.js#L104-L106. I think this can be added via options (disabled by default). Is this really necessary (need your opinion)?

xPaw commented 6 years ago

Looks like there are a couple of places where there are URL restrictions due to markdown. Being able to disable all of these markdown restrictions sounds like a good idea.

puzrin commented 6 years ago

After thinking a bit... no options needed. Two dashes should just work, according to RFC.

I can't promise to do this soon, but a PR will be accepted without delay.

xPaw commented 6 years ago

Would the fix be just removing these lines? That seems to make it the same regex as src_domain_root.

https://github.com/markdown-it/linkify-it/blob/cbc0833d3355dc04122c7a666122a61e58392555/lib/re.js#L103-L107

It's a bit hard to tell from just looking at the code.

puzrin commented 6 years ago

No, src_domain_root (top level domain) does not allow dashes at all. This line should exist, but be more clever. I can't say more immediately. Those regexes are mad even with comments :)

xPaw commented 6 years ago

For info: There are domains with two dashes too: www.a--b.com www.c--u.com

Or even three dashes: http://a---b.com/

puzrin commented 6 years ago

Those are not TLDs. TLDs are .com, .eu, etc.

xPaw commented 6 years ago

Yeah my bad.

puzrin commented 6 years ago

Never mind, I think we understand each other. The fixed rule should be "allow dashes according to RFC in all domain parts except TLD".
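As a rough illustration of that rule, the sketch below builds a hostname pattern where every label except the last may contain runs of hyphens in the middle, while the TLD may not contain any. The names `LABEL`, `TLD`, `HOST_RE`, and `looksLikeHost` are made up for this example. The patterns are not the ones in linkify-it's `lib/re.js`; they are ASCII-only and skip the RFC's third-and-fourth-position check.

```javascript
// Illustrative only: these patterns are NOT the ones in linkify-it's lib/re.js.
// Non-TLD label: ASCII letters/digits, with hyphens allowed in the middle
// (any run length), but never as the first or last character.
const LABEL = '[a-z0-9](?:[a-z0-9-]*[a-z0-9])?';
// TLD: letters only, no hyphens (a simplification that ignores IDN TLDs).
const TLD = '[a-z]{2,}';
// One or more dot-terminated labels followed by the TLD, case-insensitive.
const HOST_RE = new RegExp('^(?:' + LABEL + '\\.)+' + TLD + '$', 'i');

function looksLikeHost(host) {
  return HOST_RE.test(host);
}

// Hosts mentioned in this thread, plus one with a dash in the TLD position.
const samples = [
  '5b0ee223b312746c1659db3f--thelounge-chat.netlify.com',
  'www.a--b.com',
  'a---b.com',
  'example.c-m',
];
for (const s of samples) {
  console.log(s, looksLikeHost(s));
}
```

Keeping the TLD pattern separate from the label pattern mirrors the distinction made above: dashes are a question for the inner domain parts only.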

astorije commented 6 years ago

@puzrin, is there any chance you could take a look at this? We switched to linkify-it (in https://github.com/thelounge/thelounge/pull/2397) a few months back, and we are about to release a new major version of our project. It would be so nice if this were fixed :)