markdown-it / linkify-it

Links recognition library with full unicode support
http://markdown-it.github.io/linkify-it/
MIT License
655 stars 63 forks source link

Allowing more than 4 dots #81

Closed openGeeksLab closed 4 years ago

openGeeksLab commented 4 years ago

Why don't you allow more than 4 dots? for example https://www.google.com/search?sxsrf=ACYBGNTJFmX-GjNJ8fM-2LCkqyNyxGU1Ng%3A1575534146332&ei=Qr7oXf7rE4rRrgSEgrmoAw&q=clover&oq=clover&gs_l=psy-ab.3..0i67j0l9.2986.3947..4187...0.2..0.281.1366.1j0j5......0....1..gws-wiz.......0i71j35i39j0i131.qWp1nz4IJVA&ved=0ahUKEwj-lP6Iip7mAhWKqIsKHQRBDjUQ4dUDCAs&uact=5 It's just google links and there are more than 4 dots, is it wrong link for you?

with help just a little change I've given the opportunity to recognize such links

Please merge it, or explain to me why it works in such a way

puzrin commented 4 years ago

Such kind of modules should not produce false positive (fetch tail which is not link). So, it's very dangerous to create too wide rules in anvance.

Approach is to create as narrow rules as possible, and extend those after bug reports if no conflicts found.

I understand your case, and have questions:

openGeeksLab commented 4 years ago

@puzrin Thank you for a quick answer. 1) The dots in the end, can't have a payload, so they shouldn't be parsed 2) If we're talking about links from google search, it can be any amount of dots, it directly depends on the request for example https://www.google.com/search?sxsrf=ACYBGNQIsAdD2HHcLKxVjn2b6o55dScWaA%3A1575558086811&ei=xhvpXbOAMaiorgSExIaACQ&q=interesting+stories.......&oq=interesting+stories.......&gs_l=psy-ab.3..0i22i30l10.2395.12446..15641...7.2..0.121.1331.13j1......0....1..gws-wiz.......0i71j35i39j0i67j0j0i203.CRMR3t9i-rw&ved=0ahUKEwiz-9ig457mAhUolIsKHQSiAZAQ4dUDCAs&uact=5

puzrin commented 4 years ago

https://github.com/markdown-it/linkify-it/blob/master/test/fixtures/links.txt#L168 - here is place to insert your link (with one empty line before and after)

openGeeksLab commented 4 years ago

I had to change regex a little, tests were passed, please check

puzrin commented 4 years ago

I had to change regex a little, tests were passed, please check

This rules are euristic, very fragile. We can not do "little changes" without explanation. Need details, why additional change was done.

openGeeksLab commented 4 years ago

yeah, I agree with you. I need time to figure out into it more deeply. What if I remove these last changes, and will prepare new PR for this? Will it be ok for you?

puzrin commented 4 years ago

What if I remove these last changes, and will prepare new PR for this? Will it be ok for you?

You can just rewrite history and do forced push here, in the same PR. That will work.

As far as i underastand, only 3 changes needed:

  1. Remove 4 from regexp
  2. Add comment about multiple dots
  3. Add your link to tests

If tests passe - i'm ok to merge. If more regexp changes needed - that should be reviewed/discussed.

openGeeksLab commented 4 years ago

@puzrin please, check

oleg-rdk commented 4 years ago

@puzrin is there something else that holds this PR? if so, let us know, so we can help & merge it quicker