robinst / linkify

Rust library to find links such as URLs and email addresses in plain text, handling surrounding punctuation correctly
https://robinst.github.io/linkify/
Apache License 2.0
201 stars, 12 forks

Support schemaless GitHub/GitLab urls #42

Open egrieco opened 2 years ago

egrieco commented 2 years ago

URLs such as git@github.com:robinst/linkify.git are common when working with GitHub/GitLab repositories. It would be good if linkify could reliably detect the whole URL in such cases.

[Screenshot: linkify web demo partially detecting GitHub URLs]

I'm opening this issue to make others aware as well as to remind myself to fix this when I have time.

Related issues: #17 and #39

robinst commented 2 years ago

Hm, not sure about this, as it's not really a URL in the strictest sense. The full form, like ssh://git@github.com/robinst/linkify.git, is a URL and works with this library today.
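To make the relationship between the two forms concrete, here is a minimal sketch of rewriting the short form into the full ssh:// form. The function name `scp_to_ssh_url` is hypothetical and not part of linkify. It assumes that on GitHub both forms address the same repository; on a general SSH host a relative path in the short form is resolved against the user's home directory, so the rewrite is not exact there.

```rust
/// Rewrites a schemeless remote (`user@host:path`) into `ssh://user@host/path`.
/// Returns `None` if the input does not have that shape.
/// Hypothetical helper for illustration, not part of linkify.
fn scp_to_ssh_url(remote: &str) -> Option<String> {
    // Input that already carries a scheme is left alone.
    if remote.contains("://") {
        return None;
    }
    let (user, rest) = remote.split_once('@')?;
    let (host, path) = rest.split_once(':')?;
    if user.is_empty() || host.is_empty() || path.is_empty() {
        return None;
    }
    // Avoid a doubled slash when the path is already absolute.
    let path = path.trim_start_matches('/');
    Some(format!("ssh://{}@{}/{}", user, host, path))
}
```

Called with git@github.com:robinst/linkify.git, this yields the ssh:// URL quoted above. A plain email address has no colon after the host and returns `None`.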

To support this (thinking out loud), we'd need an option to recognize SSH-style "URLs", and then we'd need to extend email address scanning to also handle those when the option is enabled.
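A rough sketch of that idea, using only the standard library: scan for user@host as an email scanner would, and when a flag is set, extend the match over a directly following :path. None of this is linkify's actual scanner. The function `find_remote`, the `ssh_style` flag and the character classes are all assumptions for illustration, and surrounding punctuation handling is left out.

```rust
// Simplified character classes, assumed for this sketch only.
fn is_user_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-' | b'+')
}

fn is_host_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'.' | b'-')
}

fn is_path_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'/' | b'.' | b'_' | b'-' | b'~')
}

/// Finds the first `user@host` in `text`. When `ssh_style` is true and the
/// host is directly followed by `:` and a path character, the match is
/// extended over the path. Hypothetical helper, not linkify's API.
fn find_remote(text: &str, ssh_style: bool) -> Option<&str> {
    let bytes = text.as_bytes();
    let at = bytes.iter().position(|&b| b == b'@')?;

    // Walk left from '@' over the user part.
    let mut start = at;
    while start > 0 && is_user_byte(bytes[start - 1]) {
        start -= 1;
    }
    // Walk right from '@' over the host part.
    let mut end = at + 1;
    while end < bytes.len() && is_host_byte(bytes[end]) {
        end += 1;
    }
    if start == at || end == at + 1 {
        return None; // empty user or empty host
    }

    // Only with the option enabled: consume ":path" after the host.
    if ssh_style
        && end + 1 < bytes.len()
        && bytes[end] == b':'
        && is_path_byte(bytes[end + 1])
    {
        end += 1;
        while end < bytes.len() && is_path_byte(bytes[end]) {
            end += 1;
        }
    }
    // All matched bytes are ASCII, so both indices are char boundaries.
    Some(&text[start..end])
}
```

With the flag off, the match stops at the host, which is the partial detection described in this issue. With it on, the whole remote is returned. An email address followed by a colon and a space is unaffected, because a path character must come directly after the colon.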

I'm going to leave this open to gauge further interest, but I'm not planning to work on it for now.

egrieco commented 2 years ago

Fair points, but it doesn't sound like you are opposed to the feature.

Is there any objection if I work on this?

egrieco commented 2 years ago

Need

The reason I think this is necessary is that both GitHub and GitLab provide SSH URLs to repos in this format by default, e.g.:

git@github.com:robinst/linkify.git

Ironically, the link auto-highlighting on GitHub has the same problem highlighting the full link.

Emails

I cannot find any spec for email addresses or mailto: links that treats a colon immediately after the last domain component of the final email address as valid.

Ambiguity

I do realize there is potential for ambiguity when the GitHub/GitLab username is fully numeric.

It might be a good idea to add a LinkKind::Ambiguous to handle cases that are not clear. Further, being able to specify a policy to disambiguate to one kind or the other could help a lot.
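Here is a sketch of what such a policy could look like. It rests on one reading of the ambiguity: when the first segment after the colon is all digits (a numeric username), it could also be taken for a port. The standalone `RemoteKind` enum stands in for the proposed LinkKind::Ambiguous. `RemoteKind`, `Policy` and `classify` are hypothetical names and none of them exist in linkify today.

```rust
/// Hypothetical stand-in for the proposed `LinkKind` additions.
#[derive(Debug, PartialEq, Clone, Copy)]
enum RemoteKind {
    /// `user@host:owner/repo`: the colon introduces a path.
    ScpPath,
    /// `user@host:22/...`: the colon introduces a port.
    HostPort,
    /// The first segment after the colon is all digits, so either reading fits.
    Ambiguous,
}

/// How the caller wants unclear cases resolved (hypothetical).
#[derive(Debug, Clone, Copy)]
enum Policy {
    /// Hand the ambiguity back to the caller.
    Report,
    /// Treat digits after the colon as a numeric username.
    PreferPath,
    /// Treat digits after the colon as a port.
    PreferPort,
}

/// Classifies a `user@host:rest` string; `None` if it lacks that shape.
fn classify(remote: &str, policy: Policy) -> Option<RemoteKind> {
    let (_, rest) = remote.split_once('@')?;
    let (_, after_colon) = rest.split_once(':')?;
    if after_colon.is_empty() {
        return None;
    }
    // Look only at the first path segment after the colon.
    let first = after_colon.split('/').next().unwrap_or("");
    let numeric = !first.is_empty() && first.bytes().all(|b| b.is_ascii_digit());

    Some(match (numeric, policy) {
        (false, _) => RemoteKind::ScpPath,
        (true, Policy::Report) => RemoteKind::Ambiguous,
        (true, Policy::PreferPath) => RemoteKind::ScpPath,
        (true, Policy::PreferPort) => RemoteKind::HostPort,
    })
}
```

With `Policy::Report`, a remote like git@github.com:12345/linkify.git comes back as `Ambiguous`, while the usual git@github.com:robinst/linkify.git is always `ScpPath`. Callers who know their input is GitHub/GitLab remotes could pick `PreferPath`.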