tsutsu3 / linkify-it-py

Links recognition library with full unicode support

Add support for unicode email addresses (RFC 6530 and following) #46

Open arnt opened 1 year ago

arnt commented 1 year ago

This is as good as it can be using the built-in re module. Using the regex module would permit another improvement.

As it stands, this matches å as a combined character, which is by far the most common way to encode å. It does not match two things:

Switching to the regex module would permit detecting both of those, but they're rare cases and it's not obvious to me that catching some rare cases justifies pulling in another module.
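To illustrate the difference between the combined (precomposed) encoding of å and a decomposed one under the built-in re module, here is a minimal sketch. The pattern and names below are hypothetical and chosen only for illustration; they are not the pattern used in this PR.

```python
import re
import unicodedata

# Hypothetical pattern for illustration only; NOT the pattern from this PR.
WORD_RUN = re.compile(r"\w+")

precomposed = "\u00e5"   # å as a single code point (LATIN SMALL LETTER A WITH RING ABOVE)
decomposed = "a\u030a"   # "a" followed by COMBINING RING ABOVE


def matches_fully(text: str) -> bool:
    """Return True if the whole string is a run of \\w characters."""
    return WORD_RUN.fullmatch(text) is not None


# The precomposed form is a letter, so \w matches it.
matched_precomposed = matches_fully(precomposed)

# The combining mark is not alphanumeric, so \w stops after the "a".
matched_decomposed = matches_fully(decomposed)

# NFC normalization composes the pair into the single code point.
matched_after_nfc = matches_fully(unicodedata.normalize("NFC", decomposed))
```

Under these assumptions the precomposed string matches, the decomposed one does not, and normalizing to NFC first makes it match. The third-party regex module additionally offers \X for matching whole grapheme clusters, which is one way such sequences could be handled without normalizing.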

tsutsu3 commented 1 year ago

I think it's a great idea.
However, since it behaves differently from linkify-it, I will not merge it.

arnt commented 1 year ago

I see, that makes sense to me. Please leave this open. I'll submit corresponding PRs to linkify-it, and I suppose I might as well do linkify-it-rb too.