By default, linkify parses Internationalized Resource Identifiers (IRIs) according to RFC 3987. As mentioned in #49, this behavior incorrectly extracts links without a scheme when they are surrounded by Unicode characters with no separating space, which is valid in some languages. For example, 地址example.org is a valid IRI, but the desired behavior is to extract the URL example.org.
I added a flag, url_can_be_iri, that disables parsing of Unicode characters when set to false. The default behavior is unchanged.
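To illustrate the difference the flag makes, here is a minimal, self-contained sketch. It is not linkify's actual scanner: `extract_domain` and its `can_be_iri` parameter are hypothetical names, and the logic is a deliberate simplification that grows a domain outward from the first dot in the input.

```rust
/// Toy stand-in for a scheme-less domain scanner (hypothetical, not linkify's code).
/// Expands left and right from the first '.' while characters are valid label characters.
fn extract_domain(text: &str, can_be_iri: bool) -> Option<&str> {
    // With IRI parsing on, any Unicode letter or digit may be part of a label;
    // with it off, only ASCII letters and digits (plus '-') are accepted.
    let is_label_char = |c: char| {
        if can_be_iri {
            c.is_alphanumeric() || c == '-'
        } else {
            c.is_ascii_alphanumeric() || c == '-'
        }
    };

    let dot = text.find('.')?;

    // Walk left from the dot to find where the domain starts.
    let mut start = dot;
    for (i, c) in text[..dot].char_indices().rev() {
        if is_label_char(c) {
            start = i;
        } else {
            break;
        }
    }

    // Walk right from the dot to find where the domain ends.
    let mut end = dot + 1;
    for (i, c) in text[dot + 1..].char_indices() {
        if is_label_char(c) {
            end = dot + 1 + i + c.len_utf8();
        } else {
            break;
        }
    }

    // Require at least one label character on each side of the dot.
    if start == dot || end == dot + 1 {
        return None;
    }
    Some(&text[start..end])
}
```

In this sketch, calling `extract_domain("地址example.org", true)` keeps the leading Unicode characters as part of the link, while passing `false` stops at the first non-ASCII character and yields only example.org.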
LinkKind is meant to be extensible to other types of links, and I considered adding LinkKind::Iri, but that would make the library backwards incompatible.