Open carton-of-mice opened 8 months ago
If given a candidate hostname with multiple dots, only the label immediately below the TLD (the second-level domain) is validated.
>>> import urlextract
>>> print(urlextract.URLExtract().find_urls('sample :--.-.:3.2.com sample'))
[':--.-.:3.2.com']
This report is related to #121. After invalid characters are consumed, __is_domainvalid() applies its validation regex only to host.split(".")[-2], so invalid DNS labels in earlier parts of the host are ignored.
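To illustrate the difference, here is a minimal sketch that contrasts checking only host.split(".")[-2] with checking every label. It is not urlextract's actual code: LABEL_RE, second_level_only() and all_labels_valid() are hypothetical names, and the pattern is an assumed letters-digits-hyphen rule (1 to 63 characters, no leading or trailing hyphen) rather than the regex the library uses.

```python
import re

# Assumed label rule (NOT urlextract's own regex): letters, digits and
# hyphens, 1-63 characters, no hyphen at either end.
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def second_level_only(host: str) -> bool:
    """Mimic the reported behaviour: validate only the label below the TLD."""
    labels = host.split(".")
    return len(labels) >= 2 and LABEL_RE.fullmatch(labels[-2]) is not None


def all_labels_valid(host: str) -> bool:
    """Hypothetical stricter check: every dot-separated label must be valid."""
    labels = host.rstrip(".").split(".")
    return len(labels) >= 2 and all(
        LABEL_RE.fullmatch(label) is not None for label in labels
    )


if __name__ == "__main__":
    host = ":--.-.:3.2.com"
    print(second_level_only(host))  # only "2" is inspected
    print(all_labels_valid(host))   # ":--", "-" and ":3" are inspected too
```

With the host from the example above, the first check only ever looks at "2", while the second also looks at ":--", "-" and ":3". Whether a stricter check like this fits the library's tolerance for non-standard hosts is a design decision for the maintainers.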