addr-rs / addr

Parse domain names reliably and quickly in Rust
MIT License

Subdomains with underscores aren't handled #20

Open antonok-edm opened 11 months ago

antonok-edm commented 11 months ago

I noticed that parse_domain_name will produce an IllegalCharacter error if there is an underscore in a subdomain.

use addr::psl::List;
use addr::parser::DomainName as _;

fn main() {
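    // ok: every label contains only letters, digits, and hyphens
    List.parse_domain_name("zn-ed65ynwxvsuk9lf-cbs.siteintercept.qualtrics.com").unwrap();

    // panics: the underscore yields an IllegalCharacter error, and unwrap() panics on it
    List.parse_domain_name("zn_ed65ynwxvsuk9lf-cbs.siteintercept.qualtrics.com").unwrap();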
}

According to this Stack Overflow post, including an underscore in a subdomain is not considered "best practice". However, there are several real-world URLs that do so (the ones shared in that post, as well as the one I found above), and all of them appear to work without issues in modern browsers and CLI tools. It would be great if addr supported them as well.

alcore commented 2 months ago

parse_domain_name handles hostnames, in which underscores are illegal. For DNS names, which may include underscores (and a trailing period), you likely want parse_dns_name instead.
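
To illustrate the distinction, here is a minimal std-only sketch of the two sets of rules. It is not addr's implementation: the function names (is_hostname, is_dns_name, and the label helpers) are hypothetical, and the checks are deliberately simplified, assuming that a hostname label may contain only ASCII letters, digits, and hyphens, while a DNS label in this sketch may additionally contain underscores and the whole name may end with a trailing period.

```rust
/// Simplified hostname label check (assumption for this sketch):
/// ASCII letters, digits, and hyphens only, 1 to 63 bytes,
/// with no leading or trailing hyphen.
fn is_hostname_label(label: &str) -> bool {
    !label.is_empty()
        && label.len() <= 63
        && !label.starts_with('-')
        && !label.ends_with('-')
        && label.bytes().all(|b| b.is_ascii_alphanumeric() || b == b'-')
}

/// Looser DNS label check (assumption for this sketch):
/// the hostname characters plus underscores, 1 to 63 bytes.
fn is_dns_label(label: &str) -> bool {
    !label.is_empty()
        && label.len() <= 63
        && label
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}

/// A hostname here is a dot-separated sequence of hostname labels.
/// A trailing period produces an empty last label and is rejected.
fn is_hostname(name: &str) -> bool {
    !name.is_empty() && name.split('.').all(is_hostname_label)
}

/// A DNS name here may end with a single trailing period (the root),
/// which is stripped before the labels are checked.
fn is_dns_name(name: &str) -> bool {
    let name = name.strip_suffix('.').unwrap_or(name);
    !name.is_empty() && name.split('.').all(is_dns_label)
}
```

Under these simplified rules, is_hostname rejects "zn_ed65ynwxvsuk9lf-cbs.siteintercept.qualtrics.com" because of the underscore, while is_dns_name accepts it, which mirrors the split between parse_domain_name and parse_dns_name described above.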