servo / rust-url

URL parser for Rust
https://docs.rs/url/
Apache License 2.0

idna::punycode::encode_str() wrong conversion ? #884

Closed: dandyvica closed this issue 10 months ago

dandyvica commented 10 months ago

Describe the bug: encode_str("ουτοπία.δπθ.gr") gives ..gr-pldmw6azdale9bn

Online Punycode converters give xn--kxae4bafwg.xn--pxaix.gr for the same input.

What's wrong?

valenting commented 10 months ago

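encode_str should only be used for a single label. It does not split the input on dots and does not add the xn-- prefix. For a whole domain name, use idna::domain_to_ascii.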

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=de549ac54f91648849021b2e0dfc0b6f

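use idna; // 0.4.0

fn main() {
    // Whole domain name: each label is handled separately and gets the "xn--" prefix where needed.
    let x = idna::domain_to_ascii("ουτοπία.δπθ.gr");
    println!("{:?}", x);

    // Raw Punycode over the whole string: the dots are treated as ordinary ASCII characters.
    let x = idna::punycode::encode_str("ουτοπία.δπθ.gr");
    println!("{:?}", x);

    // Raw Punycode over a single label: no "xn--" prefix is added.
    let x = idna::punycode::encode_str("ουτοπία");
    println!("{:?}", x);
}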

Output:

Ok("xn--kxae4bafwg.xn--pxaix.gr")
Some("..gr-pldmw6azdale9bn")
Some("kxae4bafwg")

I suppose we could make encode_into return an Err if the input contains any dots here, but that shouldn't really be necessary if you use idna::domain_to_ascii instead.
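To make the single-label point concrete, here is a minimal std-only sketch of the RFC 3492 Punycode encoder plus a per-label wrapper. The names punycode_label and to_ascii_sketch are hypothetical and are not part of the idna crate. The wrapper only splits on dots and adds the xn-- prefix; it skips the mapping, normalization, and validation steps that idna::domain_to_ascii performs, so treat it as an illustration rather than a replacement.

```rust
// Punycode parameters from RFC 3492, section 5.
const BASE: u32 = 36;
const T_MIN: u32 = 1;
const T_MAX: u32 = 26;
const SKEW: u32 = 38;
const DAMP: u32 = 700;
const INITIAL_BIAS: u32 = 72;
const INITIAL_N: u32 = 128;

// Map a digit value to its character: 0..=25 -> 'a'..='z', 26..=35 -> '0'..='9'.
fn digit(d: u32) -> char {
    if d < 26 { (b'a' + d as u8) as char } else { (b'0' + (d - 26) as u8) as char }
}

// Bias adaptation (RFC 3492, section 6.1).
fn adapt(mut delta: u32, num_points: u32, first: bool) -> u32 {
    delta /= if first { DAMP } else { 2 };
    delta += delta / num_points;
    let mut k = 0;
    while delta > ((BASE - T_MIN) * T_MAX) / 2 {
        delta /= BASE - T_MIN;
        k += BASE;
    }
    k + (BASE - T_MIN + 1) * delta / (delta + SKEW)
}

// Hypothetical helper: raw Punycode for ONE label, without the "xn--" prefix.
// A '.' is just another ASCII character here, which is why a whole domain
// passed in comes out starting with "..gr-". Returns None on overflow.
fn punycode_label(label: &str) -> Option<String> {
    let input: Vec<u32> = label.chars().map(|c| c as u32).collect();
    // ASCII characters are copied first, followed by '-' if there were any.
    let mut out: String = label.chars().filter(|c| c.is_ascii()).collect();
    let basic = out.len() as u32;
    if basic > 0 {
        out.push('-');
    }
    let (mut n, mut delta, mut bias, mut h) = (INITIAL_N, 0u32, INITIAL_BIAS, basic);
    while (h as usize) < input.len() {
        // Smallest code point not yet handled.
        let m = *input.iter().filter(|&&c| c >= n).min()?;
        delta = delta.checked_add((m - n).checked_mul(h + 1)?)?;
        n = m;
        for &c in &input {
            if c < n {
                delta = delta.checked_add(1)?;
            }
            if c == n {
                // Emit delta as a variable-length base-36 integer.
                let mut q = delta;
                let mut k = BASE;
                loop {
                    let t = if k <= bias { T_MIN } else if k >= bias + T_MAX { T_MAX } else { k - bias };
                    if q < t {
                        break;
                    }
                    out.push(digit(t + (q - t) % (BASE - t)));
                    q = (q - t) / (BASE - t);
                    k += BASE;
                }
                out.push(digit(q));
                bias = adapt(delta, h + 1, h == basic);
                delta = 0;
                h += 1;
            }
        }
        delta = delta.checked_add(1)?;
        n += 1;
    }
    Some(out)
}

// Hypothetical helper: encode each non-ASCII label separately and prefix it.
// No UTS #46 mapping, normalization, or validation is done here.
fn to_ascii_sketch(domain: &str) -> Option<String> {
    let labels: Option<Vec<String>> = domain
        .split('.')
        .map(|label| {
            if label.is_ascii() {
                Some(label.to_string())
            } else {
                punycode_label(label).map(|p| format!("xn--{p}"))
            }
        })
        .collect();
    labels.map(|l| l.join("."))
}

fn main() {
    println!("{:?}", punycode_label("ουτοπία")); // Some("kxae4bafwg")
    println!("{:?}", punycode_label("ουτοπία.δπθ.gr")); // dots end up in the ASCII prefix
    println!("{:?}", to_ascii_sketch("ουτοπία.δπθ.gr")); // Some("xn--kxae4bafwg.xn--pxaix.gr")
}
```

Splitting on dots before encoding is the step the original call was missing. In real code, idna::domain_to_ascii remains the right entry point.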

dandyvica commented 10 months ago

@valenting Thanks a lot for the explanation!