RP ID is required to be a valid domain string, which is the string representation of a valid domain. The definition of a valid domain cites issue 245, which raises the following points:
_ among potentially other ASCII code points should be allowed.
The algorithm for determining a valid domain does not require the original domain input to match the result outputs in steps 1 or 3.
Currently origin validation only states "Validation MAY be performed by exact string matching or any other method as needed".
It would be nice if some guidance were provided on origin validation where the RP ID and origin disagree on case alone or, in the more complicated case, disagree on the syntactic representation of a domain whose semantics are equivalent according to IDNA.
For example, as it stands now, any domain with an _ is not a valid domain, since applying the domain-to-ascii algorithm results in failure. This goes against point 2 raised in issue 245.
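As a rough illustration of why the underscore is rejected: assuming the STD3 ASCII rules are in effect when domain-to-ascii is applied, an ASCII label may contain only letters, digits, and hyphens. The helper names below are hypothetical, and the check deliberately ignores everything else the algorithm does (length limits, hyphen placement, non-ASCII labels, a trailing root label).

```rust
/// Returns true if a non-empty ASCII label contains only letters, digits, or hyphens.
/// This models only the STD3 code point restriction (an assumption of this sketch).
fn is_ldh_label(label: &str) -> bool {
    !label.is_empty()
        && label
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-')
}

/// Hypothetical helper: applies the label check to each dot-separated label
/// of an all-ASCII domain. A trailing dot is not handled.
fn passes_std3_ascii(domain: &str) -> bool {
    domain.split('.').all(is_ldh_label)
}

fn main() {
    for domain in ["www_ww.example.com", "www-ww.example.com", "xn--wxa.example.com"] {
        println!("{domain}: passes STD3 ASCII check = {}", passes_std3_ascii(domain));
    }
}
```

Only the underscore variant fails this check; relaxing it would mean choosing a wider set of allowed ASCII code points.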
Additionally, the following are all valid domains that are semantically equivalent according to IDNA but syntactically different:
λ.example.com
Λ.ExaMple.com
xn--wxa.example.com
Xn--Wxa.ExAmple.com
Is there any recommendation on requiring both RP IDs and origins to not only be valid domain strings but, more strictly, to match exactly the result returned from step 3 (e.g., only item 1 above is valid)? A relaxed recommendation that would allow all four items above and require them to be treated the same as each other? A recommendation that only domains with A-labels be allowed and that they match exactly the result returned from step 1 (e.g., only item 3 above is valid)?
Example code in Rust using the idna crate:
use idna::uts46::{AsciiDenyList, DnsLength, Hyphens, Uts46};
use std::io::{self, Error, StdoutLock, Write};

fn main() -> Result<(), Error> {
    let mut stdout = io::stdout().lock();
    // Run the four equivalent spellings, then the underscore case.
    idna_transform(&mut stdout, "λ.example.com").and_then(|()| {
        idna_transform(&mut stdout, "Λ.ExaMple.com").and_then(|()| {
            idna_transform(&mut stdout, "xn--wxa.example.com").and_then(|()| {
                idna_transform(&mut stdout, "Xn--Wxa.ExAmple.com")
                    .and_then(|()| idna_transform(&mut stdout, "www_ww.example.com"))
            })
        })
    })
}

/// Prints the domain-to-ascii result for `input` and, on success,
/// the domain-to-unicode result of that output.
fn idna_transform(stdout: &mut StdoutLock<'_>, input: &str) -> Result<(), Error> {
    write!(stdout, "original input: {input}, ").and_then(|()| {
        match Uts46::new().to_ascii(input.as_bytes(), AsciiDenyList::STD3, Hyphens::Allow, DnsLength::Verify) {
            Err(err) => writeln!(stdout, "domain-to-ascii algorithm fails on input: {err}"),
            Ok(ascii) => write!(stdout, "canonical domain with only A-labels: {ascii}, ").and_then(|()| {
                let (u_labels, res) = Uts46::new().to_unicode(ascii.as_bytes(), AsciiDenyList::STD3, Hyphens::Allow);
                match res {
                    Err(err) => writeln!(stdout, "result of domain-to-ascii algorithm causes domain-to-unicode algorithm to fail: {err}"),
                    Ok(()) => writeln!(stdout, "canonical domain with U-labels: {u_labels}"),
                }
            }),
        }
    })
}
Output from the above program:
original input: λ.example.com, canonical domain with only A-labels: xn--wxa.example.com, canonical domain with U-labels: λ.example.com
original input: Λ.ExaMple.com, canonical domain with only A-labels: xn--wxa.example.com, canonical domain with U-labels: λ.example.com
original input: xn--wxa.example.com, canonical domain with only A-labels: xn--wxa.example.com, canonical domain with U-labels: λ.example.com
original input: Xn--Wxa.ExAmple.com, canonical domain with only A-labels: xn--wxa.example.com, canonical domain with U-labels: λ.example.com
original input: www_ww.example.com, domain-to-ascii algorithm fails on input: Errors
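To make the three options in the question concrete, here is a minimal sketch of how each could be expressed once the canonical forms are known. The `Policy` and `Canonical` names, and the helper functions, are hypothetical and not taken from any specification. The canonical forms are hard-coded from the program output above rather than computed, so the block stays self-contained.

```rust
/// A domain together with its canonical forms. `ascii` and `unicode` are
/// assumed to be the precomputed step 1 and step 3 results for `input`.
#[derive(Clone, Copy, Debug)]
struct Canonical<'a> {
    input: &'a str,
    ascii: &'a str,
    unicode: &'a str,
}

/// Hypothetical names for the three options asked about in this issue.
#[derive(Clone, Copy, Debug)]
enum Policy {
    /// Input must equal the step 3 (U-label) result exactly.
    ExactUnicode,
    /// Input must equal the step 1 (A-label) result exactly.
    ExactAscii,
    /// Any input that canonicalizes successfully is accepted.
    Relaxed,
}

/// Decides whether a single RP ID or origin host is acceptable under `policy`.
fn is_allowed(policy: Policy, domain: Canonical<'_>) -> bool {
    match policy {
        Policy::ExactUnicode => domain.input == domain.unicode,
        Policy::ExactAscii => domain.input == domain.ascii,
        Policy::Relaxed => true,
    }
}

/// Under the relaxed option, two domains are treated as the same
/// when their A-label forms match.
fn same_domain(a: Canonical<'_>, b: Canonical<'_>) -> bool {
    a.ascii == b.ascii
}

/// Builds an entry for one of the four spellings listed in this issue,
/// using the canonical forms printed by the program above.
fn example(input: &str) -> Canonical<'_> {
    Canonical { input, ascii: "xn--wxa.example.com", unicode: "λ.example.com" }
}

fn main() {
    let inputs = ["λ.example.com", "Λ.ExaMple.com", "xn--wxa.example.com", "Xn--Wxa.ExAmple.com"];
    for input in inputs {
        let d = example(input);
        println!(
            "{input}: exact-unicode={}, exact-ascii={}, relaxed={}",
            is_allowed(Policy::ExactUnicode, d),
            is_allowed(Policy::ExactAscii, d),
            is_allowed(Policy::Relaxed, d),
        );
    }
    // Under the relaxed option all four spellings compare as the same domain.
    println!("same: {}", same_domain(example(inputs[0]), example(inputs[3])));
}
```

Under this sketch, the two strict options each accept exactly one of the four spellings, while the relaxed option accepts all four and relies on comparing A-label forms to treat them as equal.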