w3c / web-nfc

Web NFC
https://w3c.github.io/web-nfc/
Other
313 stars 69 forks source link

Fix external type validation algorithm #511

Closed zolkis closed 4 years ago

zolkis commented 4 years ago

As noted here

Probably what we should do instead is figure out if dns-part is literal IPv6 and take the first : as delimiter after it, allowing : in the name-part of external types.

zolkis commented 4 years ago

Also, as @sleevi notes here,

Let asciiDomain be the result of running domain to ASCII on domain.

You've omitted the optional boolean beStrict, which means it defaults to false. This permits certain implementation-dependent characters. It's unclear if that was intended here - that is, the relationship between the external type and the current origin are unclear me, but I would suspect that NFC-RTD makes a normative dependency on DNS, which means that beStrict should be true, even if UAs allow the current domain to contain a U+005F () LOW LINE (underbar). My general advice is "be strict in what you accept" - if you're validating caller textual input or NFC tag input (versus the current origin/domain), then omitting seems entirely reasonable.

zolkis commented 4 years ago

@sleevi PTAL

sleevi commented 4 years ago

@zolkis Does it make sense to review in a PR? That makes it easier to add in-line comments, and might save some extra commits :)

Recall that #350 came up because, at the time, there was a relationship between the document domain/origin, the registerable domain, and the NDEFRecord tag, and the issue I was trying to flag (with IPv6 and with URL Standard) was that there exist origins for which cannot be represented in NDEFRecords, leaving the security check of limiting the origin to only reading its tag in an awkward state, and needing to be nailed down. It looks like you cleaned all that up in #474 , which means much of #350 (all of it?) is obviated :)

As I understand it, that check is now fully removed, and reasonably so: Web NFC uses the external tags and domain names not as a security/ACL check, but simply as a way of ensuring unique registrations, much like an ITU/ISO/IEC Object Identifier, since the domain name system is similarly hierarchical and delegated.

As a result, I think that means you don't need to handle IPv6 anymore and your algorithm for processing the type can go back to processing the first U+003A (':') to determine the domain. If I'm understanding correctly, and the domain portion is only used for unique identification, then you do not want to allow U-Labels/IDNs, but require A-Labels/LDH-Labels (I believe this is what was trying to be done with the reg-name syntax).

Thus your parsing/validating/encoding algorithm becomes something roughly like:

  1. Find the first :
    • If none, error
  2. Split the string on :
    • Set domain equal to everything up to but excluding :
    • Set type equal to everything after :
  3. Validate domain is LDH-label
  4. Validate type is within the allowed character space
  5. (If encoding) Validate length

Now, you can shuffle these steps around in an order that makes sense. For example, implicit in this is an assumption that you've validated that the input is an ASCII string, which thus makes sure that the character length == byte length when ASCII encoded. That could be first, or that could be only part of encoding, I'm not sure what makes most sense for you.

Because you're not dealing with the document's origin/domain anymore, handling U-labels isn't necessary. Developers using WebNFC will be using explicit tags, so they shouldn't need to worry about things. A developer shouldn't just use document.domain to construct the tag, because the whole point of the tag space in NFC, is, as far as I can tell, to make it explicit the namespace of the thing you're writing.

@beaufortfrancois If I'm wrong/misunderstanding this, maybe you can set some time on my calendar to explain this for me. I really have little background in NFC here, I largely filed #350 because of the PSL dependency leading to weird specs (and the issues mentioned in https://github.com/sleevi/psl-problems), and I think #474 addressed the bulk :)

zolkis commented 4 years ago

Thanks @sleevi.

@kenchris @beaufortfrancois @leonhsl please check in the NFC NDEF and RTD specs whether my assumption in the PR that we have to support both literal IPv6 and dns-name is warranted. Also, whether checking the address/domain part is OK.

I make a small amend requiring ASCII string for input.

beaufortfrancois commented 4 years ago

From what I can read, "NFC Data Exchange Format" spec says

  • The use of IP addresses in URIs SHOULD be avoided whenever possible (see [RFC 1900]). However, when used, the literal format for IPv6 addresses in URIs described by [RFC 2732] SHOULD be supported.
  • Generators of NDEF messages SHOULD rely only on the most rudimentary equivalence rules defined by [RFC 3986]

And "Signature Record Type Definition" spec only mentions dns-name in M2M Certificate Profile.

GeneralName ::= CHOICE {
 rfc822Name IA5String (SIZE (1..128)),
 dNSName IA5String (SIZE (1..128)),
 directoryName Name,
 uniformResourceIdentifier IA5String (SIZE (1..128)),
 iPAddress OCTET STRING (SIZE (1..16)),
 --4 octets for IPV4 16 octets for IPV6
 registeredID OBJECT IDENTIFIER
}
zolkis commented 4 years ago

I've seen that in the NDEF spec, but in the NFC RTD spec (section 3, ABNF), in the dns-part clause the ABNF doesn't seem to allow IPv6 addresses.

leonhsl commented 4 years ago

By what I read from https://github.com/w3c/web-nfc/issues/511#issuecomment-570620933, and given that the latest spec is already checking

If input is not an ASCII string, is empty, or its length exceeds 255 bytes, return false.

I don't think we need to do the following checks

Let asciiDomain be the result of running domain to ASCII given domain and true (as beStrict). If asciiDomain is failure, return false. If asciiDomain contains a forbidden host code point or U+005F LOW LINE (_), return false.

but just validating that domain is LDH-lable should be OK, because now our domain has nothing to do with DNS/IDN, it just serves as a namespace.

leonhsl commented 4 years ago

Also, do we need to validate that a local type should also be an ASCII string? Or we need to change the algorithm in another way to make sure the limit of 255 bytes is correctly guaranteed.