w3c / vc-data-model-2.0-test-suite

W3C Verifiable Credentials v2.0 test suite
https://w3c.github.io/vc-data-model-2.0-test-suite/
Other
9 stars 10 forks source link

Use an invalid url with https protocol for invalid url tests. #32

Closed aljones15 closed 11 months ago

aljones15 commented 11 months ago

Addresses:

https://github.com/digitalbazaar/vc-data-model-2-test-suite/issues/27

Bug: it turns out between the vocab in the Vc2.0 context and the somewhat liberal way json-ld creates relative urls, our existing not-a-url tests were not failing in the way they were intended. This PR updates those tests to use a full invalid url with the https protocol. Further issues will need to be made on json-ld processors as it looks like what json-ld will turn into a relative url is kind of interesting.

aljones15 commented 11 months ago

Just so this is known the new invalid url is this one:

https ://not-a-url.org

Additionally json-ld has no issues with many invalid urls such as: ["http://s\[ËËËme.com"](https://json-ld.org/playground/#startTab=tab-expanded&json-ld=%7B%22%40context%22%3A%5B%22https%3A%2F%2Fwww.w3.org%2Fns%2Fcredentials%2Fv2%22%2C%7B%22ExampleTestCredential%22%3A%22http%3A%2F%2Fs%5B%C3%8B%C3%8B%C3%8Bme.com%22%7D%5D%2C%22type%22%3A%5B%22VerifiableCredential%22%2C%22ExampleTestCredential%22%5D%2C%22validFrom%22%3A%222023-02-23T21%3A47%3A17Z%22%2C%22issuer%22%3A%22did%3Aexample%3Aissuer%22%2C%22credentialSubject%22%3A%7B%22id%22%3A%22did%3Aexample%3Asubject%22%7D%7D) "https://1.2.3.4.5/" (5 part ipv4 fails javascript's url parser, but passes json-ld's url parser)

Both of the above json-ld will use with no errors at all. So for instance an ipv4 url with 5 parts is kind of hard for a human to spot and an understandable mistake that a URL parser will catch, but json-ld will not catch it and allow.

Concrete change suggestion is to use the new URLs you're using /and/ add a UTF-8 charater to the URL that encodes a non-ASCII UTF-8 character.

non-ASCII characters are valid characters unless used in a protocol. So for instance a url with an umulaut in it would technically need to be translated escaped, but in practice url parsers will just automatically escape the non-ascii parts of the url:

> new URL('https://motlËycrËw.com')
URL {
  href: 'https://xn--motlycrw-w1ad.com/',
  origin: 'https://xn--motlycrw-w1ad.com',
  protocol: 'https:',
  username: '',
  password: '',
  host: 'xn--motlycrw-w1ad.com',
  hostname: 'xn--motlycrw-w1ad.com',
  port: '',
  pathname: '/',
  search: '',
  searchParams: URLSearchParams {},
  hash: ''
}

so using a non-ascii character in a url in say a VC is kind of invalid, but for convenience urls with non-ascii characters most parsers just automatically escape the url making adding a non-ascii character to a url pretty much mute outside of a protocol.

> new URL('heËyps://motlËycrËw.com')
Uncaught TypeError [ERR_INVALID_URL]: Invalid URL
    at __node_internal_captureLargerStackTrace (node:internal/errors:478:5)
    at new NodeError (node:internal/errors:387:5)
    at URL.onParseError (node:internal/url:565:9)
    at new URL (node:internal/url:641:5) {
  input: 'heËyps://motlËycrËw.com',
  code: 'ERR_INVALID_URL'
}
filip26 commented 11 months ago

just thinking loudly. A positive test, testing that an issuer/verifier accepts valid URI is important, but negative one on malformed URI? How important is this negative test? Given the fact that most strings are treated as a valid relative URI.

msporny commented 11 months ago

just thinking loudly. A positive test, testing that an issuer/verifier accepts valid URI is important, but negative one on malformed URI? How important is this negative test? Given the fact that most strings are treated as a valid relative URI.

Yes, we might want to remove this test if implementers feel like this test isn't adding anything worthwhile... if you use an invalid URL, well, your software won't be interoperable with other software that might not process that URL.

Let's merge this and see if implementers complain.