C2SP / x509-limbo

A suite of testvectors for X.509 certificate path validation and tools for building them
https://x509-limbo.com
Apache License 2.0

Consider modelling failure reasons into Limbo spec #43

Open tetsuo-cpp opened 1 year ago

tetsuo-cpp commented 1 year ago

One thing I've noticed during development is that it's very easy for chains to be rejected by the wrong check. Unless you debug the test yourself, it's difficult to know whether you're exercising the intended behaviour or whether the chain is bouncing off some other check. And even once we've verified that it's being rejected for the right reason, there's no guarantee that the logic will stay that way over time.

Obviously we don't want to bind Limbo to pyca/cryptography in any meaningful way, but I was wondering if there's value in recording why a particular test ought to fail. Each harness could then optionally map its own error messages onto the Limbo failure reason enumeration to check whether it hit the right check or not.
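
For example, a testcase might look something like this (a sketch only: the expected_failure_reason field is hypothetical, while id and expected_result mirror fields Limbo testcases already carry):

```python
# Hypothetical sketch of a testcase with a failure reason attached.
# "id" and "expected_result" mirror existing Limbo testcase fields;
# "expected_failure_reason" is the proposed (hypothetical) addition.
testcase = {
    "id": "name-constraints::mismatch-dns",  # illustrative id
    "expected_result": "FAILURE",
    "expected_failure_reason": "name_constraint_mismatch_dns",
}
```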

tetsuo-cpp commented 1 year ago

@woodruffw This is low priority, but just something I've been thinking about.

woodruffw commented 1 year ago

Yes, thanks for calling this out! This is something that's also been in the back of my mind.

One challenge here is that each implementation we might want to test has its own error codes (and its own stability rules, or lack thereof, around them).

For implementations that return some kind of stringified error message, we could maybe do something like a mapping of <implementation-identifier> to message, e.g.:

```
openssl-3: "some error message"
pyca: "another error message"
```

(maybe with regexps?)

and then each implementation's handler could check for its own identifier and attempt to match it.
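
As a rough sketch of what that handler-side matching could look like (identifiers, messages, and function names all invented for illustration):

```python
import re

# Invented per-test mapping keyed by implementation identifier.
EXPECTED_MESSAGES = {
    "openssl-3": r"some error message",
    "pyca": r"another error message",
}

def matches_expected_error(implementation: str, actual_error: str) -> bool:
    """Check this implementation's error against its expected pattern.

    Implementations with no recorded expectation are not checked.
    """
    pattern = EXPECTED_MESSAGES.get(implementation)
    if pattern is None:
        return True  # no expectation recorded for this implementation
    return re.search(pattern, actual_error) is not None
```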

tetsuo-cpp commented 1 year ago

I was thinking that we'd push that onto each individual harness. So in Limbo, we'd just say that a test ought to fail for a reason like name_constraint_mismatch_dns.

Then each individual harness would be responsible for doing something like (excuse the syntax, hope you get the idea):

```python
import re

if expected_fail_reason == "name_constraint_mismatch_dns":
    assert re.search(r"tetsuo-ssl-error: mismatching dns name constraint (.*) found", actual_error_msg)
```

That way, Limbo doesn't need to know which implementations it's being used to test. That would feel more "pure", but we should do whatever is more practical; if there's an advantage to baking that information into Limbo, let's just do that.

The main problem I foresee is that the error messages (whether they're textual or just a return code) might not be at the same level of granularity as what Limbo specifies. Actually, our pyca work would have the same issue: if we fail to validate a chain, we just return an error along the lines of "no valid chain found blah blah", without any detail on which chains we tried and why they were invalid. So all of the name constraint code that we added recently can't be tested meaningfully with these failure reasons unless we refactor and figure out how to return more detailed errors. I expect we'll run into similar issues with other implementations too (from my admittedly surface-level experience with OpenSSL, the error messages haven't been anything to write home about).
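
To illustrate the granularity problem (error text and names invented): when an implementation only emits one generic failure message, any mapping from message to failure reason has to admit essentially every reason, which makes the check vacuous:

```python
import re

# Invented sketch: one coarse error message covers many distinct
# Limbo failure reasons, so matching it confirms almost nothing.
COARSE_ERROR_FAMILIES = {
    r"no valid chain found": {
        "name_constraint_mismatch_dns",
        "name_constraint_mismatch_ip",
        "expired",
        # ...in practice, nearly every failure reason.
    },
}

def reason_consistent(actual_error: str, expected_reason: str) -> bool:
    for pattern, reasons in COARSE_ERROR_FAMILIES.items():
        if re.search(pattern, actual_error):
            return expected_reason in reasons
    return False
```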

woodruffw commented 1 year ago

(Whoops, accidentally closed.)

> That way, Limbo doesn't need to know which implementations it's being used to test. That would feel more "pure", but we should do whatever is more practical; if there's an advantage to baking that information into Limbo, let's just do that.

Yeah, this is a good point: Limbo probably shouldn't reference any implementation in particular, so baking error strings like this into the spec is probably a bad idea.

> If we fail to validate a chain, we just return an error along the lines of "no valid chain found blah blah", without any detail on which chains we tried and why they were invalid. So all of the name constraint code that we added recently can't be tested meaningfully with these failure reasons unless we refactor and figure out how to return more detailed errors. I expect we'll run into similar issues with other implementations too (from my admittedly surface-level experience with OpenSSL, the error messages haven't been anything to write home about).

Yeah, this is going to be a uniform issue with validators, and is arguably something they shouldn't care about: the validator is really only required to return a yes/no answer, and there's no "right" way to flatten a set of failed prospective validation paths into a concise and cogent human-readable error message.

At best, I think the most we can hope for is somewhat consistent error messages when the EE/leaf certificate itself is invalid or causes an error; for anything in the rest of the chain, nothing is guaranteed.
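
One pragmatic consequence for harnesses, sketched below (all names invented): always assert the pass/fail outcome, and treat the failure-reason check as best effort rather than a hard failure:

```python
def run_testcase(testcase: dict, validate, match_reason) -> str:
    """Invented harness loop: pass/fail is always asserted; the failure
    reason is only checked when the implementation can distinguish it."""
    ok, error = validate(testcase)  # hypothetical validator hook
    expect_failure = testcase["expected_result"] == "FAILURE"
    assert ok != expect_failure, "wrong pass/fail outcome"
    reason = testcase.get("expected_failure_reason")
    if expect_failure and reason and not match_reason(reason, error):
        return "reason-unverified"  # best effort, not a hard failure
    return "ok"
```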