Open ash2k opened 5 days ago
matches
supports escaping of octet values. We do this in our conformance tests. Is that not sufficient?
Indeed, it works fine! I got confused by the description of the validation rule. I read it as if it says that the data that is being validated must be utf-8 encoded, not that the pattern itself should be utf-8 encoded:
// The value of the field must be valid UTF-8 or validation will fail with a // runtime error.
This worked for me:
message HeaderValues {
// At least one item, but it can be empty.
repeated bytes value = 1 [
(buf.validate.field).repeated.min_items = 1,
(buf.validate.field).repeated.items.bytes.pattern = "^[^\\x00-\\x08\\x0A-\\x1F\\x7F]*$"
];
}
message HeaderKV {
bytes key = 1 [(buf.validate.field).bytes.pattern = "^:?[0-9a-zA-Z!#$%&'*+-.^_|~`]+$"];
HeaderValues value = 2 [(buf.validate.field).required = true];
}
Thank you!
I got confused by the description of the validation rule.
Hm, this might be copy-pasta from the pattern rule over on strings (which does have the UTF-8 requirement). I'll fix up the documentation to be clearer. Thanks!
@rodaine Thank you for an excellent library!
Hm, I've added some proper tests and I'm getting an error: runtime error: error evaluating bytes.pattern: invalid UTF-8 in bytes, cannot convert to string
. I believe this is because string(this)
fails in the CEL expression.
I think we might need an overload for matches()
that works natively on bytes
.
I think the error is generated here: https://github.com/google/cel-go/blob/master/common/types/bytes.go#L104-L106
Feature description:
Allow to check that a
bytes
field contains only allowed octets. This cannot be done with CELmatches()
because bytes in the field may not be valid UTF-8.Problem it solves or use case:
I want to validate that a
bytes
field has only certain bytes in it. Specifically, that it's a valid HTTP header name / value.HTTP header values are not UTF-8, can have binary data: https://datatracker.ietf.org/doc/html/rfc9110#name-field-values.
I also want to avoid the cost of validating HTTP header names as UTF-8 that proto does (so have to use
bytes
), but instead to validate them manually.Proposed implementation or solution:
?
Contribution:
Willing to do code review.
Examples or references:
Additional context:
https://github.com/bufbuild/protovalidate/issues/263.