Open vembacher opened 1 year ago
One item for the TODO list would be adding something to examples/.
Nice work btw!
I added examples for consistency and inclusion proofs.
I will also add some tests in the future for the methods I added to the Rekor models. What test data should I use for this, as these require log entries plus keys?
Also, I'm unsure whether my checks that proofs and checkpoints are valid together are correct. What is the correct way to do this?
Sorry, this has been sitting on my review list for a long time. I need some time to get familiar with the feature being implemented 😓
No worries! I'm not in a rush to complete this.
I received an error with the current implementation when I tested the entry with index 25579. The following error was returned although the verification should have succeeded:
InclusionProofError(MismatchedRoot { expected: "db0822d0508b54bfb770ba5d3394d331bd9a22c91d9a8c0b906e7ea2208903be", got: "4d006aa46efcb607dd51d900b1213754c50cc9251c3405c6c2561d9d6a2f3239" })
The underlying error is, I believe, related to the assumption that the body is formatted as canonical JSON when it is appended to the log. Indeed, to verify that the body is included in the Merkle tree, the implementation currently re-serializes the body using a canonical formatter. However, I think the original entry wasn't formatted canonically (see files below).
We can't assume that the body was formatted canonically, so I think the solution should be to keep the "original" base64-decoded body as bytes in the LogEntry struct and use those bytes to verify the inclusion proof instead of re-serializing.
If you agree with the fix, I'm happy to work on it.
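To make the proposed fix concrete, here is a minimal sketch. All field, type, and method names below are illustrative assumptions, not the actual sigstore-rs API; the point is only that the exact bytes the log hashed are kept next to the parsed form, so verification never depends on re-serialization.

```rust
// Illustrative sketch only: names are assumptions, not the real
// sigstore-rs API. The idea is to retain the exact bytes that Rekor
// appended to the log, so inclusion-proof verification hashes those
// bytes rather than a re-serialized (and possibly re-ordered) body.
struct LogEntry {
    // Parsed body for convenient access by callers.
    body_json: String,
    // Exact base64-decoded bytes of the body as returned by Rekor.
    raw_body: Vec<u8>,
}

impl LogEntry {
    // The inclusion-proof leaf is computed over raw_body, never over a
    // re-serialized form of body_json.
    fn leaf_bytes(&self) -> &[u8] {
        &self.raw_body
    }
}

fn main() {
    let entry = LogEntry {
        body_json: "{\"kind\":\"rekord\"}".to_string(),
        raw_body: b"{\"kind\":\"rekord\"}".to_vec(),
    };
    println!("parsed body: {}", entry.body_json);
    assert_eq!(entry.leaf_bytes(), entry.raw_body.as_slice());
}
```

This sidesteps the canonicalization question entirely, at the cost of carrying the raw bytes around in the struct.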
The following code:
use serde::Serialize;

let rekor_config = Configuration::default();
let rt = tokio::runtime::Runtime::new().unwrap();
let log_entry = rt.block_on(get_log_entry_by_index(&rekor_config, 25579)).unwrap();

// re-serialize the body with the OLPC canonical JSON formatter
let mut encoded = vec![];
let mut ser = serde_json::Serializer::with_formatter(
    &mut encoded,
    olpc_cjson::CanonicalFormatter::new(),
);
log_entry.body.serialize(&mut ser).unwrap();
println!("{}", String::from_utf8(encoded).unwrap());
returns:
{"apiVersion":"0.0.1","kind":"rekord","spec":{"data":{"hash":{"algorithm":"sha256","value":"ce9a7c82f32194995888758cf107ef0cc52e0b8cdce73b4240658ee9e73783cb"}},"signature":{"content":"MGUCMD3oKzgsGnPAkJEXegDIsdlh4BFCQbM6jng4Sy3axY/+2tlK97oe/CkxabT1ZXUqCAIxAJDq+zLfRZZEJD5DvaKhFEu+Jm+jD4UXc3CaZp2MSajiralmtalA6fSGCXjwGfUzOw==","format":"x509","publicKey":{"content":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUNrakNDQWhpZ0F3SUJBZ0lVQU0rK0dYRFN5bUNPSW82YmxMMG5EZngxb21nd0NnWUlLb1pJemowRUF3TXcKS2pFVk1CTUdBMVVFQ2hNTWMybG5jM1J2Y21VdVpHVjJNUkV3RHdZRFZRUURFd2h6YVdkemRHOXlaVEFlRncweQpNVEEzTWpnd09ETTNOREphRncweU1UQTNNamd3T0RVM05ERmFNQUF3ZGpBUUJnY3Foa2pPUFFJQkJnVXJnUVFBCklnTmlBQVJjMDMrUU4vTHBrOGpqUFQwTmV5a01ucm9mMnpZUkJxNm05ei9TMXhRSzduZnZhU3grRjUrTEtwN3gKR2ExbHY2SWNvRXdwUHA2MUdsYnd5S0VQVWJLdzJrYnJyRVpPMnhKV3kxb0VEUHBYMlJqcTBYS0RZcEF5Zi9mQwoyZzJjSnVtamdnRW5NSUlCSXpBT0JnTlZIUThCQWY4RUJBTUNCNEF3RXdZRFZSMGxCQXd3Q2dZSUt3WUJCUVVICkF3TXdEQVlEVlIwVEFRSC9CQUl3QURBZEJnTlZIUTRFRmdRVTRBUDhtTkI4ejhSZFJyTVVLZ1A2Mm0xUFErd3cKSHdZRFZSMGpCQmd3Rm9BVXlNVWRBRUdhSkNreVVTVHJEYTVLN1VvRzArd3dnWTBHQ0NzR0FRVUZCd0VCQklHQQpNSDR3ZkFZSUt3WUJCUVVITUFLR2NHaDBkSEE2THk5d2NtbDJZWFJsWTJFdFkyOXVkR1Z1ZEMwMk1ETm1aVGRsCk55MHdNREF3TFRJeU1qY3RZbVkzTlMxbU5HWTFaVGd3WkRJNU5UUXVjM1J2Y21GblpTNW5iMjluYkdWaGNHbHoKTG1OdmJTOWpZVE0yWVRGbE9UWXlOREppT1daallqRTBOaTlqWVM1amNuUXdIZ1lEVlIwUkFRSC9CQlF3RW9FUQpZM1JoWkdWMVFHZHRZV2xzTG1OdmJUQUtCZ2dxaGtqT1BRUURBd05vQURCbEFqRUE3TTJwSzhRUFRrSGs1bzZ0CmdnampZdjBLV1BUajRKUTAwU3RjR0xqa1g3SU1iNC9HdXpYRkQ4czZDOEd3NmpwMEFqQW1Xa2JROTVsMzlnUGQKR2pjUjBSQURaT0dYb0NPQURwOE5lSzhBL2dKdWdnR0ZINHZYZ2l1ODJsQm5MOEZSc09jPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg=="}}}}
while a direct call to the API:
curl 'https://rekor.sigstore.dev/api/v1/log/entries?logIndex=25579' | jq -r '.[ (keys_unsorted)[0] ].body' | base64 -d
returns:
{"apiVersion":"0.0.1","spec":{"data":{"hash":{"algorithm":"sha256","value":"ce9a7c82f32194995888758cf107ef0cc52e0b8cdce73b4240658ee9e73783cb"}},"signature":{"content":"MGUCMD3oKzgsGnPAkJEXegDIsdlh4BFCQbM6jng4Sy3axY/+2tlK97oe/CkxabT1ZXUqCAIxAJDq+zLfRZZEJD5DvaKhFEu+Jm+jD4UXc3CaZp2MSajiralmtalA6fSGCXjwGfUzOw==","format":"x509","publicKey":{"content":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUNrakNDQWhpZ0F3SUJBZ0lVQU0rK0dYRFN5bUNPSW82YmxMMG5EZngxb21nd0NnWUlLb1pJemowRUF3TXcKS2pFVk1CTUdBMVVFQ2hNTWMybG5jM1J2Y21VdVpHVjJNUkV3RHdZRFZRUURFd2h6YVdkemRHOXlaVEFlRncweQpNVEEzTWpnd09ETTNOREphRncweU1UQTNNamd3T0RVM05ERmFNQUF3ZGpBUUJnY3Foa2pPUFFJQkJnVXJnUVFBCklnTmlBQVJjMDMrUU4vTHBrOGpqUFQwTmV5a01ucm9mMnpZUkJxNm05ei9TMXhRSzduZnZhU3grRjUrTEtwN3gKR2ExbHY2SWNvRXdwUHA2MUdsYnd5S0VQVWJLdzJrYnJyRVpPMnhKV3kxb0VEUHBYMlJqcTBYS0RZcEF5Zi9mQwoyZzJjSnVtamdnRW5NSUlCSXpBT0JnTlZIUThCQWY4RUJBTUNCNEF3RXdZRFZSMGxCQXd3Q2dZSUt3WUJCUVVICkF3TXdEQVlEVlIwVEFRSC9CQUl3QURBZEJnTlZIUTRFRmdRVTRBUDhtTkI4ejhSZFJyTVVLZ1A2Mm0xUFErd3cKSHdZRFZSMGpCQmd3Rm9BVXlNVWRBRUdhSkNreVVTVHJEYTVLN1VvRzArd3dnWTBHQ0NzR0FRVUZCd0VCQklHQQpNSDR3ZkFZSUt3WUJCUVVITUFLR2NHaDBkSEE2THk5d2NtbDJZWFJsWTJFdFkyOXVkR1Z1ZEMwMk1ETm1aVGRsCk55MHdNREF3TFRJeU1qY3RZbVkzTlMxbU5HWTFaVGd3WkRJNU5UUXVjM1J2Y21GblpTNW5iMjluYkdWaGNHbHoKTG1OdmJTOWpZVE0yWVRGbE9UWXlOREppT1daallqRTBOaTlqWVM1amNuUXdIZ1lEVlIwUkFRSC9CQlF3RW9FUQpZM1JoWkdWMVFHZHRZV2xzTG1OdmJUQUtCZ2dxaGtqT1BRUURBd05vQURCbEFqRUE3TTJwSzhRUFRrSGs1bzZ0CmdnampZdjBLV1BUajRKUTAwU3RjR0xqa1g3SU1iNC9HdXpYRkQ4czZDOEd3NmpwMEFqQW1Xa2JROTVsMzlnUGQKR2pjUjBSQURaT0dYb0NPQURwOE5lSzhBL2dKdWdnR0ZINHZYZ2l1ODJsQm5MOEZSc09jPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg=="}}},"kind":"rekord"}
Note that the "kind" field is at the end of the JSON in the API response but in second place in the canonicalized format.
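The reordering can be reproduced with nothing but key sorting: canonical JSON formatters (both OLPC and, for plain ASCII keys, RFC 8785) emit object members in sorted order, which is why "kind" moves from last place to second. A std-only sketch using BTreeMap (which iterates keys in sorted order) to stand in for the formatter:

```rust
use std::collections::BTreeMap;

fn main() {
    // Top-level key order in the raw Rekor response body:
    let api_order = ["apiVersion", "spec", "kind"];

    // A BTreeMap iterates keys in sorted order, mirroring what a
    // canonical JSON formatter does to object members:
    let sorted: BTreeMap<&str, ()> = api_order.iter().map(|k| (*k, ())).collect();
    let canonical_order: Vec<&str> = sorted.keys().copied().collect();

    assert_eq!(canonical_order, ["apiVersion", "kind", "spec"]);
    println!("{:?}", canonical_order);
}
```

Since the leaf hash covers the exact bytes, any such reordering changes the computed root even when the JSON is semantically identical.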
Thanks for the thorough triage @gaetanww!
I won't speak too much on this as I haven't tried this out myself, but I suspect that the issue you're seeing regarding canonicalization may be related to the canonicalizer being used. This implementation currently uses olpc_cjson, while Rekor uses RFC 8785 canonicalization. (OLPC canonical JSON is regrettably distinct from RFC 8785 canonical JSON.) In the sign module, we've used json-syntax to canonicalize the LogEntry body in accordance with RFC 8785 rules.
np! Thanks for the pointer, yes that could be it. I think the following code should work (using the json_syntax crate with features ["canonicalize", "serde_json"]). I won't have time to try it out today though.
pub fn verify_inclusion(&self, rekor_key: &CosignVerificationKey) -> Result<(), SigstoreError> {
    self.verification
        .inclusion_proof
        .as_ref()
        .ok_or(UnexpectedError("missing inclusion proof".to_string()))
        .and_then(|proof| {
            // encode as canonical JSON (RFC 8785)
            let mut json_value = json_syntax::Value::from_serde_json(serde_json::to_value(&self.body)?);
            json_value.canonicalize();
            let encoded_entry = json_value.compact_print().to_string().into_bytes();
            proof.verify(&encoded_entry, rekor_key)
        })
}
I hopefully should be able to take a closer look at this some time next week, and apply the fix/update the PR.
One more data point on the issue above. It works perfectly without modification for index 71844338; however, my hand-rolled Merkle tree verification now fails. My hand-rolled implementation just uses the base64-decoded body from the JSON, so the underlying issue might not be canonicalization.
I haven't looked at it in a few months, but if I remember correctly there are two types of indices: one is the index in the log, and the other is the index in the tree. So you might be using the incorrect index.
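For context, a hedged sketch of that distinction, from memory of Rekor's sharded layout rather than from this PR: the global "log index" spans all shards, while Merkle proofs are relative to one shard's tree, so the two indices differ by the shard's starting offset. The offset value in the test below is made up for illustration.

```rust
// Illustrative only: the shard offset here is a made-up number, not a
// real Rekor value. Converts a global log index into an index within a
// shard's Merkle tree, failing if the entry predates the shard.
fn tree_index(log_index: u64, shard_start: u64) -> Option<u64> {
    log_index.checked_sub(shard_start)
}

fn main() {
    // In a single-shard log the two indices coincide:
    assert_eq!(tree_index(25_579, 0), Some(25_579));
    // An entry from before the shard's start has no index in this tree:
    assert_eq!(tree_index(100, 200), None);
    println!("ok");
}
```

Using the global index against a per-shard tree (or vice versa) makes the proof walk start from the wrong leaf position, which fails in the same way as a wrong hash.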
@gaetanww is your issue resolved?
Sorry for the slow response. I made it work for the latest Rekor records (e.g., index 71844338), but the early one (25579) still fails AFAICT.
I did some investigating; I think @gaetanww's original analysis was correct and the formatting of the log entry is the issue, as both serializers produce identical outputs for me.
The Go client also uses the base64-encoded entry provided by the log. Is this the correct way to handle this? I feel like it might conflict with the description of how SETs are supposed to be verified.
Yes, the documentation does conflict with the actual implementation. There should at least be a comment in the Go code to explain why it's implemented that way. I think we should raise the point with the maintainers of the cosign project. I'll open an issue.
This is failing on 32-bit targets because of the use of usize for bitwise arithmetic. Changing to u64 fixes it.
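A quick illustration of this class of failure, simulating a 32-bit usize with u32 (this is a sketch of the overflow class, not the exact expression in the PR):

```rust
fn main() {
    // Tree sizes and indices in RFC 6962-style logs are 64-bit
    // quantities, so they can exceed what a 32-bit usize holds.
    let tree_size: u64 = 1 << 40;

    // What a 32-bit usize would hold after an unchecked narrowing cast:
    let truncated = tree_size as u32 as u64;
    assert_eq!(truncated, 0); // the high bits are silently lost

    // With u64 arithmetic the bit operations behave as intended:
    assert_eq!(tree_size >> 32, 256);
    println!("ok");
}
```

On a 64-bit host usize happens to be wide enough, which is why the bug only surfaces on 32-bit targets.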
Howdy 👋 is there someone in the community equipped to adequately review this MR? We at 1Password would be happy to review elements of this work but the core team that has directed energy towards this project doesn't have previous experience with merkle tree proof implementations.
@tannaurus I'm going to look at the merkle tree proof as I previously did a lot of work on a different client there. However I would need someone else to do the rust review.
This is failing on 32-bit targets because of the use of usize for bitwise arithmetic. Changing to u64 fixes it.
@exFalso I will update it to u64; it seems like RFC 6962 also prescribes a uint64.
Sorry, my ability to read Rust is bad.
I'm a little confused by the edits in rekor/models -- those seem like generated files? But the added code doesn't seem generated? Maybe I'm just reading Rust wrong.
The inclusion proof looks like it's following the spec in RFC 6962; there's a slightly different implementation in RFC 9162 -- I don't know if it matters, but it's not recursive.
It might be valuable to add some tests on the rekor inclusion proof itself (rather than the sub-functions):
one random example (from the java client):
"inclusionProof": {
"hashes": [
"810320ec3029914695826d60133c67021f66ee0cfb09a6f79eb267ed9f55de2c",
"67e9d9f66f0ad388f7e1a20991e9a2ae3efad5cbf281e8b3d2aaf1ef99a4618c",
"16a106400c53465f6e18c2475df6ba889ca30f5667bacf32b1a5661f14a5080c",
"b4439e8d71edbc96271723cb7a969dd725e23e73d139361864a62ed76ce8dc11",
"49b3e90806c7b63b5a86f5748e3ecb7d264ea0828eb74a45bc1a2cd7962408e8",
"5059ad9b48fa50bd9adcbff0dd81c5a0dcb60f37e0716e723a33805a464f72f8",
"6c2ce64219799e61d72996884eee9e19fb906e4d7fa04b71625fde4108f21762",
"784f79c817abb78db3ae99b6c1ede640470bf4bb678673a05bf3a6b50aaaddd6",
"c6d92ebf4e10cdba500ca410166cd0a8d8b312154d2f45bc4292d63dea6112f6",
"1768732027401f6718b0df7769e2803127cfc099eb130a8ed7d913218f6a65f6",
"0da021f68571b65e49e926e4c69024de3ac248a1319d254bc51a85a657b93c33",
"bc8cf0c8497d5c24841de0c9bef598ec99bbd59d9538d58568340646fe289e9a",
"be328fa737b8fa9461850b8034250f237ff5b0b590b9468e6223968df294872b",
"6f06f4025d0346f04830352b23f65c8cd9e3ce4b8cb899877c35282521ddaf85"
],
"logIndex": 1227,
"rootHash": "effa4fa4575f72829016a64e584441203de533212f9470d63a56d1992e73465d",
"treeSize": 14358,
"checkpoint": "rekor.sigstage.dev - 108574341321668964\n14358\n7/pPpFdfcoKQFqZOWERBID3lMyEvlHDWOlbRmS5zRl0=\n\n— rekor.sigstage.dev 0y8wozBFAiB8OkuzdwlL6/rDEu2CsIfqmesaH/KLfmIMvlH3YTdIYgIhAPFZeXK6+b0vbWy4GSU/YZxiTpFrrzjsVOShN4LlPdZb\n"
},
and then another failure test where you mangle this data a bit.
@loosebazooka
I'm a little confused by the edits in rekor/models -- those seem like generated files? But the added code doesn't seem generated? Maybe I'm just reading rust wrong.
Yeah afaik it's a mix of auto-generated files from the OpenAPI spec which was then modified by hand.
The inclusion proof looks like it's following the spec in RFC 6962; there's a slightly different implementation in RFC 9162 -- I don't know if it matters, but it's not recursive.
If my understanding is correct, the semantics of the Merkle trees in RFC 6962 and RFC 9162 are identical. Also, cosign's Go implementation is based on RFC 6962.
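For reference, the non-recursive RFC 9162-style verification of an RFC 6962 tree is a short bit-walking loop. The sketch below uses a toy 64-bit hash (std's DefaultHasher) in place of SHA-256 so it runs without external crates; only the index/size walking logic is meant to illustrate the algorithm, and none of this is the PR's actual code.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Toy stand-ins for the RFC 6962 hash functions (0x00 leaf prefix,
// 0x01 interior-node prefix). NOT cryptographic; illustration only.
fn leaf_hash(data: &[u8]) -> Vec<u8> {
    let mut h = DefaultHasher::new();
    h.write_u8(0x00);
    h.write(data);
    h.finish().to_be_bytes().to_vec()
}

fn node_hash(left: &[u8], right: &[u8]) -> Vec<u8> {
    let mut h = DefaultHasher::new();
    h.write_u8(0x01);
    h.write(left);
    h.write(right);
    h.finish().to_be_bytes().to_vec()
}

// Recompute the root from a leaf hash, its index, the tree size, and the
// audit path, following the iterative algorithm in RFC 9162 section
// 2.1.3.2. Returns None if the proof shape is inconsistent.
fn root_from_inclusion_proof(
    leaf: &[u8],
    leaf_index: u64,
    tree_size: u64,
    proof: &[Vec<u8>],
) -> Option<Vec<u8>> {
    if leaf_index >= tree_size {
        return None;
    }
    let mut fnode = leaf_index;
    let mut snode = tree_size - 1;
    let mut hash = leaf.to_vec();
    for p in proof {
        if snode == 0 {
            return None; // proof is longer than the path to the root
        }
        if fnode & 1 == 1 || fnode == snode {
            hash = node_hash(p, &hash);
            if fnode & 1 == 0 {
                // Right-border node: skip levels with no left sibling.
                while fnode & 1 == 0 && fnode != 0 {
                    fnode >>= 1;
                    snode >>= 1;
                }
            }
        } else {
            hash = node_hash(&hash, p);
        }
        fnode >>= 1;
        snode >>= 1;
    }
    if snode == 0 {
        Some(hash)
    } else {
        None // proof was too short
    }
}

fn main() {
    // Tiny 3-leaf tree: root = node(node(l0, l1), l2).
    let leaves: Vec<Vec<u8>> =
        ["a", "b", "c"].iter().map(|d| leaf_hash(d.as_bytes())).collect();
    let left = node_hash(&leaves[0], &leaves[1]);
    let root = node_hash(&left, &leaves[2]);

    // Audit path for leaf 0: its sibling leaf 1, then leaf 2's subtree.
    let proof0 = vec![leaves[1].clone(), leaves[2].clone()];
    assert_eq!(root_from_inclusion_proof(&leaves[0], 0, 3, &proof0), Some(root.clone()));

    // Audit path for leaf 2 is just the left subtree's root.
    let proof2 = vec![left.clone()];
    assert_eq!(root_from_inclusion_proof(&leaves[2], 2, 3, &proof2), Some(root.clone()));

    // A mangled proof must not reproduce the root.
    let mut bad = proof2.clone();
    bad[0][0] ^= 0xff;
    assert_ne!(root_from_inclusion_proof(&leaves[2], 2, 3, &bad), Some(root));
    println!("ok");
}
```

Note the index/size variables are u64 throughout, which ties in with the 32-bit usize issue raised earlier in this thread.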
It might be valuable to add some tests on the rekor inclusion proof itself (rather than the sub-functions):
one random example (from the java client):
and then another failure test where you mangle this data a bit.
Good point! I added more tests based on Rekor responses to the code.
The inclusion proofs seem to work as expected, but I'm having issues with the consistency proofs. The doctest example fails as of right now. I'm pretty sure it used to work in the past, but I'm not 100% certain. I added a test that, to my understanding, should pass, but it fails.
Did something on the Rekor side change?
I tracked down the issue. The following sequence of events triggers it:

1. The client trusts a tree with root hash root_hash_1.
2. The tree grows, and its root hash becomes root_hash_2.
3. A consistency proof starting from root_hash_1 is requested; the response has root_hash_2 (as it is the root hash of the tree at the time the proof was requested, this is the specified behavior).
4. The verification compares against root_hash_2 instead of using root_hash_1.

This is just a bug in my implementation.
I pushed a fix that should address the issue. I still want to improve the tests for the consistency proofs.
I'm a little confused by the edits in rekor/models -- those seem like generated files? But the added code doesn't seem generated? Maybe I'm just reading rust wrong. Yeah afaik it's a mix of auto-generated files from the OpenAPI spec which was then modified by hand.
We haven't integrated it into the rest of sigstore-rs yet, but I'm flagging our fully autogenerated OpenAPI client for later if anyone would like to try using it :)
Summary
Related to: #283
This adds implementations for:
- Merkle inclusion and consistency proof verification
- checkpoint parsing and verification
The Merkle proofs are essentially ports of the transparency-dev implementations, including the test suite. The checkpoint-related code is based on the implementation in the rekor Go package.
I also changed the Rekor models to use the new SignedCheckpoint type, which implements the serde traits. For now I think this is the only major breaking change; the rest are mostly new or private APIs.
I have not implemented the logic to verify that checkpoints and the corresponding consistency/inclusion proofs are sound together. I want to discuss how to handle this here.
Release Note
- Added new variants to the SigstoreError enum
- Added a Checkpoint type to handle verification of checkpoints/STHs
- Changed the signed_tree_head field from String to Checkpoint
- Added checkpoint: Option<Checkpoint> to the InclusionProof struct

Documentation
- Updated docs for LogInfo and LogEntry due to changes to struct fields; this should only require minor changes

Todos: