monzo / aws-nitro-util

Utilities to reproducibly build images for AWS Nitro Enclaves
MIT License

EIF bit-by-bit reproducibility vs Nitro Util #19

Closed roshanr95 closed 3 months ago

roshanr95 commented 3 months ago

The README says "no, EIFs are timestamped" against nitro-cli for reproducibility, which might be worth clarifying. The nitro-cli tool itself produces reproducible builds given the same Docker image (on the same system for sure, probably across different systems as well). It's the Docker images themselves that are usually timestamped.

cottand commented 3 months ago

Hi @roshanr95 !

Docker images definitely tend to be timestamped and a source of indeterminism, but I do believe the Nitro CLI sets another timestamp outside of the image's.

My understanding of where the timestamp comes from is based on reading the enclave_build folder in the Nitro CLI source. Specifically, I see a timestamp included in the EifBuildInfo struct. I believe that is constructed here:

let build_info = generate_build_info!(&format!("{}.config", kernel_path)).map_err(|e| {

And then that function uses a new timestamp each time it is called, as seen in enclave-format, inside generate_build_info():

let now = Utc::now();

Ok(EifBuildInfo {
    build_time: now.to_rfc3339(),
})

More empirically, when we developed this repo, we had to set the timestamp manually here in order to override what generate_build_info() produced. Additionally, when we did use the Nitro CLI, if I remember correctly we got EIFs with different hashes even when using the exact same Docker image twice.
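
To illustrate the override described above, here is a minimal sketch, not the actual code from this repo or from the Nitro CLI. `BuildInfo` and `make_build_info` are hypothetical names invented for the example, and the wall-clock branch uses only the standard library instead of chrono. The point is only that build metadata becomes deterministic once the caller can pin the timestamp:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical stand-in for EifBuildInfo, reduced to the one field discussed here.
#[derive(Debug, Clone, PartialEq, Eq)]
struct BuildInfo {
    build_time: String,
}

/// Builds the metadata. With `None` the wall clock is read, so repeated calls
/// generally differ; with `Some(..)` the caller pins the value and the output
/// is identical on every call.
fn make_build_info(fixed_time: Option<&str>) -> BuildInfo {
    let build_time = match fixed_time {
        Some(t) => t.to_string(),
        None => {
            let nanos = SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .expect("system clock is before the Unix epoch")
                .as_nanos();
            format!("{} ns since epoch", nanos)
        }
    };
    BuildInfo { build_time }
}

fn main() {
    // Pinned timestamp: both builds carry identical metadata.
    let a = make_build_info(Some("1970-01-01T00:00:00+00:00"));
    let b = make_build_info(Some("1970-01-01T00:00:00+00:00"));
    assert_eq!(a, b);
    println!("pinned: {:?}", a);

    // Wall clock: the value depends on when the build ran.
    println!("clock:  {:?}", make_build_info(None));
}
```

Passing the timestamp in as a parameter, rather than reading the clock inside the function, is what lets a reproducible build choose a constant value.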

I am by no means a Rust developer, so I could have misread and the timestamps in EIFs could come from elsewhere. Do correct me if this is the case! I would hate for the README to be inaccurate.

roshanr95 commented 3 months ago

Hmm, when I run nitro-cli build-enclave --docker-uri <tag> --output-file <eif>, I always get the same PCRs on repeated attempts. Is that not the same for you? I'm now wondering why that is the case when the code seems to clearly be using timestamps in the metadata :sweat_smile:

cottand commented 3 months ago

Ah, that's because the PCRs do not include metadata (which is where the timestamp is). See this article for a good explanation. If you build the same EIF twice, you get the same PCRs, but not the same SHA256 of the file itself.

Image from that article further below (emphasis mine). See how EifSectionMetadata is not part of any PCR.

This repo, on the other hand, will actually produce EIF files with the same SHA256 every time (assuming you always use the same input for the copyToRoot part)
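
The difference between "same PCRs" and "same SHA256" can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyImage` is not the real EIF layout, the section names are made up, and the standard library's `DefaultHasher` stands in for the real digests (actual PCRs are SHA-384 values and are computed differently). The only thing the sketch shows is that a measurement which skips the metadata section stays stable while a hash over the whole file changes with the timestamp:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Toy image with three byte sections (illustrative, not the real EIF format).
struct ToyImage {
    kernel: Vec<u8>,
    ramdisk: Vec<u8>,
    metadata: Vec<u8>, // carries the build timestamp
}

/// Stand-in digest over the concatenation of the given parts.
fn digest(parts: &[&[u8]]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for part in parts {
        hasher.write(part);
    }
    hasher.finish()
}

/// PCR-like measurement: covers kernel and ramdisk, skips metadata.
fn measurement(img: &ToyImage) -> u64 {
    digest(&[img.kernel.as_slice(), img.ramdisk.as_slice()])
}

/// Whole-file hash: covers every section, including metadata.
fn file_hash(img: &ToyImage) -> u64 {
    digest(&[
        img.kernel.as_slice(),
        img.ramdisk.as_slice(),
        img.metadata.as_slice(),
    ])
}

/// Builds the same payload every time; only the timestamp in metadata varies.
fn toy_build(timestamp: &str) -> ToyImage {
    ToyImage {
        kernel: b"kernel bytes".to_vec(),
        ramdisk: b"ramdisk bytes".to_vec(),
        metadata: timestamp.as_bytes().to_vec(),
    }
}

fn main() {
    let first = toy_build("2024-06-17T16:37:34+00:00");
    let second = toy_build("2024-06-17T16:38:02+00:00");

    // Same payload, so the PCR-like measurement matches...
    assert_eq!(measurement(&first), measurement(&second));
    // ...but the whole-file hash differs because the metadata differs.
    assert_ne!(file_hash(&first), file_hash(&second));
    println!("measurements match, file hashes differ");
}
```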

[Image: Screenshot 2024-06-17 at 16 37 34]

cottand commented 3 months ago

This is why in the README we say "Bit-by-bit reproducible EIFs". Let me know if you think some different wording might make that clearer.

cottand commented 3 months ago

hey @roshanr95 , given the README was not inaccurate, I will be closing the issue now. Do let me know if you feel the explanation was not satisfactory or there is better wording we could have used. Thanks!

roshanr95 commented 3 months ago

Ah got it, I misunderstood.