tytso / e2fsprogs

Ext2/3/4 file system utilities
http://ext4.wiki.kernel.org

Set fixed times for inode to make sure that we can have reproducible builds #164

Closed by uglym8 4 months ago

uglym8 commented 10 months ago

There's a small issue that has to be fixed before you can create a reproducible ext4 FS from a source directory. Currently, https://reproducible-builds.org/docs/system-images/ suggests the following:

"Instead of using mkfs.ext, make_ext4fs can be used. make_ext4fs is creating the whole filesystem at once."

Yes, make_ext4fs can create reproducible filesystems, but it lacks proper hard link support. (I'm about to send patches that fix this issue as well, so in the end everyone will have a choice between mkfs.ext4 and make_ext4fs, but to the best of my knowledge the latter has been phased out.)

You can test these changes by first creating a FS from a standalone directory tree twice in a row and comparing the SHA256 hashes of the resulting images (or block devices, whichever you choose to use):

$ E2FSPROGS_FAKE_TIME=0x4f5b6f7c e2fsprogs/build/misc/mke2fs -E hash_seed=440aa176-97fa-475a-8209-d96eff4a0c90 -t ext4 -b 4K -U clear 576M_3.3.16.0.bin -d 3.3.16.0.proper.var

You'll notice that the resulting images differ.
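The comparison step can be sketched as a small shell helper. This is only an illustration: `same_sha256`, `image_a.bin`, and `image_b.bin` are hypothetical names, and the usage comment assumes two fresh, equally sized image files already exist, because the command above passes no size to mke2fs.

```shell
#!/bin/sh
# Hypothetical helper (not part of e2fsprogs): succeed only if both files
# have the same SHA-256 digest. Returns 2 if either file cannot be read.
same_sha256() {
    a=$(sha256sum < "$1") || return 2
    b=$(sha256sum < "$2") || return 2
    [ "$a" = "$b" ]
}

# Usage sketch, reusing the options from the command above. The image names
# are assumptions and the files are assumed to be pre-allocated:
#
#   for img in image_a.bin image_b.bin; do
#       E2FSPROGS_FAKE_TIME=0x4f5b6f7c e2fsprogs/build/misc/mke2fs \
#           -E hash_seed=440aa176-97fa-475a-8209-d96eff4a0c90 \
#           -t ext4 -b 4K -U clear "$img" -d 3.3.16.0.proper.var
#   done
#   same_sha256 image_a.bin image_b.bin && echo identical || echo different
```

Reading each file through a redirect keeps the file name out of the `sha256sum` output, so the two result strings can be compared directly.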

Then apply this tiny change and repeat; this time both images are identical.

P.S. There might be cases (say, when using different options) that could still lead to a non-reproducible FS. I suggest handling those on an as-needed basis (once a problem is reported).

zokier commented 6 months ago

If I'm understanding the situation correctly, the problem you are facing might be that mke2fs changes the atimes of files when it reads them from the source directory, so the timestamps differ when you rebuild the FS? A simple workaround might be to run touch to set the timestamps in the source directory to fixed values before running mke2fs. With this in mind, I'm not sure this is something that should be addressed in e2fsprogs, considering that the FS build process, as I understand it, is already reproducible as long as the input stays fixed.
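A minimal sketch of the touch workaround, assuming GNU `find` and GNU `touch` (for `-h` and `-d @SECONDS`). `pin_times` is a hypothetical helper name, and the epoch value in the usage comment is simply the decimal form of the 0x4f5b6f7c used earlier in this thread. Note that touch can only set atime and mtime; it cannot set ctime, so this does not pin every inode timestamp in the source tree.

```shell
#!/bin/sh
# Hypothetical helper: set atime and mtime of every entry under a directory
# (including the directory itself) to a fixed epoch value.
#   $1: root of the source tree
#   $2: timestamp in decimal seconds since the epoch
pin_times() {
    # -h makes touch act on symlinks themselves rather than their targets.
    find "$1" -exec touch -h -d "@$2" {} +
}

# Usage sketch, with the source directory from the original report:
#   pin_times 3.3.16.0.proper.var 1331392380
```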

All that being said, I suspect that #118 will be the best way forward to get reproducible results in the long term.

tytso commented 4 months ago

SOURCE_DATE_EPOCH has been implemented, with timestamp clamping, in v1.47.1-rc1+, which should provide the functionality that you are looking for.
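For anyone switching from the hex E2FSPROGS_FAKE_TIME value used earlier in this thread, the sketch below converts it to the decimal seconds that SOURCE_DATE_EPOCH conventionally holds. `hex_to_epoch` is a hypothetical helper name. The usage comment assumes the remaining options from the original command are kept unchanged; the thread does not say whether they are still required.

```shell
#!/bin/sh
# Hypothetical helper: print a C-style integer constant (for example a
# 0x-prefixed hex value) as decimal seconds since the epoch.
hex_to_epoch() {
    printf '%d\n' "$1"
}

# Usage sketch (needs an mke2fs from v1.47.1-rc1 or later, per the comment above):
#   SOURCE_DATE_EPOCH=$(hex_to_epoch 0x4f5b6f7c) mke2fs \
#       -E hash_seed=440aa176-97fa-475a-8209-d96eff4a0c90 \
#       -t ext4 -b 4K -U clear 576M_3.3.16.0.bin -d 3.3.16.0.proper.var
```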