Open lukegb opened 3 years ago
Some crumbs from #nixos-infra
[12:15:03] <samueldr> same build
[12:15:03] <samueldr> https://gist.github.com/samueldr/67534945d56489a6747acdbd4223a5be
[12:15:06] <samueldr> ran 20 times
[12:15:22] <samueldr> only difference is a comment in the script with #${toString builtins.currentTime}
[12:15:51] <samueldr> we have three builds that look more "normal" comparing with x86_64 equivalent builds
[12:20:52] <lukegb> samueldr: does the same thing happen with x86_64?
[12:21:08] <samueldr> I didn't run in loop
[12:21:10] <samueldr> but out of 5 local builds
[12:21:21] <samueldr> looks basically the same, few bytes difference in the result
[12:21:31] <samueldr> not half a gigabyte
[12:22:18] <samueldr> so I'm really thinking that something on aarch64 acts just different enough to sometimes cause... weirdness?
[12:24:48] <samueldr> that inconsistency is troubling
[12:29:00] <gchristensen> very
[12:29:00] <samueldr> restarted with discard param
[12:29:39] <samueldr> also, 4 out of 20 times, that's around 80% of the time doing the "weird" thing
[12:29:53] <samueldr> at the very least it's easier to get a good feeling that things are going right
TLDR:
On x86_64, the output of qemu-img (the vhd file) is always "pretty well resized".
On aarch64, about 20% of the time it will resize "well", like we expect, but the rest of the time it barely sizes down.
This was tested on the community builder by adding a # ${toString builtins.currentTime} comment to the postVM attribute, so the change only applies right after the VM has run. Normally this means the build shouldn't be impacted in any major way.
It has yet to be explored why qemu-img sometimes fails to... sparsify?? the image. It does not look like it's about filesystem discard, as setting the proper config for the VM and running e2fsck to discard things doesn't help.
If I were to look at the issue right now, I would try getting a raw disk image that fails to sparsify well, and manually run qemu-img on it a couple of times to see if the results change.
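A minimal sketch of that experiment, not something that was actually run here. It assumes the vhd is produced with qemu-img's vpc output format; the function names convert_image and compare_conversions are made up for illustration. Each run prints the run number followed by the CRC and byte count of the converted file, so runs that differ stand out.

```shell
# Hypothetical helper: convert the same raw image once.
# Assumption: the image is written with the vpc output format.
convert_image() {
  qemu-img convert -f raw -O vpc "$1" "$2"
}

# Convert RAW_IMAGE several times and print "<run> <crc> <bytes>" per run.
# Usage: compare_conversions RAW_IMAGE [RUNS]   (RUNS defaults to 20)
compare_conversions() {
  raw=$1
  runs=${2:-20}
  tmp=$(mktemp -d) || return 1
  i=1
  while [ "$i" -le "$runs" ]; do
    out="$tmp/run-$i.vhd"
    convert_image "$raw" "$out" || { rm -rf "$tmp"; return 1; }
    # cksum on stdin prints "<crc> <byte count>"; identical outputs give identical pairs.
    printf '%s %s\n' "$i" "$(cksum < "$out")"
    # Remove each result right away, since the images can be large.
    rm -f "$out"
    i=$((i + 1))
  done
  rm -rf "$tmp"
}
```

Something like `compare_conversions ./nixos.raw 20` (the path is a placeholder) on the aarch64 builder would show whether the conversion alone is inconsistent: if every run prints the same pair, the variation more likely comes from the raw image itself than from qemu-img.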
Describe the bug
For aarch64 (only, it seems), the amazonImage exceeds the output size limit on Hydra: https://hydra.nixos.org/build/142425973
This is channel-blocking.
Notify maintainers
@samueldr @grahamc
Metadata
Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.
Maintainer information: