Our qcow2 image is much bigger than it should be

nyh commented 10 years ago

Pekka noticed that our default build image (with Java, JRuby, etc.) is 400 MB in size, while its actual content is only 200 MB. My guess is that during our image build (which involves running OSv itself to write the files to the ZFS filesystem), ZFS for some reason writes more than it should (a journal, temporary storage, or something similar), and this extra data never gets cleaned up, causing our qcow2 image to grow unnecessarily.

On Jan 27, 2014, Raphael wrote a patch, "image: Save disk space by disabling ZIL on upload manifest phase", which supposedly solves most of this problem. I didn't see any serious objection to this patch, but not much enthusiasm either (I don't know why), and it was never accepted. We should reconsider this patch, or explain why it was rejected.

Avi Kivity also suggested that the problem might be solved by enabling "TRIM/DISCARD support". Dor opened an issue for this (#207).

raphaelsc commented 10 years ago

Replying about why the patch "image: Save disk space by disabling ZIL on upload manifest phase" was rejected:

Currently, ZFS no longer provides a flag that completely disables the ZIL (although the same goal can now be achieved by setting the sync property of the pool to disabled). Instead, it provides a flag called zil_replay_disable, which deletes the ZIL content left over by the previous instance at mount time, consequently discarding any replay. My patch used the latter.

I really cannot understand how the ZIL is being carried over (not completely sure yet), given that both mkfs and cpiod do an explicit sync(). It occurs to me that when the ZIL is replayed in the mkfs instance, the image inflates to that size and deflates to 200MB afterwards. It could also be that /zfs/zfs and /zfs not being unmounted (at the time of the report) is the reason for the overall problem. Who knows?! By the way, I sent a patch to fix this problem, and it was committed a few hours ago.

Here is the commit: https://github.com/cloudius-systems/osv/commit/8e31549c3aa65b8f7c67d6770d60f1cff7c768dd.

Before/after the commit: usr.img size is really good when building OSv with 'make' (about 157MB).

dorlaor commented 10 years ago

On Fri, Mar 28, 2014 at 5:08 AM, Raphael S. Carvalho <notifications@github.com> wrote:

> Replying about why my patch was rejected: Currently, ZFS no longer provides a flag that completely disables the ZIL (though the same goal can now be achieved by setting the sync property of the pool to disabled), but instead a flag called zil_replay_disable, which deletes the ZIL content left over by the previous instance. I really cannot understand how the ZIL is being carried over given that both mkfs and cpiod do an explicit sync(). Then it occurs to me that when the ZIL is replayed from mkfs, the image inflates to that size, and then deflates to 200MB. I could actually say

A qcow2 image can't shrink just like that. Some action needs to be taken in order to do that, and I'm not even sure it has trim support. If blocks were allocated at a remote offset in the file, they will stay allocated even if the guest fs removes them. We need to go over qemu-img or other tools to enumerate the active block list.

> that /zfs/zfs and /zfs not being unmounted might be the reason. I just sent a patch to fix that a few hours ago, who knows? The respective commit: 8e31549 (https://github.com/cloudius-systems/osv/commit/8e31549c3aa65b8f7c67d6770d60f1cff7c768dd). Nadav, could you please check this issue again with the above commit applied?
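As a rough illustration of enumerating the allocated blocks with qemu-img, here is a minimal Python sketch. It assumes a qemu-img binary that supports `map --output=json`; the helper names (`allocated_bytes`, `map_image`) are made up for this example and are not part of the OSv build scripts. Note that this reports what qcow2 itself considers allocated, so it will still count blocks that the guest filesystem wrote and later freed without discarding them.

```python
import json
import os
import subprocess
import sys


def allocated_bytes(extents):
    """Sum the lengths of the extents that qemu-img marks as holding data."""
    return sum(e["length"] for e in extents if e.get("data"))


def map_image(path):
    """Return the extent list printed by `qemu-img map --output=json`.

    Assumes qemu-img is installed and on PATH.
    """
    result = subprocess.run(
        ["qemu-img", "map", "--output=json", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__" and len(sys.argv) > 1:
    image = sys.argv[1]  # e.g. the usr.img produced by the build
    data = allocated_bytes(map_image(image))
    print(f"file size on disk : {os.path.getsize(image)} bytes")
    print(f"guest data extents: {data} bytes")
```

Run it as, for example, `python qcow2_map.py usr.img` (the script name is arbitrary). The file size is expected to be somewhat larger than the data extents because the image also stores qcow2 metadata. If the two numbers are close to each other but far above the real content size, the extra space is probably stale guest data, and rewriting the image with `qemu-img convert` will not reclaim it unless those blocks read back as zeros or were discarded.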


wkozaczuk commented 5 years ago

Is it still true that "our qcow2 image is much bigger than it should be"? Is this issue still relevant enough to keep it open?

cloudius-systems / osv

Our qcow2 image is much bigger than it should be #211