scylladb / scylla-machine-image

Apache License 2.0

[ami,gce]: reduce snapshot size to minimum allowed #491

Closed yaronkaikov closed 8 months ago

yaronkaikov commented 10 months ago

Every AMI we create today holds a 30GB EBS snapshot. Because we copy those images to other regions and keep multiple images (dev, debug, releases), this adds up in cost.

This commit reduces the size of the snapshot to 8GB (the minimum value allowed for a snapshot).

In addition, it also reduces the build time from 30 minutes to ~12 minutes (which is also good).

Refs: https://github.com/scylladb/scylla-pkg/issues/3712

yaronkaikov commented 10 months ago

fixed typos in commit message :-)

benipeled commented 10 months ago

This change might cause disk-space issues due to growing log files etc. at the OS level. Please make sure this change is fully verified with QA tests (both tier1 and tier2). Other than that, LGTM.

yaronkaikov commented 10 months ago

This change might cause disk-space issues due to growing log files etc. at the OS level. Please make sure this change is fully verified with QA tests (both tier1 and tier2). Other than that, LGTM.

Running https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/longevity/job/longevity-100gb-4h-test/32/console, the log shows that the instance has a 30GB root disk:

12:11:00  root_disk_size_db: 30

So I assume it's good. I will wait for @fruch to review as well, to make sure.

yaronkaikov commented 10 months ago

Verified with https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/ami/282/ and https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/ami/283/ which also trigger https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/longevity/job/longevity-100gb-4h-test/32/

fruch commented 10 months ago

What about the swap? Isn't that created at build time? Or is it not anymore?

fruch commented 10 months ago

Also, I think SCT would be OK with that, but places like CloudFormation might have usability issues and should be extended to pick bigger root disks.

fruch commented 10 months ago

Also, what about GCP/Azure? Do we have smaller disks there?

yaronkaikov commented 10 months ago

What about the swap? Isn't that created at build time? Or is it not anymore?

Swap creation, including the calculation, is handled in https://github.com/scylladb/scylla-machine-image/blob/next/packer/scylla_install_image

yaronkaikov commented 10 months ago

Also, what about GCP/Azure? Do we have smaller disks there?

No, they are still using 30GB. I will adjust it in a follow-up patch.

yaronkaikov commented 10 months ago

Also, I think SCT would be OK with that, but places like CloudFormation might have usability issues and should be extended to pick bigger root disks.

Not sure if anyone is using the cloudformation, and even if someone is using it, shouldn't it be for POC or testing/evaluating Scylla?

yaronkaikov commented 10 months ago

@fruch Added another commit with the cleanup.

yaronkaikov commented 10 months ago

Also, what about GCP/Azure? Do we have smaller disks there?

No, they are still using 30GB. I will adjust it in a follow-up patch.

Only GCE has this configuration. I am trying to do it in this PR to see if it works.

yaronkaikov commented 10 months ago

@fruch Also added GCE with the minimal disk size (10GB) - verified with https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/gce-image/95/ and also with https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/longevity/job/longevity-10gb-3h-gce-test/22/

fruch commented 10 months ago

What about the swap? Isn't that created at build time? Or is it not anymore?

Swap creation, including the calculation, is handled in https://github.com/scylladb/scylla-machine-image/blob/next/packer/scylla_install_image

That was the reason for enlarging the root disk in the first place, so we could create the swap upfront. If we don't do that anymore, it shouldn't be an issue to resize it back to the minimum size.

syuu1228 commented 10 months ago

Every AMI we create today holds a 30GB EBS snapshot. Because we copy those images to other regions and keep multiple images (dev, debug, releases), this adds up in cost.

This commit reduces the size of the snapshot to 8GB (the minimum value allowed for a snapshot).

We increased the rootfs size from 10GB to 30GB because there was not enough disk space for the swapfile (c22807f53309b5ee0bdca93492bd8c3b408a01a5). Reducing the image size may work, but our setup script automatically shrinks the swap size to 'half of diskfree'.

(The recommended swap size for Scylla is either total_mem/3 or 16GB, whichever is lower, so 10GB of rootfs is almost always not enough.)
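
As a rough illustration of the sizing rule described above, the sketch below computes the recommended swap and then caps it at half of the free disk space. The function names are hypothetical and this is not the code in scylla_install_image; treating "GB" as GiB and applying "half of diskfree" as a simple upper bound are both assumptions.

```python
GIB = 1024 ** 3  # assumption: "GB" in the discussion is treated as GiB


def recommended_swap_bytes(total_mem_bytes: int) -> int:
    """Lower of total_mem/3 and 16 GiB, per the rule quoted above."""
    return min(total_mem_bytes // 3, 16 * GIB)


def effective_swap_bytes(total_mem_bytes: int, disk_free_bytes: int) -> int:
    """Cap the recommended swap at half of the free disk space (assumed behaviour)."""
    return min(recommended_swap_bytes(total_mem_bytes), disk_free_bytes // 2)


if __name__ == "__main__":
    mem = 64 * GIB
    for free_gib in (6, 26):
        swap = effective_swap_bytes(mem, free_gib * GIB)
        print(f"free={free_gib} GiB -> swap={swap / GIB:.1f} GiB")
```

Under these assumptions, a 64 GiB instance would want 16 GiB of swap, but with only 6 GiB free on the root disk it would end up with 3 GiB.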

I think we should probably move the swapfile to /var/lib/scylla (the data volume) instead of rootfs; then we will have enough space without enlarging rootfs. We are already doing a similar thing in Azure: Azure has a "Resource Disk", which is the disk just for the swapfile, so we allocate the swapfile there.

roydahan commented 9 months ago

We can have the AMI set to 10GB, and in the documentation or anywhere it's used (terraform?) we ask users to set a bigger disk. However, this must be coordinated with product and with all other places that use it.

yaronkaikov commented 9 months ago

Verified https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/next-machine-image/257/ (including rebased)

yaronkaikov commented 9 months ago

We can have the AMI set to 10GB, and in the documentation or anywhere it's used (terraform?) we ask users to set a bigger disk. However, this must be coordinated with product and with all other places that use it.

@roydahan When someone wants to set up an instance based on our images, they must set the size of the root disk anyway. So I don't think this should have any impact on that.
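
For illustration, a launcher that sets the root disk explicitly could build its block device mapping roughly like this. The dictionary follows the shape of the EC2 `BlockDeviceMappings` launch parameter; the helper name, the `/dev/sda1` root device name, and the 8GB image size are assumptions, not taken from this PR.

```python
def root_volume_mapping(size_gb, image_root_gb=8, device_name="/dev/sda1"):
    """Build a BlockDeviceMappings-style list that overrides the root volume size.

    image_root_gb is the size baked into the image snapshot; a smaller
    request is rejected here because the root volume cannot be smaller
    than the snapshot it is created from.
    """
    if size_gb < image_root_gb:
        raise ValueError(
            f"root disk {size_gb}GB is smaller than the image snapshot ({image_root_gb}GB)"
        )
    return [
        {
            "DeviceName": device_name,  # assumed root device name
            "Ebs": {"VolumeSize": size_gb, "DeleteOnTermination": True},
        }
    ]


if __name__ == "__main__":
    # A tool that already asks for 30GB is unaffected by a smaller image.
    print(root_volume_mapping(30))
```

A tool that passes such a mapping at launch time gets the same root disk size regardless of whether the image snapshot is 8GB or 30GB.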

@tzach Any reason not to reduce the size to the minimum?

yaronkaikov commented 9 months ago

@benipeled @Annamikhlin Please review

benipeled commented 9 months ago

@benipeled @Annamikhlin Please review

Nothing changed since my last review - https://github.com/scylladb/scylla-machine-image/pull/491#issuecomment-1840379162 - if it's verified by SCT and the swap change handled, go ahead

roydahan commented 9 months ago

I looked in our documentation and couldn't find, anywhere in the AMI launch docs for ScyllaDB, an explanation of how much to configure for the root FS (swap, but also logs and other things). This is what I found: https://opensource.docs.scylladb.com/stable/operating-scylla/procedures/cluster-management/ec2-dc.html https://opensource.docs.scylladb.com/stable/getting-started/install-scylla/launch-on-aws.html https://www.scylladb.com/product/release-notes/scylla-ami-for-aws/

yaronkaikov commented 9 months ago

I looked in our documentation and couldn't find, anywhere in the AMI launch docs for ScyllaDB, an explanation of how much to configure for the root FS (swap, but also logs and other things). This is what I found: https://opensource.docs.scylladb.com/stable/operating-scylla/procedures/cluster-management/ec2-dc.html https://opensource.docs.scylladb.com/stable/getting-started/install-scylla/launch-on-aws.html https://www.scylladb.com/product/release-notes/scylla-ami-for-aws/

I am not sure we have such documentation.

We do have https://opensource.docs.scylladb.com/stable/kb/set-up-swap.html, so I am not sure why we don't set it during scylla_image_setup. It shouldn't take long.

tzach commented 9 months ago

@tzach Any reason not to reduce the size to the minimum?

Not that I'm aware of. Primarily, we need to run all standard tests with this image.

roydahan commented 9 months ago

Not that I'm aware of. Primarily, we need to run all standard tests with this image.

It won't matter because the tests are configuring it with the required root FS size that will have enough space for swap. The product question here is about the fact that we won't have the minimum size by default and one needs to set it correctly when deploying from our AMI.

TBH, I think it's a change that confuses users and may break some users' automation.

tzach commented 9 months ago

It won't matter because the tests are configuring it with the required root FS size that will have enough space for swap.

@roydahan Does it make sense to do this automatically for all users?

I'm not comfortable updating production image settings without testing!

fruch commented 9 months ago

It won't matter because the tests are configuring it with the required root FS size that will have enough space for swap.

@roydahan Does it make sense to do this automatically for all users?

The only way you can force people to set bigger root disks is if the image root disk is bigger, so that if they set it too small, creating the instance fails.

you can automate in any other way, if you are giving them ami-id.

Now it's 30GB, and no one can create something smaller. The by-product is that all of the images we create take 30GB of storage, and that's why Yaron is trying to make it smaller.

It's a UX question: if users were counting somewhere in their process on the default size being OK for them, with the swap size that comes by default, then on the next release they might get something else out of the box that won't have enough swap.

All other users, who specify the root disk size regardless, won't notice it (they already use >=30GB now). I think scylla-cloud also does that and doesn't rely on the default.

I'm not comfortable updating production image settings without testing!

It would be tested once it hits master, but with tools that all specify a root disk size of >=30GB.

roydahan commented 9 months ago

TBH I think this change isn't worth the pain and bad UX for fresh installs. I suggest we don't do this for now.

mykaul commented 9 months ago

TBH I think this change isn't worth the pain and bad UX for fresh installs. I suggest we don't do this for now.

It is thousands of dollars on EBS costs, btw. If not now - when?

roydahan commented 9 months ago

TBH I think this change isn't worth the pain and bad UX for fresh installs. I suggest we don't do this for now.

It is thousands of dollars on EBS costs, btw. If not now - when?

It's thousands of dollars if you don't delete the snapshots, but AFAIU we do now, and don't keep them for more than 180 days. So these extra 22GB cost us $1.76 per month per AMI ($0.08 * 22GB). Assuming we have around 1 AMI per day on average, it shouldn't cost thousands of dollars, probably a few hundred.
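
The arithmetic above can be sketched as follows. The price, sizes, retention period, and AMI rate are the figures quoted in this comment; reading "1 AMI per day, kept up to 180 days" as about 180 AMIs retained at any time is an assumption, as are the function names.

```python
PRICE_PER_GB_MONTH = 0.08  # USD, figure quoted in the comment above
OLD_SIZE_GB = 30
NEW_SIZE_GB = 8
RETENTION_DAYS = 180
AMIS_PER_DAY = 1


def monthly_saving_per_ami(old_gb, new_gb, price):
    """Monthly cost of the extra snapshot space for a single AMI."""
    return (old_gb - new_gb) * price


def steady_state_monthly_saving(amis_per_day, retention_days, per_ami):
    """Monthly cost once the retention window is full of AMIs (assumed model)."""
    return amis_per_day * retention_days * per_ami


if __name__ == "__main__":
    per_ami = monthly_saving_per_ami(OLD_SIZE_GB, NEW_SIZE_GB, PRICE_PER_GB_MONTH)
    total = steady_state_monthly_saving(AMIS_PER_DAY, RETENTION_DAYS, per_ami)
    print(f"per AMI: ${per_ami:.2f}/month")        # per AMI: $1.76/month
    print(f"steady state: ${total:.2f}/month")     # steady state: $316.80/month
```

Under that reading the extra space costs a few hundred dollars per month for one region, which matches the "few hundred" estimate.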

I think it's worth that cost to avoid the user experience of launching our AMI with the default values and having Scylla fail to set up because there is not enough disk space to configure swap.

We're spending much much more money on instances that people don't terminate and live for months...

mykaul commented 9 months ago

TBH I think this change isn't worth the pain and bad UX for fresh installs. I suggest we don't do this for now.

It is thousands of dollars on EBS costs, btw. If not now - when?

It's thousands of dollars if you don't delete the snapshots, but AFAIU we do now, and don't keep them for more than 180 days. So these extra 22GB cost us $1.76 per month per AMI ($0.08 * 22GB). Assuming we have around 1 AMI per day on average, it shouldn't cost thousands of dollars, probably a few hundred.

Just 1 AMI per day would be great - but it's per region, no?

I think it's worth that cost to avoid the user experience of launching our AMI with the default values and having Scylla fail to set up because there is not enough disk space to configure swap.

We can work on improving the user experience too. Those 30GB may or may not be enough, depending on the instance type anyway, no?

We're spending much much more money on instances that people don't terminate and live for months...

We'll get there too. The fact that we have non-production instances running >1 week should be one of our next items. We'll also reduce container images (SCT's for example). Every minute run counts. Eventually, things add up.

yaronkaikov commented 9 months ago

TBH I think this change isn't worth the pain and bad UX for fresh installs. I suggest we don't do this for now.

It is thousands of dollars on EBS costs, btw. If not now - when?

It's thousands of dollars if you don't delete the snapshots, but AFAIU we do now, and don't keep them for more than 180 days. So these extra 22GB cost us $1.76 per month per AMI ($0.08 * 22GB). Assuming we have around 1 AMI per day on average, it shouldn't cost thousands of dollars, probably a few hundred.

Just 1 AMI per day would be great - but it's per region, no?

Actually, we are creating many more AMIs: during December we created 173 AMIs in us-east-1 alone, and in other testing regions such as eu-west- we have 111 AMIs.

Some of them are for debug and some are official releases/master.
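
Plugging the December counts above into the per-AMI estimate quoted earlier in this comment ($0.08 per GB-month on the extra 22GB) gives a rough monthly figure. This is only a sketch: it assumes every one of those AMIs is retained for a full month and billed on the full 22GB, and the variable and function names are illustrative.

```python
PRICE_PER_GB_MONTH = 0.08  # USD, figure quoted earlier in the thread
EXTRA_GB = 30 - 8          # snapshot space removed by this change

# AMI counts for December as stated above; "eu-west-" is kept as written.
amis_created = {"us-east-1": 173, "eu-west-": 111}


def monthly_extra_cost(ami_count):
    """Monthly cost of the extra snapshot space for ami_count retained AMIs."""
    return ami_count * EXTRA_GB * PRICE_PER_GB_MONTH


total = sum(monthly_extra_cost(count) for count in amis_created.values())

if __name__ == "__main__":
    for region, count in amis_created.items():
        print(f"{region}: ${monthly_extra_cost(count):.2f}/month")
    print(f"total: ${total:.2f}/month")
```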

I think it's worth that cost to avoid the user experience of launching our AMI with the default values and having Scylla fail to set up because there is not enough disk space to configure swap.

We can work on improving the user experience too. Those 30GB may or may not be enough, depending on the instance type anyway, no?

We're spending much much more money on instances that people don't terminate and live for months...

We'll get there too. The fact that we have non-production instances running >1 week should be one of our next items. We'll also reduce container images (SCT's for example). Every minute run counts. Eventually, things add up.

roydahan commented 8 months ago

Reducing the AMI size saves only $500 over the course of 6 months, calculating 2 AMIs for master and 2 AMIs for enterprise. It's less than 0.4% of our costs over a 6-month period.

There are other places where we can save on snapshots; we are handling them in this task: https://github.com/scylladb/scylla-pkg/issues/3712