Closed yaronkaikov closed 1 year ago
I'm approving this now, but I think we should replace the shell scripting techniques (awk, grep, head, ...) with Python code. I will send a patch for it later.
Or Ansible...
Do you need a reboot for this to take place?
The next boot will happen when someone uses this image :)
I don't think we need to restart as part of creating the image.
Yes, but it happens during image creation anyway, so once you use the image you will get the right kernel.
Verified with https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/ami/52/
[yaronkaikov@london]~/git/scylla-pkg/ansible (debug-new-servers)$ ssh scyllaadm@54.242.43.210
The authenticity of host '54.242.43.210 (54.242.43.210)' can't be established.
ED25519 key fingerprint is SHA256:H10isIs0kxplEiTQmBsG2cY8I1WM5aGuEkqfb3klnCk.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '54.242.43.210' (ED25519) to the list of known hosts.
Welcome to Ubuntu 22.04.2 LTS (GNU/Linux 5.15.0-1034-aws x86_64)
@fruch @syuu1228 Verification passed. Any other comments?
Q: what will happen in the next apt-get update? Will we get the 5.19 kernel? Don't we need to do something like 'sudo apt remove linux-generic-hwe*' or something? (probably wrong package here!)
@mykaul Looks like it's not needed
scyllaadm@ip-10-99-17-60:~$ sudo apt-get full-upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
scyllaadm@ip-10-99-17-60:~$ sudo apt-get update
Hit:1 http://us-east-1.ec2.archive.ubuntu.com/ubuntu jammy InRelease
Get:2 http://us-east-1.ec2.archive.ubuntu.com/ubuntu jammy-updates InRelease [119 kB]
Get:3 http://us-east-1.ec2.archive.ubuntu.com/ubuntu jammy-backports InRelease [108 kB]
Get:4 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]
Hit:5 https://downloads.scylladb.com/unstable/scylla/master/deb/unified/2023-04-15T03:03:29Z/scylladb-master stable InRelease
Get:6 http://us-east-1.ec2.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages [1030 kB]
Get:7 http://us-east-1.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages [816 kB]
Get:8 http://us-east-1.ec2.archive.ubuntu.com/ubuntu jammy-updates/universe amd64 Packages [902 kB]
Get:9 http://us-east-1.ec2.archive.ubuntu.com/ubuntu jammy-updates/multiverse amd64 Packages [24.1 kB]
Fetched 3109 kB in 1s (4417 kB/s)
Reading package lists... Done
Doing an upgrade and a full-upgrade will not update the kernel.
@yaronkaikov I realized that when a newer aws-lts-22.04 kernel is released, `apt-get update && apt-get upgrade` does not update the saved entry. So the saved entry works as pinning to a specific kernel version: users won't get a newer LTS kernel when the instance is rebooted. Is this what we want, or do we just want to keep using the latest LTS kernel (and allow users to pick up a newer LTS kernel when the instance is rebooted)?
If we want the latter, I think the solution is not kernel version pinning via grub; instead we need to drop the `linux-aws`, `linux-headers-aws` and `linux-image-aws` metapackages (and also drop the non-LTS kernel if it is already installed).
Something like this:
ubuntu@ip-10-0-1-199:~$ sudo apt purge linux-aws linux-headers-aws linux-image-aws
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be REMOVED:
linux-aws* linux-headers-aws* linux-image-aws*
0 upgraded, 0 newly installed, 3 to remove and 0 not upgraded.
After this operation, 36.9 kB disk space will be freed.
Do you want to continue? [Y/n] y
...
ubuntu@ip-10-0-1-199:~$ sudo apt update
...
ubuntu@ip-10-0-1-199:~$ sudo apt install linux-aws-lts-22.04
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
linux-aws-headers-5.15.0-1034 linux-headers-5.15.0-1034-aws
linux-headers-aws-lts-22.04 linux-image-5.15.0-1034-aws
linux-image-aws-lts-22.04 linux-modules-5.15.0-1034-aws
Suggested packages:
fdutils linux-aws-doc-5.15.0 | linux-aws-source-5.15.0 linux-aws-tools
The following NEW packages will be installed:
linux-aws-headers-5.15.0-1034 linux-aws-lts-22.04
linux-headers-5.15.0-1034-aws linux-headers-aws-lts-22.04
linux-image-5.15.0-1034-aws linux-image-aws-lts-22.04
linux-modules-5.15.0-1034-aws
0 upgraded, 7 newly installed, 0 to remove and 86 not upgraded.
Need to get 49.6 MB of archives.
After this operation, 239 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
ubuntu@ip-10-0-1-199:~$ sudo apt upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
apparmor apport apt base-files binutils binutils-common
binutils-x86-64-linux-gnu ca-certificates cloud-init curl distro-info-data
fwupd-signed grub-common grub-efi-amd64-bin grub-efi-amd64-signed grub-pc
grub-pc-bin grub2-common initramfs-tools initramfs-tools-bin
initramfs-tools-core intel-microcode isc-dhcp-client isc-dhcp-common
libapparmor1 libapt-pkg6.0 libbinutils libbpf0 libctf-nobfd0 libctf0
libcurl4 libgnutls30 libgssapi-krb5-2 libk5crypto3 libkrb5-3 libkrb5support0
libksba8 libldap-2.5-0 libldap-common libnetplan0 libnss-systemd libnss3
libpam-modules libpam-modules-bin libpam-runtime libpam-systemd libpam0g
libpython3.10-minimal libpython3.10-stdlib libsasl2-2 libsasl2-modules
libsasl2-modules-db libssl3 libsystemd0 libudev1 motd-news-config netplan.io
openssh-client openssh-server openssh-sftp-server openssl python-apt-common
python3-apport python3-apt python3-distupgrade python3-pkg-resources
python3-problem-report python3-setuptools python3-tz python3-update-manager
python3.10 python3.10-minimal shim-signed snapd sudo systemd
systemd-hwe-hwdb systemd-sysv tar tzdata ubuntu-advantage-tools
ubuntu-release-upgrader-core udev update-manager-core update-notifier-common
xxd
86 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
37 standard LTS security updates
Need to get 69.1 MB of archives.
After this operation, 13.0 MB of additional disk space will be used.
Do you want to continue? [Y/n]
After the metapackage changes, apt-get upgrade no longer offers a non-LTS kernel (5.19.x).
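The manual session above could be scripted roughly as follows. This is only a sketch, not the code in this PR: `DRY_RUN`, `run` and `switch_to_lts_kernel` are hypothetical names made up for the example, and it uses `apt-get` with `-y` instead of interactive `apt`. It does not remove an already-installed non-LTS kernel image, which would still need handling.

```shell
#!/bin/sh
# Sketch: swap the rolling AWS kernel metapackages for the LTS one.
# DRY_RUN is a hypothetical knob for this example: with DRY_RUN=1 (the
# default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

switch_to_lts_kernel() {
    # Package names are the ones used in the session above.
    run sudo apt-get purge -y linux-aws linux-headers-aws linux-image-aws
    run sudo apt-get update
    run sudo apt-get install -y linux-aws-lts-22.04
}
```

Calling `switch_to_lts_kernel` as-is only prints the three commands; setting `DRY_RUN=0` first would execute them.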
I think @syuu1228 has explained above exactly what I was asking previously - and the solution - to use the LTS packages.
Verification: AMI: https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/ami/65/
GCP: https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/gce-image/25/
Azure: https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/azure-image/131/
Next-machine-image: https://jenkins.scylladb.com/job/scylla-master/job/releng-testing/job/next-machine-image/88/
Azure failed on artifact, but that is not related to this PR (it has been failing for a while), and I verified in the logs that the kernel is the LTS one.
Is that OK?
2023-04-29T17:05:51+00:00 longevity-lwt-3h-2023-1-db-node-8f772d9e-3 !NOTICE | kernel: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-1035-aws root=PARTUUID=fe426c9e-119b-4a8b-a44c-9da082a00899 ro console=tty1 console=ttyS0 nvme_core.io_timeout=4294967295 net.ifnames=0 clocksource=tsc tsc=reliable panic=-1
2023-04-29T17:05:51+00:00 longevity-lwt-3h-2023-1-db-node-8f772d9e-3 !NOTICE | kernel: Unknown kernel command line parameters "BOOT_IMAGE=/boot/vmlinuz-5.15.0-1035-aws", will be passed to user space.
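A check like the one done by eye on the log lines above could be automated along these lines. This is a sketch under assumptions: `kernel_release_from_log` and `is_lts_release` are hypothetical helper names, and "LTS" is taken to mean the 5.15 series that `linux-aws-lts-22.04` installed earlier in this thread.

```shell
#!/bin/sh
# Reads log lines on stdin and prints the kernel release embedded in the
# first "Kernel command line: BOOT_IMAGE=.../vmlinuz-<release>" entry.
kernel_release_from_log() {
    sed -n 's/.*Kernel command line: BOOT_IMAGE=[^ ]*vmlinuz-\([^ ]*\).*/\1/p' | head -n 1
}

# Succeeds if the release belongs to the 5.15 series (assumed here to be
# the 22.04 LTS kernel), fails otherwise, e.g. for 5.19.x.
is_lts_release() {
    case "$1" in
        5.15.*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

For example, piping a node's boot log through `kernel_release_from_log` and passing the result to `is_lts_release` gives a pass/fail exit status that a test job could use.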
During image creation we run `apt-get full-upgrade`, which also updates the kernel (added as part of https://github.com/scylladb/scylla-machine-image/commit/90340275b80a3a54dcfc1e5ec660481ba167d1c3).
Since we want to use the LTS kernel only, this change adds removal of the non-LTS kernel packages and installation of the LTS kernel before we run `scylla_install_image`.
Currently only AWS and Azure have an LTS kernel for 22.04; once GCP has one as well, we should add it there too.
Ref: https://github.com/scylladb/scylladb/issues/13560
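The per-cloud choice described above might look something like this in a build script. It is a sketch, not the code in this PR: `lts_kernel_package` is a hypothetical helper, and only the AWS package name is confirmed in this thread. The Azure name is an assumption formed by analogy with the AWS one.

```shell
#!/bin/sh
# Prints the LTS kernel metapackage for the given target cloud ($1).
# Returns 1 and prints nothing when no LTS kernel is available, which
# per the description is currently the case for GCP.
lts_kernel_package() {
    case "$1" in
        aws)   echo "linux-aws-lts-22.04" ;;
        azure) echo "linux-azure-lts-22.04" ;;  # assumed name, by analogy with AWS
        *)     return 1 ;;
    esac
}
```

A caller could skip the kernel swap entirely when the function returns non-zero, so adding GCP later would be a one-line change.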