Closed jmattsson closed 4 years ago
TLDR: this is a serious issue, and a fixed AMI release will be available within a few hours. I will post more details here after it's fixed.
AMD support is now built in. The AMI with the fixed kernel 5.3.12 is Devuan Runit 2019-11-23 (Unofficial). I tested it on a t3a.micro (AMD EPYC 7571), and it boots fine.
Until recently, everything on AWS was based on Intel. That allowed me to optimize the kernel for the Intel platform, with the risk that if Amazon decided to offer AMD-based instance types, the kernel wouldn't run on them. And that's exactly what happened. I am sorry I let you run into a wall here. The fixed kernel is generic and runs on both Intel and AMD. Note that some more exotic CPUs are still disabled: Hygon, Centaur, Zhaoxin.
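For anyone who wants to check this on their own instance, here is a minimal sketch that compares the CPU vendor with the vendor support compiled into a kernel. It assumes the kernel config is available as a plain text file (for example /boot/config-$(uname -r), which may not exist on every image) and that vendor support is controlled by the mainline CONFIG_CPU_SUP_* options; whether this AMI's kernel was tuned through exactly those options is an assumption. The function names are made up for illustration. Zhaoxin is left out of the mapping because its vendor string contains spaces.

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical helpers.

# cpu_vendor [FILE]: print the first vendor_id from a cpuinfo-style file.
cpu_vendor () {
    awk -F': *' '/^vendor_id/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# vendor_option VENDOR_ID: map a CPUID vendor string to the mainline
# kernel config option that enables support for that vendor.
vendor_option () {
    case "$1" in
        GenuineIntel) echo CONFIG_CPU_SUP_INTEL ;;
        AuthenticAMD) echo CONFIG_CPU_SUP_AMD ;;
        HygonGenuine) echo CONFIG_CPU_SUP_HYGON ;;
        CentaurHauls) echo CONFIG_CPU_SUP_CENTAUR ;;
        *) return 1 ;;
    esac
}

# kernel_supports_cpu CONFIG_FILE [CPUINFO_FILE]
# Returns 0 if the option for this CPU's vendor is set to "y",
# 1 if it is not, and 2 if the vendor is unknown to this sketch.
kernel_supports_cpu () {
    opt=$(vendor_option "$(cpu_vendor "${2:-/proc/cpuinfo}")") || return 2
    grep -q "^${opt}=y" "$1"
}
```

Possible usage: `kernel_supports_cpu "/boot/config-$(uname -r)" || echo "kernel may lack support for this CPU vendor"`.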
The main goal of this project is to offer a stable base OS on EC2. I expect it to work well on all instance types. Stability and compatibility are more important than speed. Thank you for reporting this issue!
Thank you for maintaining this AMI - it's kind of my go-to at this point. Nice and minimal without much cruft to be removed :)
Did you try doing something a bit more CPU intensive? I was able to boot fine with the Intel-only kernel, but once I started e.g. compiling ZFS everything locked up after a little while.
Should I mention that you can even get ARM instances on EC2 these days? They're the a1 family. And no, I haven't tried them out.
Nice to hear you find my distro useful. I made it this way - minimal, no bloat, and yet with all the usual tools preinstalled. Kind of minimal "batteries included" general purpose OS.
I am doing something CPU intensive regularly - I compile the kernel on an EC2 instance. That being said, I never compiled ZFS, but I experienced lockups before - in my case this happened when PHP exhausted all RAM. This is not an issue specific to my distro, it is a Linux "feature". There are several ways to address this:
1) Use an instance type with more RAM.
2) Add some swap space. You can enable swapfile autorun to create swap on instance start.
3) Check out OOMD, it is preinstalled. NOTE: this is a last-resort solution. It might kill some processes, but at least your instance will survive.
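Before picking one of these options, it can help to see how much memory and swap the instance actually has. The following is a minimal sketch that reads the standard /proc/meminfo fields; the function names are hypothetical and not part of the distro's tooling.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# meminfo_kb FIELD [FILE]: print the value in kB of one meminfo field,
# e.g. MemTotal, MemAvailable or SwapTotal.
meminfo_kb () {
    awk -v key="$1:" '$1 == key { print $2; exit }' "${2:-/proc/meminfo}"
}

# has_swap [FILE]: succeed if SwapTotal is greater than zero.
has_swap () {
    total=$(meminfo_kb SwapTotal "${1:-/proc/meminfo}")
    [ "${total:-0}" -gt 0 ]
}
```

Possible usage: `has_swap || echo "no swap configured, MemAvailable is $(meminfo_kb MemAvailable) kB"`.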
So far, I only maintain the kernel for the x86 platform, and have no intention to support other platforms like ARM.
Somewhat tangentially, do you have the actual AMI build scripts available somewhere? I'd love to see how you've automated the whole thing (as I may need to do something similar).
You got me. I am very open about what I do here, but this setup is pretty much the only thing that I haven't published. The reason: the AMI build is done by scripts that are very specific to my release and would be useless to anybody else in this form. That being said, there are a few quirks that I had to solve, so I will probably publish some interesting code snippets later. I am working on a new website for that purpose. Big parts of what is required are already public in my repos.
The process itself is very simple. One shell script that uses awscli and a few jq tricks to build a new updated AMI from the latest published image, in a rolling release manner. The whole build is done automatically within AWS. No other special tools for AMI building like VirtualBox or Packer are used. The idea is:
- sin kernel, with enabled CLEANUP and SHARE in /etc/default/kernel-update
- apt-get full-upgrade
- sin install sin ec2-tools hiawatha nettools oomd runit-init
- sin pull kernel from the instance that compiled it (see PULL_FROM in kernel)
- reboot and run ec2-instance-bootstrap.sh
- shutdown
Sample awscli + jq usage:
    # FUNCTION: info about INSTANCE
    get_instance_info () {
        INSTANCE_INFO=$(aws ec2 describe-instances --profile default --instance-ids "$1" \
            --query 'Reservations[*].Instances[*].{State:State.Name,PublicIP:PublicIpAddress}' 2>/dev/null)
        instance_state=$(printf '%s' "$INSTANCE_INFO" | jq -r '.[][].State')
        instance_publicip=$(printf '%s' "$INSTANCE_INFO" | jq -r '.[][].PublicIP')
    }

    # FUNCTION: info about AMI
    get_image_state () {
        image_state=$(aws ec2 describe-images --profile default --image-ids "$1" 2>/dev/null |
            jq -r '.Images[0].State')
    }
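A build script of this kind typically has to wait until a freshly created AMI becomes usable. The following is a minimal sketch of such a polling loop, assuming the get_image_state function from the sample above is already defined and sets image_state. The name wait_for_image, the retry count, and the delay are hypothetical and are not taken from the actual build script. The state names are the ones EC2 reports for images.

```shell
#!/bin/sh
# Sketch only: wait_for_image is a hypothetical helper. It relies on
# get_image_state (defined in the sample above) setting $image_state.

# wait_for_image AMI_ID [MAX_TRIES] [DELAY_SECONDS]
# Returns 0 once the image is "available", 1 if EC2 reports a terminal
# failure state, and 2 if the image is still not ready after MAX_TRIES.
wait_for_image () {
    _tries=${2:-40}
    _delay=${3:-15}
    while [ "$_tries" -gt 0 ]; do
        get_image_state "$1"
        case "$image_state" in
            available) return 0 ;;
            failed|invalid|error|deregistered) return 1 ;;
        esac
        _tries=$((_tries - 1))
        sleep "$_delay"
    done
    return 2
}
```

awscli also ships a built-in waiter, `aws ec2 wait image-available --image-ids ...`, which may be simpler when no custom handling of the intermediate states is needed.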
Fantastic, thank you for sharing!
Hello,
I tried to use this AMI in a t3a instance, but discovered that was Not A Good Idea(tm). I quickly encountered hard lockups, and the system log showed me the kernel had "oopsed", and also that it doesn't seem to have AMD support built-in:
May I suggest either adding in AMD support, or mentioning in the readme that AMD is a no-go? Thanks!