TritonDataCenter / triton

Triton DataCenter: a cloud management platform with first class support for containers.
https://www.tritondatacenter.com/
Mozilla Public License 2.0

TRITON-2307 Add support for AMD to KVM CoaL script #316

Closed siepkes closed 1 year ago

siepkes commented 2 years ago

Also replaced the use of SDC with Triton.

While this fixes the script for use with AMD, I currently can't boot the latest CoaL image on Fedora 35 with KVM. After the bootloader displays and "Booting..." appears, the machine immediately resets. A SmartOS ISO boots just fine in the same VM. This is something I'm looking into.

EDIT: Booting the normal (non-CoaL) previous Triton version (usb-release-20220310-20220311T184919Z-g8b1795b-8gb) image works with KVM.

EDIT2: Booting the latest non-CoaL Triton version also works (usb-release-20220505-20220505T205435Z-gf5ac1e4-8gb).
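
The diff itself is not reproduced in this thread, so the following is only a rough sketch of the kind of check that AMD support on a Linux KVM host usually involves, not the script's actual code. It assumes the host exposes CPU flags in `/proc/cpuinfo` format, where `svm` indicates AMD-V and `vmx` indicates Intel VT-x. The function name `kvm_module_for` is hypothetical.

```python
def kvm_module_for(cpuinfo_text):
    """Return the KVM kernel module matching the CPU flags, or None.

    Hypothetical helper for illustration; it is not taken from the CoaL script.
    """
    flags = set()
    for line in cpuinfo_text.splitlines():
        # Only the plain "flags : ..." lines are parsed here; other keys
        # (for example "vmx flags") do not start with "flags" and are skipped.
        if line.startswith("flags"):
            flags.update(line.partition(":")[2].split())
    if "svm" in flags:
        return "kvm_amd"    # AMD-V
    if "vmx" in flags:
        return "kvm_intel"  # Intel VT-x
    return None

# Usage on a Linux host (assumes /proc/cpuinfo is readable):
#   with open("/proc/cpuinfo") as f:
#       print(kvm_module_for(f.read()))
```

A result of `None` would suggest that hardware virtualization is unavailable or disabled in firmware, in which case KVM acceleration cannot be used at all.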

danmcd commented 2 years ago

Please detail your tests here. We'll also file a TRITON bug (and mark it public) to track it as well.

danmcd commented 2 years ago

https://smartos.org/bugview/TRITON-2307

siepkes commented 2 years ago

> Please detail your tests here. We'll also file a TRITON bug (and mark it public) to track it as well.

I encountered the issue with coal-release-20220505-20220505T205435Z-gf5ac1e4-8gb.vmwarevm on KVM. I can't find the URLs to download older CoaL releases to test. If you could point me to an older release, I could give it a test run. Basically, as soon as you see the blue "Booting..." message (after the bootloader and the three loading /os/202205.... lines), the VM just resets. If you let it run, it just goes on and on in a loop. I also tried fiddling a bit with SATA or VirtIO controllers, different virtual chipsets, etc., but that didn't make a difference.

The "normal" images such as usb-release-20220505-20220505T205435Z-gf5ac1e4-8gb.img boot correctly. However, Triton couldn't complete the setup with that image because the VM was apparently too slow to finish in a timely manner (see screenshot below). I don't really have a frame of reference here, in the sense that I don't know how long these steps take on VMware. I used VirtIO for both the pool disk and USB devices. The pool was a fully pre-allocated raw file. The host is an AMD Ryzen 7 3700X 8-Core with 32GB of memory in total; the VM was assigned 4 cores and 8GB of memory. So I wouldn't expect it to be slow. But that's probably yet another issue to look at ;-)

Is the CoaL image just a release with VMware files such as the VMDK, etc.? Or is it actually a different build with some changes in it?

[Screenshot: triton_kvm_fail_2]

bahamat commented 2 years ago

@siepkes The link to download coal is in each release announcement.

https://smartdatacenter.topicbox.com/search:subject%3Arelease

I’d really like to figure out what’s going on and do what we can to make installation more dependable before approving.

siepkes commented 2 years ago

@bahamat Maybe I'm missing something, but don't the release notes always link to the latest version rather than a specific version? For example, coal-latest.tgz (which is a moving target).

bahamat commented 2 years ago

@siepkes Ah, you're right. Sorry. You can use this:

https://us-east.manta.joyent.com/Joyent_Dev/public/manta-browse/browse.html

siepkes commented 2 years ago

Sorry for the delay. I updated my desktop from Fedora 35 to Fedora 36 (clean install). With Fedora 36 the CoaL images work: they no longer reset immediately after "Booting...". I don't know whether the previous breakage was caused by an older KVM version or by a somehow botched installation. For reference, these are the relevant versions I'm currently on:

libvirt-daemon-kvm-8.1.0-2.fc36.x86_64
qemu-kvm-6.2.0-12.fc36.x86_64
qemu-kvm-core-6.2.0-12.fc36.x86_64

I was able to successfully install the 20220630 CoaL image in KVM.

The 20220310 and 20220505 CoaL images both failed at various stages of the post-setup phase. I've included screenshots below for reference.

20220310 image:

[Screenshot: 202203_coal_install_fail]

20220505 image (with the VM cranked up to 16GB of RAM and 6 vCPUs; SMT is disabled on the host):

[Screenshot: 202205_coal_install_fail]