Closed travisghansen closed 4 years ago
Notes I have (still evolving):
support full clone? how to allow full manipulation of all the opts?
sockets?
instructions for use?
strategy param?
proper value for onboot?
does proxmox need to be so sticky to a specific node? can we tell it to just deploy to the 'cluster' and let it go from there?
support updating the network card/bridge with clone mode
use cloud-init drive with rancher-os (for ssh key injection instead of hacky ssh commands)?
remove ssh password logic and associated code
protection flag on VMs
flag for citype
@lnxbil I consider this complete and ready for review. It should support all the previous cdrom use-case along with many many improvements across the board on top of the obvious cloud-init support.
I've run many tests locally on my development machine and also created several clusters from rancher including nodes backed by Ubuntu, CentOS, and Rancher OS. In its current state it appears to be robust enough to handle failure scenarios and other generally bad stuff (like a very overtaxed proxmox taking many hours to install etc).
Open to review/suggestions at this point.
Wow, thank you. Didn't have time to build and review the changes yet, but hopefully on the weekend.
So I had a better look at the code and also built a test environment. After reverting back to eth0 as the NIC naming scheme I got a little bit further, but not quite running. It stopped at:
&{[-F /dev/null -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none -o LogLevel=quiet -o PasswordAuthentication=no -o ServerAliveInterval=60 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null root@192.168.1.136 -o IdentitiesOnly=yes -i /Users/andreas/.docker/machine/machines/docker-clone/id_rsa -p 22] /usr/bin/ssh <nil>}
So, there is still something missing. Can you describe which arguments you tried?
@lnxbil what are you using for the template image? My guess is that it's failing due to root username there but can't be sure (and yes, eth0 is a requirement across the board currently...I considered adding an option to pick the nic name).
The root was my guess, because before that, there was no username and @<IP> (without username) is a syntax error and it halted there.
My system is a plain old buster + cloudinit + docker I set up for this test.
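The failure mode described here (an ssh target of the form @<IP> with no username) can be caught before ssh is ever invoked. The following is a minimal sketch of such a guard, not the driver's actual code; the function name ssh_target is hypothetical and the IP is the one from the log line above.

```shell
# Sketch only: build a "user@host" ssh target and refuse an empty username.
# ssh_target is a hypothetical helper, not part of the proxmoxve driver.
ssh_target() {
  # $1 = ssh username, $2 = host or IP
  if [ -z "$1" ]; then
    echo "ssh username is empty; a bare @host target is not valid" >&2
    return 1
  fi
  printf '%s@%s\n' "$1" "$2"
}

# Prints the target for a template whose cloud-init user is "debian".
ssh_target debian 192.168.1.136
```

With a cloud image the username has to match whichever account cloud-init injects the key into, which is why the template image matters here.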
I normally use the various 'cloud images' from the different vendors but haven't done so for debian. Did you build it manually? Which user gets the ssh key injected by cloud-init?
I'll build a viable image with buster cloud image real quick and send over all the details..
OK, worked like a charm first try for me. Here's my script to create the image/template (alter to your liking, it simply has the baseline stuff I use for k8s scenarios):
#!/bin/bash
set -x
set -e
export IMGID=9007
export BASE_IMG="debian-10-openstack-amd64.qcow2"
export IMG="debian-10-openstack-amd64-${IMGID}.qcow2"
export STORAGEID="bitness-nfs"
if [ ! -f "${BASE_IMG}" ]; then
    wget https://cloud.debian.org/images/cloud/OpenStack/current-10/debian-10-openstack-amd64.qcow2
fi
if [ ! -f "${IMG}" ]; then
    cp -f "${BASE_IMG}" "${IMG}"
fi
# prepare mounts
guestmount -a ${IMG} -m /dev/sda1 /mnt/tmp/
mount --bind /dev/ /mnt/tmp/dev/
mount --bind /proc/ /mnt/tmp/proc/
# get resolving working
mv /mnt/tmp/etc/resolv.conf /mnt/tmp/etc/resolv.conf.orig
cp -a --force /etc/resolv.conf /mnt/tmp/etc/resolv.conf
# install desired apps
chroot /mnt/tmp /bin/bash -c "apt-get update"
chroot /mnt/tmp /bin/bash -c "DEBIAN_FRONTEND=noninteractive apt-get install -y net-tools curl qemu-guest-agent nfs-common open-iscsi lsscsi sg3-utils multipath-tools scsitools"
# https://www.electrictoolbox.com/sshd-hostname-lookups/
sed -i 's:#UseDNS no:UseDNS no:' /mnt/tmp/etc/ssh/sshd_config
sed -i '/package-update-upgrade-install/d' /mnt/tmp/etc/cloud/cloud.cfg
cat > /mnt/tmp/etc/cloud/cloud.cfg.d/99_custom.cfg << '__EOF__'
#cloud-config
# Install additional packages on first boot
#
# Default: none
#
# if packages are specified, this apt_update will be set to true
#
# packages may be supplied as a single package name or as a list
# with the format [<package>, <version>] wherein the specifc
# package version will be installed.
#packages:
# - qemu-guest-agent
# - nfs-common
ntp:
  enabled: true
# datasource_list: [ NoCloud, ConfigDrive ]
__EOF__
cat > /mnt/tmp/etc/multipath.conf << '__EOF__'
defaults {
    user_friendly_names yes
    find_multipaths yes
}
__EOF__
# enable services
chroot /mnt/tmp systemctl enable open-iscsi.service || true
chroot /mnt/tmp systemctl enable multipath-tools.service || true
# restore systemd-resolved settings
mv /mnt/tmp/etc/resolv.conf.orig /mnt/tmp/etc/resolv.conf
# umount everything
umount /mnt/tmp/dev
umount /mnt/tmp/proc
umount /mnt/tmp
# create template
qm create ${IMGID} --memory 512 --net0 virtio,bridge=vmbr0
qm importdisk ${IMGID} ${IMG} ${STORAGEID} --format qcow2
qm set ${IMGID} --scsihw virtio-scsi-pci --scsi0 ${STORAGEID}:${IMGID}/vm-${IMGID}-disk-0.qcow2
qm set ${IMGID} --ide2 ${STORAGEID}:cloudinit
qm set ${IMGID} --boot c --bootdisk scsi0
qm set ${IMGID} --serial0 socket --vga serial0
qm template ${IMGID}
# set host cpu, ssh key, etc
qm set ${IMGID} --scsihw virtio-scsi-pci
qm set ${IMGID} --cpu host
qm set ${IMGID} --agent enabled=1
qm set ${IMGID} --autostart
qm set ${IMGID} --onboot 1
qm set ${IMGID} --ostype l26
qm set ${IMGID} --ipconfig0 "ip=dhcp"
After that, I launched the machine with:
docker-machine --debug create --driver proxmoxve --engine-install-url https://get.docker.com --proxmoxve-provision-strategy clone --proxmoxve-proxmox-host 172.29.2.1 --proxmoxve-proxmox-node cloud01 --proxmoxve-proxmox-user-name root --proxmoxve-proxmox-user-password password --proxmoxve-proxmox-realm pam --proxmoxve-vm-storage-size 20 --proxmoxve-vm-cpu-sockets 2 --proxmoxve-vm-cpu-cores 2 --proxmoxve-vm-memory 8 --proxmoxve-vm-storage-path '' --proxmoxve-vm-image-file bitness-nfs:iso/rancheros-proxmoxve-autoformat-v1.5.6.iso --proxmoxve-vm-clone-vmid 9007 --proxmoxve-vm-clone-full 2 --proxmoxve-vm-start-onboot 1 --proxmoxve-vm-protection 0 --proxmoxve-vm-citype nocloud --proxmoxve-ssh-username debian --proxmoxve-ssh-password '' --proxmoxve-debug-resty --proxmoxve-debug-driver docker-rancher
Some of the args above are irrelevant for clone and/or optional generally...but that should get you going..
Thank you, that is also great as an example for the README.md.
In the end I discovered what my problem was: I forgot to add the cloudinit drive to the VM I was cloning. Yet without providing the ssh-username, it still was not able to run, so I threw away my container and went with your script. After changing nfs to ZFS, it worked out of the box and very, very fast.
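A missing cloudinit drive on the template is easy to overlook, so a quick check before cloning can help. The sketch below assumes the drive appears in the output of qm config as a disk line whose volume name contains "cloudinit" (for example ide2: bitness-nfs:9007/vm-9007-cloudinit.qcow2,media=cdrom); the function name has_cloudinit_drive is hypothetical.

```shell
# Sketch only: succeed if the VM config text on stdin references a cloud-init drive.
# Assumption: `qm config <vmid>` prints one "key: value" line per setting and the
# cloud-init drive is an ide/sata/scsi/virtio entry with "cloudinit" in its volume name.
has_cloudinit_drive() {
  grep -Eq '^(ide|sata|scsi|virtio)[0-9]+:.*cloudinit'
}

# On a Proxmox node one might run (not executed here):
#   qm config 9007 | has_cloudinit_drive || echo "template 9007 has no cloudinit drive" >&2
```

The function reads from stdin rather than calling qm itself, so it can be tried against saved config output as well.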
Nice! I have similar scripts for all the major distros and even the rancher os image as well. I do need to clean them up a bit and could probably organize them a little better but they work. I’ll commit them all to a github repo and if we want we can point to that here or even just copy them and include if desired.
@travisghansen - I wondered if you ever got the chance to commit the scripts for other distros? I'm mainly interested in an Ubuntu one.
@benosman there are a few bits that are specific to my env, but it should be pretty easy to alter to your needs.
I use cloudinit to make templates off minimal cloud images: https://gist.github.com/nayrnet/066e3963397de02594d4963a9258e22f
Unfortunately right now with proxmox it's either/or: using a yaml snippet or the proxmox UI/API with cloudinit. I've asked them to implement vendordata, which would allow both to be used together, but it still works for creating templates; then remove the snippet to switch it to proxmox cloudinit. https://bugzilla.proxmox.com/show_bug.cgi?id=2429#c5
Thanks @travisghansen and @nayrnet, I will try those out tomorrow.
@nayrnet: I like your suggestion of using the vendordata, I hope they take it up. Proxmox's cloudinit does seem quite limited compared to other hypervisors and cloud platforms.
This adds support for using clone+cloud-init images.
Basic requirements of the template are:
Currently only 2 changed flags are needed to make it work:
--proxmoxve-provision-strategy clone
--proxmoxve-vm-clone-vmid <vmid>
Full clones tested. My storage setup supports shallow cloning so I enjoy the space saving etc.
There are some other minor updates/fixes/additions as well.