Closed olljanat closed 6 months ago
Not a new bug, but https://github.com/burmilla/os/issues/158#issuecomment-1689382337 should be fixed.
And because AppArmor is now enabled by default and https://github.com/moby/swarmkit/pull/3152 was recently merged, it probably makes sense to wait a bit to see if we can have a user Docker version with that feature.
For me this 2.0 RC1 is not working: it boots up and says it has network, but then nothing happens (no terminal). Not sure if it is because I have some ping and network mounts in the config. Booting to beta7 works again. Also, logging in with ssh gives access denied, so it looks like the config is not completely loaded.
Thx @olljanat. Forcing the console upgrade did work :) So maybe also update the changelog to say that this is also needed for beta users.
@toussii I haven't actually tested the upgrade from beta, but I would assume it needs the same trick to force the console upgrade that the release notes mention for upgrading from 1.9.x.
Afaiu it is the system-docker rename and the changes related to it that caused the need for this extra step.
I noticed that ssh-audit reports that our SSH hardenings https://github.com/burmilla/os/blob/615b3d4f7c4710580ba689e515082baa132548ea/images/02-console/sshd_config.append.tpl#L16-L23 are not good enough by today's standards.
Scan log: burmillaos-v2.0.0-rc1-ssh-audit.log
Ubuntu guide: https://www.ssh-audit.com/hardening_guides.html#ubuntu_20_04_lts
EDIT: Additionally, it would be good to include the packages needed for MFA support in the console https://ubuntu.com/tutorials/configure-ssh-2fa#1-overview
EDIT2: Perhaps we should also look at https://github.com/a13xp0p0v/kernel-hardening-checker
For some reason the console now contains partx but not fdisk. That should be fixed to be backward compatible.
I still was not able to boot v2.0.0-rc1 on UEFI-enabled VM under Proxmox. My host installation is a bit outdated, so I will report again after updating it.
Unfortunately, direct UEFI support (tracker issue #8) didn't make it into the 2.0 versions because of the huge amount of refactoring and testing it needs and the lack of contributors. However, you can still build your own Proxmox image with UEFI support using the workaround scripts linked to that issue and use that to deploy multiple servers.
Mine is a VM in a Proxmox environment. Can I upgrade from within the VM with no issue?
Issue 1, the instructions for upgrade should probably clarify the last command needs to be run later (after reboot?)
Issue 2, after restarting, running the final command, and restarting again, userland Docker is still failing:
root@caesium6:/home/rancher# system-docker logs 238e47103e76
time="2023-08-20T03:09:27Z" level=info msg="Starting Docker in context: console"
time="2023-08-20T03:09:27Z" level=error msg="failed to LoadFromNetwork: Not found. HTTP status code: 404"
time="2023-08-20T03:09:27Z" level=error msg="Not found. HTTP status code: 404"
time="2023-08-20T03:09:27Z" level=error msg="Failed to load rancher.docker.engine=(docker-20.10.23): Not found. HTTP status code: 404"
time="2023-08-20T03:09:27Z" level=info msg="Getting PID for service: console"
time="2023-08-20T03:09:27Z" level=info msg="console PID 999"
time="2023-08-20T03:09:28Z" level=info msg="[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=caesium6.alkaline.solutions HOME=/]"
time="2023-08-20T03:09:28Z" level=info msg="Running [system-docker-runc exec -- f6202c659c8d534e5259f70fb3a0460209459c1279b40dcd7aadb72702d3c32c env PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=x.example.com HOME=/ ros docker-init --log-opt max-file=2 --log-opt max-size=25m --group docker --host unix:///var/run/docker.sock --oom-score-adjust -250 --tlsverify --tlscacert=/etc/docker/tls/ca.pem --tlscert=/etc/docker/tls/server-cert.pem --tlskey=/etc/docker/tls/server-key.pem -H=0.0.0.0:2376]"
time="2023-08-20T03:09:28Z" level=info msg="Found /usr/bin/dockerd"
container_linux.go:262: starting container process caused "process_linux.go:339: container init caused \"rootfs_linux.go:57: mounting \\\"/usr/bin/system-docker-runc\\\" to rootfs \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged\\\" at \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged/usr/bin/system-docker-runc\\\" caused \\\"not a directory\\\"\""
container_linux.go:262: starting container process caused "process_linux.go:339: container init caused \"rootfs_linux.go:57: mounting \\\"/usr/bin/system-docker-runc\\\" to rootfs \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged\\\" at \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged/usr/bin/system-docker-runc\\\" caused \\\"not a directory\\\"\""
container_linux.go:262: starting container process caused "process_linux.go:339: container init caused \"rootfs_linux.go:57: mounting \\\"/usr/bin/system-docker-runc\\\" to rootfs \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged\\\" at \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged/usr/bin/system-docker-runc\\\" caused \\\"not a directory\\\"\""
@dwaite are you still able to boot to the old version and share the sudo ros config export output?
Issue 1, the instructions for upgrade should probably clarify the last command needs to be run later (after reboot?)
Thanks, this is exactly the type of issue I want to catch before marking it RTM. However, we need a bit more information about the starting point and the issue you are seeing.
@dwaite are you still able to boot to the old version and share the sudo ros config export output?
Yep, exactly what I did, although oops didn't realize I needed to also force the console again to downgrade.
hostname: xx.example.com
rancher:
  cloud_init:
    datasources:
    - digitalocean
  console: default
  docker:
    engine: docker-20.10.23
    tls: true
  environment:
    EXTRA_CMDLINE: /init
  force_console_rebuild: false
  network:
    dns:
      nameservers:
      - 1.1.1.1
      - 8.8.8.8
    interfaces:
      eth0:
        addresses:
        - 1.2.3.4/20
        - 1.2.3.4/16
        - 2600:0000:0000:0000:0000:0000:0000:0001/64
        gateway: 1.2.3.1
        gateway_ipv6: 2600:a880:0000:0000:0000:0000:0000:0001
        ipv4ll: true
      eth1:
        addresses:
        - 1.2.3.4/16
        gateway: 1.2.3.1
  resize_device: /dev/vda
  services:
    console:
      labels:
      - io.docker.compose.rebuild=always
      - io.rancher.os.after=network
      - io.rancher.os.console=default
      - io.rancher.os.scope=system
  services_include:
    docker-compose: true
  state:
    dev: LABEL=RANCHER_STATE
    wait: true
  upgrade:
    url: https://raw.githubusercontent.com/burmilla/releases/v2.0.x/releases.yml
ssh_authorized_keys:
- ecdsa-sha2-nistp256 AAAA label
- ssh-ed25519 AAAA label
Issue 1, the instructions for upgrade should probably clarify the last command needs to be run later (after reboot?)
Thanks, this is exactly the type of issue I want to catch before marking it RTM. However, we need a bit more information about the starting point and the issue you are seeing.
Ahh, when I see console commands I review them, then paste. After the second line the system processes it and then prompts for a reboot, so I assume the third line needs to be run after the reboot.
Using UTM (macOS virtual machines with qemu) one has to deactivate "UEFI Boot" and use the "virtio-vga" emulated display card, otherwise the 2.0.0-rc1 ISO image will not boot.
@olljanat anything else I can do to help with the user docker issue on upgrade?
It is not technically hard to solve; it is more a question of how we want to do it. Check out the voting in https://github.com/burmilla/os/issues/150#issuecomment-1826765404
Also, I noticed the login user name is rancher and not burmilla as noted in the docs.
@gramian for clarification: this is the issue tracker for new issues in 2.0.0-rc1 compared to 1.9.x versions.
UEFI mode is not supported; it is tracked in #8.
The documentation is based on RancherOS plus some search&replace, so there definitely are issues waiting to be fixed. You can find documentation-tagged issues at https://github.com/burmilla/os/issues?q=is%3Aopen+is%3Aissue+label%3Adocumentation and contribute to the documentation at https://github.com/burmilla/burmilla.github.io
@olljanat I am sorry, I did not look close enough.
FYI, v2.0.0-rc2 is now released. It does not solve the upgrade challenge yet, but it fixes https://github.com/burmilla/os/issues/161#issuecomment-1702226024 and https://github.com/burmilla/os/issues/161#issuecomment-1722436320
All changes are visible in https://github.com/burmilla/os/commit/8a9e14f887b46673101c18b94e6eb48709dc02a9
For os-docker on v2.0.0-rc2 (and probably earlier v2.0.0 branches), rancher.docker.graph no longer works: the underlying --graph flag was deprecated long ago and removed in engine 23. See https://docs.docker.com/engine/deprecated/#-g-and---graph-flags-on-dockerd. If rancher.docker.graph is defined in cloud-config, os-docker won't start and exits with error 125, with dockerd reporting unknown option --graph. The option should now be named data-root and pass the --data-root start option to dockerd. I can get os-docker to start by removing this option from cloud-config, but then my defined volumes, networks, images, and containers no longer appear.
Yeah, so basically https://github.com/burmilla/os/blob/8a9e14f887b46673101c18b94e6eb48709dc02a9/config/schema.go#L135 and https://github.com/burmilla/os/blob/8a9e14f887b46673101c18b94e6eb48709dc02a9/config/types.go#L174 need to be updated.
Most likely in a way that we keep support for graph and just pass that value to data-root, and at the same time add support for the new data_root, which does the same. That way it would be backward compatible for any existing installations.
Pull requests to implement that are welcome.
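The backward-compatible handling described above could be sketched roughly as follows. This is only an illustration in Python, not the actual Go change in config/schema.go or config/types.go; the function name data_root_args is hypothetical, and letting data_root win when both keys are set is an assumption that the thread does not decide.

```python
from typing import List, Optional


def data_root_args(graph: Optional[str] = None,
                   data_root: Optional[str] = None) -> List[str]:
    """Map the legacy `graph` key and the new `data_root` key to dockerd arguments.

    Hypothetical helper for illustration only.
    """
    # Assumption: the new key takes precedence; the legacy key is a fallback.
    value = data_root or graph
    if not value:
        # Neither key is set, so dockerd uses its default data directory.
        return []
    # Always emit --data-root, never --graph, which dockerd 23+ rejects.
    return ["--data-root", value]


if __name__ == "__main__":
    # An existing installation that only defines the legacy key.
    print(data_root_args(graph="/mnt/docker"))
```

With a mapping like this, an existing cloud-config that still sets graph would keep pointing dockerd at the same directory, so previously created volumes, images, and containers should remain visible.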
A workaround for this is to use extra_args and define --data-root with it.
A workaround for this is to use extra_args and define --data-root with it.
Right, this is how I worked around this for os-docker. Note, system-docker also needs to be updated. Currently, it's just warning that graph is deprecated. Possibly bootstrap-docker is also affected.
Also, runcmd appears to be executed twice on boot.
Last update to this one. I see that there are some new people here, which is nice. However, the v2.0.0 release is very late compared to the original target (#148), so for now we just need to accept that not all bugs can be fixed. I will now handle only those which are easy and skip the others.
Please create one issue for each bug you see in the future, preferably with info on whether it is a new bug in v2.x versions or already exists in v1.9.x.
However some comments to what was discussed earlier:
Also, runcmd appears to be executed twice on boot.
I'm quite sure that it is an old issue and not critical, so I will just skip it.
Note, system-docker also needs to be updated. Currently, it's just warning graph is deprecated.
True, because graph has been deprecated for years, but it does not matter because system-docker is stuck on a customized version of 17.06 (#28). I will update the config parameter for user Docker.
As for the console upgrade bug: renaming system-docker to system-engine, which was part of v2.0.0-rc1, caused more issues than it solved, so I will just go back to the old version.
Otherwise v2.0.0 should be the same as rc2, just with newer packages.
Report all new issues seen with the v2.0.0-rc1 version here.