burmilla / os

Tiny Linux distro that runs the entire OS as Docker containers
https://burmillaos.org
Apache License 2.0

v2.0.0-rc1 issue tracker #161

Closed olljanat closed 6 months ago

olljanat commented 1 year ago

Report all new issues seen with v2.0.0-rc1 here.

olljanat commented 1 year ago

Not a new bug, but https://github.com/burmilla/os/issues/158#issuecomment-1689382337 should be fixed.

Also, because AppArmor is now enabled by default and https://github.com/moby/swarmkit/pull/3152 was recently merged, it probably makes sense to wait a bit to see if we can ship a user Docker version with that feature.

toussii commented 1 year ago

For me 2.0 RC1 is not working: it boots up and says it has network, but then nothing happens (no terminal). I'm not sure if that is because I have some ping and network mounts in the config. Booting to beta7 works again. Also, logging in with ssh gives access denied, so it looks like the config is not completely loaded.

Thx @olljanat. Forcing the console upgrade did work :) So maybe also update the changelog to say that this is also needed for beta users.

olljanat commented 1 year ago

@toussii I haven't actually tested upgrading from beta, but I would assume it needs the same trick to force a console upgrade that the release notes mention for upgrading from 1.9.x.

AFAIU, it is the system-docker rename and the changes related to it that caused the need for this extra step.

olljanat commented 1 year ago

I noticed that ssh-audit reports that our SSH hardenings https://github.com/burmilla/os/blob/615b3d4f7c4710580ba689e515082baa132548ea/images/02-console/sshd_config.append.tpl#L16-L23 are not good enough by today's standards.

Scan log: burmillaos-v2.0.0-rc1-ssh-audit.log

Ubuntu guide: https://www.ssh-audit.com/hardening_guides.html#ubuntu_20_04_lts

EDIT: Additionally, it would be good to include the packages needed for MFA support in the console https://ubuntu.com/tutorials/configure-ssh-2fa#1-overview

EDIT2: Perhaps we should also look at https://github.com/a13xp0p0v/kernel-hardening-checker

olljanat commented 12 months ago

For some reason the console now contains partx but not fdisk. That should be fixed for backward compatibility.

tpimh commented 10 months ago

I still was not able to boot v2.0.0-rc1 on a UEFI-enabled VM under Proxmox. My host installation is a bit outdated, so I will report again after updating it.

olljanat commented 10 months ago

Unfortunately, direct UEFI support (tracker issue #8) didn't make it into the 2.0 versions because of the huge amount of refactoring and testing it needs and the lack of contributors. However, you can still build your own Proxmox image with UEFI support using the workaround scripts linked in that issue, and use that image to deploy multiple servers.

afidegnum commented 10 months ago

Mine is a VM in a Proxmox environment. Can I upgrade from within the VM without issues?

dwaite commented 10 months ago

Issue 1: the upgrade instructions should probably clarify that the last command needs to be run later (after reboot?)

dwaite commented 10 months ago

Issue 2: after restarting, running the final command, and restarting again, userland Docker is still failing:

root@caesium6:/home/rancher# system-docker logs 238e47103e76
time="2023-08-20T03:09:27Z" level=info msg="Starting Docker in context: console"
time="2023-08-20T03:09:27Z" level=error msg="failed to LoadFromNetwork: Not found. HTTP status code: 404"
time="2023-08-20T03:09:27Z" level=error msg="Not found. HTTP status code: 404"
time="2023-08-20T03:09:27Z" level=error msg="Failed to load rancher.docker.engine=(docker-20.10.23): Not found. HTTP status code: 404"
time="2023-08-20T03:09:27Z" level=info msg="Getting PID for service: console"
time="2023-08-20T03:09:27Z" level=info msg="console PID 999"
time="2023-08-20T03:09:28Z" level=info msg="[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=caesium6.alkaline.solutions HOME=/]"
time="2023-08-20T03:09:28Z" level=info msg="Running [system-docker-runc exec -- f6202c659c8d534e5259f70fb3a0460209459c1279b40dcd7aadb72702d3c32c env PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=x.example.com HOME=/ ros docker-init --log-opt max-file=2 --log-opt max-size=25m --group docker --host unix:///var/run/docker.sock --oom-score-adjust -250 --tlsverify --tlscacert=/etc/docker/tls/ca.pem --tlscert=/etc/docker/tls/server-cert.pem --tlskey=/etc/docker/tls/server-key.pem -H=0.0.0.0:2376]"
time="2023-08-20T03:09:28Z" level=info msg="Found /usr/bin/dockerd"
container_linux.go:262: starting container process caused "process_linux.go:339: container init caused \"rootfs_linux.go:57: mounting \\\"/usr/bin/system-docker-runc\\\" to rootfs \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged\\\" at \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged/usr/bin/system-docker-runc\\\" caused \\\"not a directory\\\"\""
container_linux.go:262: starting container process caused "process_linux.go:339: container init caused \"rootfs_linux.go:57: mounting \\\"/usr/bin/system-docker-runc\\\" to rootfs \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged\\\" at \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged/usr/bin/system-docker-runc\\\" caused \\\"not a directory\\\"\""
container_linux.go:262: starting container process caused "process_linux.go:339: container init caused \"rootfs_linux.go:57: mounting \\\"/usr/bin/system-docker-runc\\\" to rootfs \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged\\\" at \\\"/var/lib/system-docker/overlay2/28ce1718ba97fa688c9c47c8867607a5e18d1ed3a9b898e454085ace3abc8430/merged/usr/bin/system-docker-runc\\\" caused \\\"not a directory\\\"\""
olljanat commented 10 months ago

@dwaite are you still able to boot to the old version and share the sudo ros config export output?

> Issue 1: the upgrade instructions should probably clarify that the last command needs to be run later (after reboot?)

Thanks, this is exactly the type of issue I want to catch before marking it RTM. However, we need a bit more information about the starting point and the issue you see here.

dwaite commented 10 months ago

> @dwaite are you still able to boot to the old version and share the sudo ros config export output?

Yep, exactly what I did, although oops didn't realize I needed to also force the console again to downgrade.

hostname: xx.example.com
rancher:
  cloud_init:
    datasources:
    - digitalocean
  console: default
  docker:
    engine: docker-20.10.23
    tls: true
  environment:
    EXTRA_CMDLINE: /init
  force_console_rebuild: false
  network:
    dns:
      nameservers:
      - 1.1.1.1
      - 8.8.8.8
    interfaces:
      eth0:
        addresses:
        - 1.2.3.4/20
        - 1.2.3.4/16
        - 2600:0000:0000:0000:0000:0000:0000:0001/64
        gateway: 1.2.3.1
        gateway_ipv6: 2600:a880:0000:0000:0000:0000:0000:0001
        ipv4ll: true
      eth1:
        addresses:
        - 1.2.3.4/16
        gateway: 1.2.3.1
  resize_device: /dev/vda
  services:
    console:
      labels:
      - io.docker.compose.rebuild=always
      - io.rancher.os.after=network
      - io.rancher.os.console=default
      - io.rancher.os.scope=system
  services_include:
    docker-compose: true
  state:
    dev: LABEL=RANCHER_STATE
    wait: true
  upgrade:
    url: https://raw.githubusercontent.com/burmilla/releases/v2.0.x/releases.yml
ssh_authorized_keys:
- ecdsa-sha2-nistp256 AAAA label
- ssh-ed25519 AAAA label

> Issue 1: the upgrade instructions should probably clarify that the last command needs to be run later (after reboot?)

> Thanks, this is exactly the type of issue I want to catch before marking it RTM. However, we need a bit more information about the starting point and the issue you see here.

Ahh, when I see console commands I review them, then paste. After the second line the system processes and then prompts for a reboot, so I assume the third line needs to be run after the reboot.

gramian commented 9 months ago

Using UTM (macOS virtual machines with QEMU), one has to deactivate "UEFI Boot" and use the "virtio-vga" emulated display card; otherwise the 2.0.0-rc1 ISO image will not boot.

dwaite commented 9 months ago

@olljanat anything else I can do to help with the user docker issue on upgrade?

olljanat commented 9 months ago

It is not technically hard to solve; it is more a question of how we want to do it. Check out the vote in https://github.com/burmilla/os/issues/150#issuecomment-1826765404

gramian commented 9 months ago

Also, I noticed the login user name is rancher and not burmilla as noted here in the docs.

olljanat commented 9 months ago

@gramian for clarification: this is the issue tracker for issues that are new in 2.0.0-rc1 compared to 1.9.x versions.

UEFI mode is not supported; that is tracked in #8.

The documentation is based on RancherOS with some search & replace, so there definitely are issues waiting to be fixed. You can find documentation-tagged issues at https://github.com/burmilla/os/issues?q=is%3Aopen+is%3Aissue+label%3Adocumentation and contribute to the documentation at https://github.com/burmilla/burmilla.github.io

gramian commented 9 months ago

@olljanat I am sorry, I did not look closely enough.

olljanat commented 9 months ago

FYI, v2.0.0-rc2 is now released. It does not solve the upgrade challenge yet, but it fixes https://github.com/burmilla/os/issues/161#issuecomment-1702226024 and https://github.com/burmilla/os/issues/161#issuecomment-1722436320

All changes are visible in https://github.com/burmilla/os/commit/8a9e14f887b46673101c18b94e6eb48709dc02a9

tai-ss commented 7 months ago

For os-docker on v2.0.0-rc2 (and probably earlier v2.0.0 branches): the dockerd --graph flag behind rancher.docker.graph was deprecated long ago and removed in engine 23. See https://docs.docker.com/engine/deprecated/#-g-and---graph-flags-on-dockerd. If rancher.docker.graph is defined in cloud-config, os-docker won't start and exits with error 125, reporting unknown option --graph for dockerd. The option should now be named data-root and add the start option --data-root to dockerd. I can get os-docker to start by removing this option from cloud-config, but then my defined volumes, networks, images, and containers no longer appear.

olljanat commented 7 months ago

Yeah, so basically https://github.com/burmilla/os/blob/8a9e14f887b46673101c18b94e6eb48709dc02a9/config/schema.go#L135 and https://github.com/burmilla/os/blob/8a9e14f887b46673101c18b94e6eb48709dc02a9/config/types.go#L174 need to be updated.

Most likely in a way that we keep support for graph but pass its value to data-root, and at the same time add support for a new data_root option that does the same. That way it would be backward compatible for any existing installations. Pull requests to implement that are welcome.

A workaround is to use extra_args and define --data-root with it.
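
The backward-compatible handling proposed above could be sketched roughly as follows. This is only an illustration in Python, not the project's Go code: the function name resolve_data_root_args, the dict-shaped config, and the choice to let data_root win when both keys are set are all assumptions.

```python
from typing import List, Mapping, Optional


def resolve_data_root_args(docker_cfg: Mapping[str, Optional[str]]) -> List[str]:
    """Return the dockerd arguments for the data directory.

    Hypothetical helper: `graph` is the legacy key and `data_root` is the
    proposed new key. Both are translated to --data-root, because newer
    dockerd versions no longer accept --graph.
    """
    # Assumption: if both keys are set, the new key takes precedence.
    path = docker_cfg.get("data_root") or docker_cfg.get("graph")
    if not path:
        # Nothing configured: let dockerd fall back to its own default.
        return []
    return ["--data-root", path]


if __name__ == "__main__":
    # A legacy config that only sets `graph` still yields a valid flag.
    print(resolve_data_root_args({"graph": "/mnt/docker"}))
```

With this shape, an existing cloud-config that still sets graph keeps working, and no --graph flag ever reaches dockerd.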

tai-ss commented 7 months ago

> A workaround is to use extra_args and define --data-root with it.

Right, this is how I worked around this for os-docker. Note, system-docker also needs to be updated. Currently, it's just warning graph is deprecated. Possibly bootstrap-docker is also affected.

tai-ss commented 7 months ago

Also, runcmd appears to be executed twice on boot.

olljanat commented 6 months ago

Last update to this one. I see that there are some new people here, which is nice. However, the v2.0.0 release is very late compared to the original target (#148), so for now we just need to accept that not all bugs can be fixed. I will now handle only those which are easy and skip the others.

Please create one issue for each bug you see in the future, preferably with info on whether it is new in v2.x versions or already exists in v1.9.x.

However, some comments on what was discussed earlier:

> Also, runcmd appears to be executed twice on boot.

I'm quite sure that it is an old issue and not critical, so I will just skip it.

> Note, system-docker also needs to be updated. Currently, it's just warning graph is deprecated.

True, because graph has been deprecated for years, but it does not matter because system-docker is stuck on a customized version of 17.06 (#28). I will update the config parameter for user Docker.

As for the console upgrade bug: renaming system-docker to system-engine, which was part of v2.0.0-rc1, caused more issues than it solved, so I will just revert to the old version.

Otherwise v2.0.0 should be the same as rc2, just with newer packages.