osism / cloud-in-a-box

Cloud in a box
https://osism.github.io/docs/guides/deploy-guide/examples/cloud-in-a-box
Apache License 2.0

CiaB 7.0.1: bootstrap fails on traefik #261

Open. garloff opened this issue 5 months ago

garloff commented 5 months ago
cd /opt/configuration/environments/manager
./run.sh traefik
[...]

TASK [osism.services.traefik : Create traefik external network] ************************************************************************
ok: [manager.systems.in-a-box.cloud]

TASK [osism.services.traefik : Copy docker-compose.yml file] ***************************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible.errors.AnsibleUndefinedVariable: {{ hostvars[inventory_hostname]['ansible_' + internal_interface]['ipv4']['address'] }}: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_vlan100'. 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_vlan100'. {{ hostvars[inventory_hostname]['ansible_' + internal_interface]['ipv4']['address'] }}: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_vlan100'. 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_vlan100'
fatal: [manager.systems.in-a-box.cloud]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: {{ hostvars[inventory_hostname]['ansible_' + internal_interface]['ipv4']['address'] }}: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_vlan100'. 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_vlan100'. {{ hostvars[inventory_hostname]['ansible_' + internal_interface]['ipv4']['address'] }}: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_vlan100'. 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_vlan100'"}

PLAY RECAP *****************************************************************************************************************************
manager.systems.in-a-box.cloud : ok=8    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
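
The error says the gathered facts for this host have no ansible_vlan100 entry, which is what one would expect when no interface named vlan100 exists at fact-gathering time. A minimal sketch for checking this on the host, assuming the usual Linux sysfs layout; the names missing_ifaces and SYSFS_NET are made up for illustration and are not part of OSISM:

```shell
#!/bin/sh
# Sketch: print the names of the given interfaces that do not exist on this host.
# missing_ifaces and SYSFS_NET are illustrative names, not part of OSISM.
# SYSFS_NET defaults to the Linux sysfs directory that holds one entry per interface.
missing_ifaces() {
    for name in "$@"; do
        if [ ! -e "${SYSFS_NET:-/sys/class/net}/${name}" ]; then
            echo "${name}"
        fi
    done
}
```

Calling missing_ifaces vlan100 vlan101 prints every name that is absent; empty output means both links exist.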

garloff commented 5 months ago

Is something wrong with the network (netplan) config? If I first run ip link add link eno1np0 name vlan100 type vlan id 100; netplan apply, the script starts to work.

garloff commented 5 months ago

And ip link add link eno1np0 name vlan101 type vlan id 101 is needed as well. bootstrap.sh && deploy.sh then succeeds.
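
The two manual ip link add calls could be wrapped in a helper that skips links that already exist, so it can be re-run safely. This is only a sketch of the workaround, not a fix for the underlying problem; ensure_vlan, DRY_RUN and SYSFS_NET are hypothetical names, and the vlan<id> naming simply follows the interface names used in this issue:

```shell
#!/bin/sh
# Sketch: create a VLAN sub-interface only if it is not present yet.
# ensure_vlan, DRY_RUN and SYSFS_NET are illustrative names, not part of OSISM.
ensure_vlan() {
    parent="$1"
    id="$2"
    name="vlan${id}"
    if [ -e "${SYSFS_NET:-/sys/class/net}/${name}" ]; then
        echo "${name} already exists, skipping"
        return 0
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show the command that would be executed.
        echo "ip link add link ${parent} name ${name} type vlan id ${id}"
    else
        # Needs root privileges.
        ip link add link "${parent}" name "${name}" type vlan id "${id}"
    fi
}
```

With DRY_RUN=1, ensure_vlan eno1np0 100 only prints the ip command. Without it, the link is created and netplan apply would still be needed afterwards, as in the manual steps.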

garloff commented 5 months ago

And this somehow needs to be persisted, so the system survives a reboot.

garloff commented 5 months ago

I wonder if this is a netplan bug. The settings in 01-osism.yaml look correct to me and according to the docs, I would have expected netplan to bring up these vlan ifaces.

garloff commented 5 months ago

I could not find a way to tell netplan to bring up the VLAN links, so I injected a custom systemd unit into the boot process. This is certainly not the intended way to do things.

dragon@manager:~$ cat /etc/rc.network 
#!/bin/bash
if test "$1" = "start" -o "$1" = "restart"; then
        ip link add link eno1np0 name vlan100 type vlan id 100
        ip link add link eno1np0 name vlan101 type vlan id 101
        netplan apply
fi
dragon@manager:~$ cat /etc/systemd/system/rc-network.service 
# SPDX-License-Identifier: LGPL-2.1-or-later
# Injected to work around missing vlan links in netplan
# (c) Kurt Garloff <garloff@osb-alliance.com>, 4/2024
[Unit]
Description=/etc/rc.network setup
Documentation=man:systemd-rc-local-generator(8)
ConditionFileIsExecutable=/etc/rc.network
After=network-pre.target

[Service]
Type=forking
ExecStart=/etc/rc.network start
TimeoutSec=0
RemainAfterExit=yes
GuessMainPID=no

[Install]
WantedBy=multi-user.target

berendt commented 5 months ago

Please paste /etc/netplan/01-osism.yaml.

garloff commented 5 months ago

Looks correct to me ...

# This file describes the network interfaces available on your system
# For more information, see netplan(5).
---
network:
  version: 2
  renderer: networkd

  bonds:
    {}

  bridges:
    {}

  ethernets:
    eno1np0:
        dhcp4: true

  tunnels:
    {}

  vlans:
    vlan100:
        addresses:
        - 192.168.16.10/24
        id: 100
        link: eno1np0
    vlan101:
        id: 101
        link: eno1np0

  vrfs:
    {}

berendt commented 3 months ago

This was probably fixed with https://github.com/osism/ansible-collection-commons/pull/637.