cbsd / reggae

Reggae - DevOps tool for CBSD

Install errors during master-init step #266

yggdr closed this issue 1 year ago

yggdr commented 1 year ago

I installed reggae (0.2.6) with pkg on FreeBSD 13.1-p5, then followed the README: network-init, pflog restart, pf restart, cbsd-init. The next step, master-init, throws various errors:

retrieve base.txz from download.freebsd.org, size: 186m
/usr/cbsd/tmp/src.99477/base.txz                       186 MB 8023 kBps    23s

Extracting base...
./usr/lib/librt.so.1: Can't unlink already-existing object: Operation not permitted
./usr/bin/chpass: Can't unlink already-existing object: Operation not permitted
./usr/bin/login: Can't unlink already-existing object: Operation not permitted
./usr/bin/opiepasswd: Can't unlink already-existing object: Operation not permitted
./usr/bin/opieinfo: Can't unlink already-existing object: Operation not permitted
./usr/bin/su: Can't unlink already-existing object: Operation not permitted
./usr/bin/passwd: Can't unlink already-existing object: Operation not permitted
./usr/bin/crontab: Can't unlink already-existing object: Operation not permitted
./lib/libthr.so.3: Can't unlink already-existing object: Operation not permitted
./lib/libc.so.7: Can't unlink already-existing object: Operation not permitted
./lib/libcrypt.so.5: Can't unlink already-existing object: Operation not permitted
./libexec/ld-elf.so.1: Can't unlink already-existing object: Operation not permitted
./libexec/ld-elf32.so.1: Can't unlink already-existing object: Operation not permitted
./sbin/init: Can't unlink already-existing object: Operation not permitted
./var/empty/: Can't restore time: Operation not permitted
tar: Error exit delayed from previous errors.
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:13:amd64/quarterly, please wait...
Installing pkg-1.18.4...

[...]

Sat Jan  7 17:50:22 CET 2023
Execute script: ipv6.sh
add net default: gateway fd10:6c79:8ae5:8b91::1
isc-dhcpd6 does not exist in /etc/rc.d or the local startup
directories (/usr/local/etc/rc.d), or is not executable
Execute master script: reggae.sh
 :: /usr/cbsd/jails-system/network/master_poststart.d/reggae.sh
reggae not running? (check /var/run/reggae.pid).
Starting reggae ... done
reggae_pf not running?
/etc/pf.conf.d is not directory
jstart done in 8 seconds
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:13:amd64/latest, please wait...
[network.schukraft.it] Installing pkg-1.19.0...
the most recent version of pkg-1.18.4 is already installed
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:13:amd64/latest, please wait...
[network.schukraft.it] Installing pkg-1.19.0...
the most recent version of pkg-1.18.4 is already installed
/usr/local/share/reggae/scripts/master-init.sh: cannot create /usr/cbsd/jails-data/network-data/usr/local/etc/sudoers.d/reggae: No such file or directory
cp: directory /usr/cbsd/jails-data/network-data/usr/local/bin does not exist
chmod: /usr/cbsd/jails-data/network-data/usr/local/bin/dhcpd-hook.sh: No such file or directory
cp: directory /usr/cbsd/jails-data/network-data/usr/local/bin does not exist
chmod: /usr/cbsd/jails-data/network-data/usr/local/bin/reggae-register.sh: No such file or directory
/usr/local/share/reggae/scripts/master-init.sh: cannot create /usr/cbsd/jails-data/network-data/usr/local/bin/ip-by-mac.sh: No such file or directory
chmod: /usr/cbsd/jails-data/network-data/usr/local/bin/ip-by-mac.sh: No such file or directory
pw: unknown group `nsd'
isc-dhcpd does not exist in /etc/rc.d or the local startup
directories (/usr/local/etc/rc.d), or is not executable
isc-dhcpd6 does not exist in /etc/rc.d or the local startup
directories (/usr/local/etc/rc.d), or is not executable
/bin/sh: nsd-control-setup: not found
nsd does not exist in /etc/rc.d or the local startup
directories (/usr/local/etc/rc.d), or is not executable
/bin/sh: /usr/local/bin/reggae-register.sh: not found
/bin/sh: /usr/local/bin/reggae-register.sh: not found
/bin/sh: /usr/local/bin/reggae-register.sh: not found
/bin/sh: /usr/local/bin/reggae-register.sh: not found

olevole commented 1 year ago

The base is extracted during the 'jcreate' step. Messages like

./usr/lib/librt.so.1: Can't unlink already-existing object: Operation not permitted

tell us that tar tried to remove files that already existed and was not permitted to. This may be due to some earlier error when unpacking the base.

@yggdr can you try to fetch the base by hand, to see why it didn't work?

cbsd repo action=get sources=base

(If the base already exists on the system, try removing it first: cbsd removebase)
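
To narrow this down, it can help to pull the affected paths out of a saved copy of the extraction output. The sketch below assumes the output was saved to a file and that the messages have exactly the form shown above; `failed_unlinks` is an illustrative helper name, not part of CBSD or Reggae.

```shell
#!/bin/sh
# Print the paths that tar reported as "Can't unlink already-existing object".
# Reads the log on stdin and writes one path per line to stdout.
# (Hypothetical helper for illustration only.)
failed_unlinks() {
    sed -n "s/^\(.*\): Can't unlink already-existing object: .*\$/\1/p"
}
```

Usage would be something like `failed_unlinks < master-init.log`, where `master-init.log` is a hypothetical file name. On FreeBSD, `ls -lo` on the same paths inside the base directory shows their file flags. Several of the files listed in the report (for example `libc.so.7` and `/sbin/init`) normally ship with the `schg` flag set, which would be consistent with "Operation not permitted" when tar tries to replace an existing copy.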

yggdr commented 1 year ago

Getting it by hand seems to work:

cbsd repo action=get sources=base
Bases registered: /usr/cbsd/basejail/base_amd64_amd64_13.1
register_base: auto_baseupdate=0 via FreeBSD-bases.conf, updates disabled
register_base: you might want to do cbsd baseupdate by hand to fetch latest patches
Scanning for fastest mirror...
             Mirror source:                bytes/sec:
 * [ 1/3   ] http://ftp.freebsd.org:       4987562
 * [ 2/3   ] https://download.freebsd.org: 872448
 * [ 3/3   ] https://pub.allbsd.org:       0
 Winner: http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/amd64/13.1-RELEASE/base.txz
Looking for official FreeBSD mirror:
retrieve base.txz from ftp.freebsd.org, size: 186m
/usr/cbsd/tmp/src.32787/base.txz                       186 MB 8485 kBps    22s

Extracting base...
retrieve lib32.txz from download.freebsd.org, size: 64m
/usr/cbsd/tmp/src.32787/lib32.txz                       63 MB 7068 kBps    09s

Extracting lib32...
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:13:amd64/quarterly, please wait...
Installing pkg-1.18.4...
package pkg is already installed, forced install
Extracting pkg-1.18.4: 100%
Updating FreeBSD repository catalogue...
Fetching meta.conf: 100%    163 B   0.2kB/s    00:01
Fetching packagesite.pkg: 100%    6 MiB   2.2MB/s    00:03
Processing entries: 100%
FreeBSD repository update completed. 32285 packages processed.
All repositories are up to date.
Bases updated: /usr/cbsd/basejail/base_amd64_amd64_13.1
register_base: auto_baseupdate=0 via FreeBSD-bases.conf, updates disabled
register_base: you might want to do cbsd baseupdate by hand to fetch latest patches
etcupdate: extract to: /usr/cbsd/src/src_13.1/etcupdate/current (by: /usr/local/cbsd/share/etcupdate_13.txt.xz <- /usr/cbsd/basejail/base_amd64_amd64_13.1)
etcupdate: build to: /usr/cbsd/src/src_13.1/etcupdate/etcupdate.tgz (from: /usr/cbsd/src/src_13.1/etcupdate/current)
Done...

I then ran cbsd jdestroy on the existing network jail and ran master-init again, but the script failed with the same errors as above (everything after the [...] separator).

olevole commented 1 year ago

@mekanix By the way, how about reworking the jail creation here to use a CBSDfile?

We can also check for the existence of the base before 'jcreate' and fetch it in a separate step before creating the jail.

Something like this: https://github.com/cbsd/reggae/pull/267/files (I'm not sure whether Reggae can also use Linux containers, so I check for the presence of a bin/ directory rather than /bin/sh, since some Linux environments don't have /bin/sh as a shell, e.g. Alpine. For this case CBSD checks several files: https://github.com/cbsd/cbsd/blob/v13.1.20/sudoexec/jlogin#L108-L114 )
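
The idea of checking for the base before 'jcreate' could be sketched roughly as follows. This is not the content of the pull request, only an illustration: `base_present` and `ensure_base` are hypothetical names, and the only test applied is the presence of a bin/ directory, as described above.

```shell
#!/bin/sh
# Return 0 if the directory looks like an unpacked base, judged only by the
# presence of a bin/ subdirectory. (Hypothetical helper.)
base_present() {
    [ -n "$1" ] && [ -d "$1/bin" ]
}

# Fetch the base in its own step when it is missing, so that a download or
# extraction failure shows up before any jail is created.
ensure_base() {
    if base_present "$1"; then
        echo "base found in $1"
    else
        echo "base missing in $1, fetching"
        cbsd repo action=get sources=base
    fi
}
```

With the base path from the log above, a call might look like `ensure_base /usr/cbsd/basejail/base_amd64_amd64_13.1`, run before the jail is created.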

yggdr commented 1 year ago

So is master-init broken right now? Because the system itself isn't far from a fresh vanilla install.

olevole commented 1 year ago

I'm not sure the master-init script is broken, at least not in the latest version (0.2.7).

I wrote a small test for a CBSD/bhyve-based test lab: https://github.com/olevole/reggae-init-test and it succeeds (cbsd up).

When the 'network' container was created, there was some error while unpacking the FreeBSD base (base.txz), but without a way to reproduce it, it is difficult to say what went wrong. Maybe there was a second session and the base was being unpacked in parallel by another process?

yggdr commented 1 year ago

There was nothing else running while I tried the install and setup, so I don't think some other process could have interfered. I tried to master-init again with xtrace like you did in your script, giving me this:

 env 'NOINTER=1' reggae master-init
Please wait: this will take a while...
Applying skel dir template from: /usr/cbsd/share/FreeBSD-jail-reggae-skel

To edit VM properties use: cbsd jconfig jname=network
To start VM use: cbsd jstart network
To stop VM use: cbsd jstop network
To remove VM use: cbsd jremove network
For attach VM console use: cbsd jlogin network

Creating network complete: Enjoy!
jcreate done in 7 seconds
mkdir: /var/run/reggae: File exists
b_order: 0
create epair: epair1:igb0
Execute master script: register.sh
 :: /usr/cbsd/jails-system/network/master_prestart.d/register.sh
Default NIC automatically selected: cbsd0
set resource limit: [ ]
jail renice: 1
Starting jail: network, parallel timeout=5
network: created
eth0
late_start in progress...
ELF ldconfig path: /lib /usr/lib /usr/lib/compat
32-bit compatibility ldconfig path: /usr/lib32
Starting Network: lo0 eth0.
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
eth0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 00:a0:98:83:e6:af
        hwaddr 02:61:cd:cc:72:0b
        inet 172.16.0.253 netmask 0xffffff00 broadcast 172.16.0.255
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
add host 127.0.0.1: gateway lo0 fib 0: route already in table
add net default: gateway 172.16.0.254
add host ::1: gateway lo0 fib 0: route already in table
add net fe80::: gateway ::1
add net ff02::: gateway ::1
add net ::ffff:0.0.0.0: gateway ::1
add net ::0.0.0.0: gateway ::1
Clearing /tmp (X related).
Creating and/or trimming log files.
Updating motd:.
Updating /var/run/os-release done.
Starting syslogd.
Generating RSA host key.
3072 SHA256:i9ge9WBb3HLQ99Zx5IoHKoOTyElkTVhAPed+8G9eZqI root@network.schukraft.it (RSA)
Generating ECDSA host key.
256 SHA256:yVC9GKv92Hq8Wd2DlXSGbNeeO6CG7YFO+OREj0trqCs root@network.schukraft.it (ECDSA)
Generating ED25519 host key.
256 SHA256:rqm7e8rhvG/hwHoPrUc2kBbHAFzK0azsPjCb0RPOL/A root@network.schukraft.it (ED25519)
Performing sanity check on sshd configuration.
Starting sshd.
Starting cron.

Sat Jan  7 21:08:55 CET 2023
Execute script: ipv6.sh
add net default: gateway fd10:6c79:8ae5:8b91::1
isc-dhcpd6 does not exist in /etc/rc.d or the local startup
directories (/usr/local/etc/rc.d), or is not executable
Execute master script: reggae.sh
 :: /usr/cbsd/jails-system/network/master_poststart.d/reggae.sh
Stopping reggae.
Waiting for PIDS: 58893.
Starting reggae ... done
reggae_pf not running?
/etc/pf.conf.d is not directory
jstart done in 7 seconds
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:13:amd64/latest, please wait...
[network.schukraft.it] Installing pkg-1.19.0...
the most recent version of pkg-1.18.4 is already installed
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:13:amd64/latest, please wait...
[network.schukraft.it] Installing pkg-1.19.0...
the most recent version of pkg-1.18.4 is already installed
/usr/local/share/reggae/scripts/master-init.sh: cannot create /usr/cbsd/jails-data/network-data/usr/local/etc/sudoers.d/reggae: No such file or directory
cp: directory /usr/cbsd/jails-data/network-data/usr/local/bin does not exist
chmod: /usr/cbsd/jails-data/network-data/usr/local/bin/dhcpd-hook.sh: No such file or directory
cp: directory /usr/cbsd/jails-data/network-data/usr/local/bin does not exist
chmod: /usr/cbsd/jails-data/network-data/usr/local/bin/reggae-register.sh: No such file or directory
/usr/local/share/reggae/scripts/master-init.sh: cannot create /usr/cbsd/jails-data/network-data/usr/local/bin/ip-by-mac.sh: No such file or directory
chmod: /usr/cbsd/jails-data/network-data/usr/local/bin/ip-by-mac.sh: No such file or directory
pw: unknown group `nsd'
isc-dhcpd does not exist in /etc/rc.d or the local startup
directories (/usr/local/etc/rc.d), or is not executable
isc-dhcpd6 does not exist in /etc/rc.d or the local startup
directories (/usr/local/etc/rc.d), or is not executable
/bin/sh: nsd-control-setup: not found
nsd does not exist in /etc/rc.d or the local startup
directories (/usr/local/etc/rc.d), or is not executable
/bin/sh: /usr/local/bin/reggae-register.sh: not found
/bin/sh: /usr/local/bin/reggae-register.sh: not found
/bin/sh: /usr/local/bin/reggae-register.sh: not found
/bin/sh: /usr/local/bin/reggae-register.sh: not found
+ set +o xtrace

Edit: It seems version 0.2.6 from FreeBSD's package repo is the culprit. I installed 0.2.7 from the repo, and that runs through without major problems (at least the network jail part; I didn't make it download and unpack the base again). Now there seems to be some network connectivity problem in the test jail I set up.

mekanix commented 1 year ago

As I'm about to set up a new desktop in the coming days, I will go through the initialization process at least once more and check it from the ground up.

olevole commented 1 year ago

> As I'm about to set up a new desktop in the coming days, I will go through the initialization process at least once more and check it from the ground up.

Check out https://github.com/olevole/reggae-init-test

This might be a good start for a CI/regression test: a single CBSDfile for multiple tests (ZFS/UFS).

yggdr commented 1 year ago

I'm probably going to set up the server fresh in the next few days, because there continues to be some issue with jail networking, and I want to make sure it's not some version mismatch or the like. Do you usually install cbsd and reggae directly from the git repo instead of from ports/packages?

mekanix commented 1 year ago

I try to keep everything "clean" by using packages, but sometimes there's a bug fix that cannot wait for a package update.

yggdr commented 1 year ago

Update: Having just reinstalled the server, I got cbsd and reggae to install. The only strange thing I caught was the error that /etc/pf.conf.d does not exist, and it also does not get created; is that directory used by reggae? So the errors above might have been due to some strangeness on the old system, even though I wouldn't know what that might have been. But maybe this still helps harden the setup scripts somewhat?

Strangely, when now testing a new service (mkdir regtest; cd regtest; reggae init; make) the jail gets created (and its networking set up properly), but make stops when bootstrapping pkg:

jail regtest already exist, use autorestart=1 for updating via jset
Jail regtest already running, jid: 5
[regtest.amazeroth.schukraft.it] Installing pkg-1.19.0...
the most recent version of pkg-1.19.0 is already installed
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:13:amd64/latest, please wait...
[regtest.amazeroth.schukraft.it] Installing pkg-1.19.0...
the most recent version of pkg-1.19.0 is already installed
Bootstrapping pkg from pkg+http://pkg.FreeBSD.org/FreeBSD:13:amd64/latest, please wait...
*** Error code 1

Stop.
make: stopped in /root/regtest

When logging in to the jail, pkg is not bootstrapped.
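
A quick way to confirm this state from the host is `pkg -N`, which exits with status 0 only when pkg is actually installed and initialised. The sketch below wraps that check; `pkg_status` is a hypothetical helper, and the command prefix used to reach the jail is an assumption that depends on the setup.

```shell
#!/bin/sh
# Report whether pkg is bootstrapped, based on the exit status of "pkg -N".
# Arguments: a command prefix that runs its arguments inside the jail
# (for example "jexec <jail>"); with no prefix the host itself is checked.
# (Hypothetical helper for illustration only.)
pkg_status() {
    if "$@" pkg -N >/dev/null 2>&1; then
        echo "bootstrapped"
    else
        echo "not bootstrapped"
    fi
}
```

For the jail above this might be called as `pkg_status jexec regtest`, assuming the jail can be addressed by that name; the jid printed by make can be used instead.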

mekanix commented 1 year ago

There is something fishy in the Reggae *-init commands and I'm working on it with @olevole. The pf.conf.d directory is not necessary. I will document how it all works once we have fixed initialization.

yggdr commented 1 year ago

Thanks. Any way I can help with that?

mekanix commented 1 year ago

Not yet, but once we have a patch the testing will be crucial. I'll ping you once we're ready for testing.

mekanix commented 1 year ago

One of the errors was that dhcpcd failed to acquire an address, so I switched the default to dhclient/rtsol. The initialization process has changed, so I need to write docs to cover it.

mekanix commented 1 year ago

I always get ifconfig: ioctl SIOCSIFNAME (set name): File exists and I'm not sure why. I suspect it's related to renaming the epair to eth0. @olevole I experimented a bit with renaming interfaces and found that I can rename an interface inside the jail, so no locking around it is needed. Do you think CBSD can adopt the same strategy?

mekanix commented 1 year ago

The error was caused by one of my experiments leaving eth0 dangling on the host, so CBSD works as expected. The question about renaming the interface inside the jail remains, though, as situations like this could be improved by it.
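
A leftover interface like this can be spotted before starting a jail by looking at the host's interface list. The following is a minimal sketch, assuming the space-separated output of `ifconfig -l`; `iface_in_list` is a hypothetical helper, not something CBSD or Reggae provides.

```shell
#!/bin/sh
# Return 0 if interface name $1 appears as a whole word in the
# space-separated list $2 (the format printed by "ifconfig -l").
# (Hypothetical helper for illustration only.)
iface_in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the host, `iface_in_list eth0 "$(ifconfig -l)"` succeeding would mean an eth0 already exists there, so renaming another epair to eth0 on the host would be expected to fail with "File exists". If the leftover is a stale cloned interface, `ifconfig eth0 destroy` should remove it.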

The 0.3 branch changed the initialization commands, so once 0.3.7 is in ports, please refer to the reggae(8) man page for how to initialize it. As it was decided that cbsd.io will no longer be maintained, the info there is wrong.

olevole commented 1 year ago

Interface renaming in FreeBSD is not well implemented: inside the kernel the names stay the same, and this is not obvious to the user. I filed a PR/bug demonstrating this: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235920