checkpoint-restore / criu

Alpine test/build fails #2313

Closed by avagin 5 months ago

avagin commented 7 months ago
#10 7.842   CC       criu/pie/restorer.o
#10 7.926 criu/pie/restorer.c: In function 'restore_rlims':
#10 7.926 criu/pie/restorer.c:453:43: error: invalid use of undefined type 'struct rlimit64'
#10 7.926   453 |                 krlim.rlim_cur = ta->rlims[r].rlim_cur;
#10 7.926       |                                           ^
#10 7.926 criu/pie/restorer.c:453:46: error: invalid use of undefined type 'struct rlimit64'
#10 7.926   453 |                 krlim.rlim_cur = ta->rlims[r].rlim_cur;
#10 7.926       |                                              ^
#10 7.927 criu/pie/restorer.c:454:43: error: invalid use of undefined type 'struct rlimit64'
#10 7.927   454 |                 krlim.rlim_max = ta->rlims[r].rlim_max;
#10 7.927       |                                           ^
#10 7.927 criu/pie/restorer.c:454:46: error: invalid use of undefined type 'struct rlimit64'
#10 7.927   454 |                 krlim.rlim_max = ta->rlims[r].rlim_max;
#10 7.927       |                                              ^
#10 7.969 At top level:
#10 7.969 cc1: note: unrecognized command-line option '-Wno-unknown-warning-option' may have been intended to silence earlier diagnostics
#10 7.970 make[2]: *** [/criu/scripts/nmk/scripts/build.mk:118: criu/pie/restorer.o] Error 1
#10 7.970 make[2]: *** Waiting for unfinished jobs....
#10 8.040 make[1]: *** [criu/Makefile:59: pie] Error 2
#10 8.041 make: *** [Makefile:276: criu] Error 2
#10 ERROR: process "/bin/sh -c make mrproper && date && make -j $(nproc) CC=\"$CC\" && date" did not complete successfully: exit code: 2

https://github.com/checkpoint-restore/criu/actions/runs/7171767723/job/19527463844

avagin commented 7 months ago

The build failure was probably introduced by this musl commit: https://github.com/guidosarducci/musl/commit/25e6fee27f4a293728dd15b659170e7b9c7db9bc

Cc: @richfelker

avagin commented 6 months ago
2023-12-28T07:09:22.4352258Z =================== Run zdtm/static/netns_lock_iptables in h ===================
2023-12-28T07:09:22.4353045Z Start test
2023-12-28T07:09:22.4353418Z Test is SUID
2023-12-28T07:09:22.4354358Z ./netns_lock_iptables --pidfile=netns_lock_iptables.pid --outfile=netns_lock_iptables.out
2023-12-28T07:09:22.4355385Z Timeout when trying to connect to server
2023-12-28T07:09:22.4356261Z Running zdtm/static/netns_lock_iptables.hook(--post-start)
2023-12-28T07:09:22.4357249Z Running zdtm/static/netns_lock_iptables.hook(--pre-dump)
2023-12-28T07:09:22.4357950Z Run criu dump
2023-12-28T07:09:22.4358513Z =[log]=> dump/zdtm/static/netns_lock_iptables/56/1/dump.log
2023-12-28T07:09:22.4359682Z ------------------------ grep Error ------------------------
2023-12-28T07:09:22.4360513Z b'(00.008185) net: \tRunning ip -6 route save'
2023-12-28T07:09:22.4361278Z b'(00.008796) net: \tRunning ip rule save'
2023-12-28T07:09:22.4362305Z b'(00.009924) iptables has nft backend: iptables-save v1.8.10 (nf_tables)'
2023-12-28T07:09:22.4363163Z b''
2023-12-28T07:09:22.4364163Z b'Error (criu/util.c:627): execvp("iptables-legacy-save", ...) failed: No such file or directory'
2023-12-28T07:09:22.4365506Z b'(00.010220) Error (criu/util.c:1653): iptables-legacy-save -V failed'
2023-12-28T07:09:22.4366876Z b'(00.010797) iptables has nft backend: ip6tables-save v1.8.10 (nf_tables)'
2023-12-28T07:09:22.4367766Z b''
2023-12-28T07:09:22.4368764Z b'Error (criu/util.c:627): execvp("ip6tables-legacy-save", ...) failed: No such file or directory'
2023-12-28T07:09:22.4370207Z b'(00.011104) Error (criu/util.c:1653): ip6tables-legacy-save -V failed'
2023-12-28T07:09:22.4371250Z ------------------------ ERROR OVER ------------------------
2023-12-28T07:09:22.4372191Z Running zdtm/static/netns_lock_iptables.hook(--pre-restore)
2023-12-28T07:09:22.4372910Z Run criu restore
2023-12-28T07:09:22.4373574Z =[log]=> dump/zdtm/static/netns_lock_iptables/56/1/restore.log
2023-12-28T07:09:22.4374526Z ------------------------ grep Error ------------------------
2023-12-28T07:09:22.4375470Z b'(00.005490)     56: net: \tRunning ip rule delete table local'
2023-12-28T07:09:22.4376470Z b'(00.006112)     56: net: \tRunning ip rule restore'
2023-12-28T07:09:22.4377609Z b'(00.007341)     56: iptables has nft backend: iptables-restore v1.8.10 (nf_tables)'
2023-12-28T07:09:22.4378506Z b''
2023-12-28T07:09:22.4379494Z b'Error (criu/util.c:627): execvp("iptables-legacy-restore", ...) failed: No such file or directory'
2023-12-28T07:09:22.4381039Z b'(00.007639)     56: Error (criu/util.c:1653): iptables-legacy-restore -V failed'
2023-12-28T07:09:22.4382588Z b'(00.008148)     56: iptables has nft backend: ip6tables-restore v1.8.10 (nf_tables)'
2023-12-28T07:09:22.4383543Z b''
2023-12-28T07:09:22.4384560Z b'Error (criu/util.c:627): execvp("ip6tables-legacy-restore", ...) failed: No such file or directory'
2023-12-28T07:09:22.4386074Z b'(00.008435)     56: Error (criu/util.c:1653): ip6tables-legacy-restore -V failed'
2023-12-28T07:09:22.4387062Z b'(00.011015) net: Unlock network'
2023-12-28T07:09:22.4387808Z b'(00.011017) Running network-unlock scripts'
2023-12-28T07:09:22.4388615Z b'iptables-restore v1.8.10 (nf_tables):'
2023-12-28T07:09:22.4389472Z b'line 5: CHAIN_DEL failed (Resource busy): chain CRIU'
2023-12-28T07:09:22.4390437Z b'(00.031422) Error (criu/util.c:642): exited, status=4'
2023-12-28T07:09:22.4391284Z b'ip6tables-restore v1.8.10 (nf_tables):'
2023-12-28T07:09:22.4392124Z b'line 5: CHAIN_DEL failed (Resource busy): chain CRIU'
2023-12-28T07:09:22.4393047Z b'(00.051362) Error (criu/util.c:642): exited, status=4'
2023-12-28T07:09:22.4393931Z ------------------------ ERROR OVER ------------------------
2023-12-28T07:09:22.4394866Z Running zdtm/static/netns_lock_iptables.hook(--post-restore)
2023-12-28T07:09:22.4396041Z ####### Test zdtm/static/netns_lock_iptables FAIL at hook --post-restore #######

https://github.com/checkpoint-restore/criu/actions/runs/7345320063/job/19998316100

rst0git commented 6 months ago

The following error causes this test to fail in iptables_network_unlock_internal() during restore:

2023-12-28T07:09:22.4387062Z b'(00.011015) net: Unlock network'
2023-12-28T07:09:22.4387808Z b'(00.011017) Running network-unlock scripts'
2023-12-28T07:09:22.4388615Z b'iptables-restore v1.8.10 (nf_tables):'
2023-12-28T07:09:22.4389472Z b'line 5: CHAIN_DEL failed (Resource busy): chain CRIU'

iptables_restore() is using iptables-restore instead of iptables-legacy-restore.

rst0git commented 6 months ago

@avagin This error appears with Alpine v3.19 because iptables-nft is now the default iptables backend. I've opened the following pull request with a fix: https://github.com/checkpoint-restore/criu/pull/2323
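The backend check visible in the logs above ("iptables has nft backend: ... (nf_tables)") can be sketched in Python. This is only an illustration: it is not CRIU's C implementation and not necessarily what the pull request does. `is_nft_backend` and `pick_tool` are hypothetical names, and falling back to the unsuffixed binary when the `-legacy` one is not installed is an assumption made for the sketch.

```python
import shutil
from typing import Callable, Optional


def is_nft_backend(version_output: str) -> bool:
    """Return True if an `iptables -V` style string reports the nf_tables backend."""
    return "(nf_tables)" in version_output


def pick_tool(base: str, version_output: str,
              which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Choose which binary to run for `base` (e.g. "iptables-restore").

    Hypothetical policy: with an nft backend, prefer the "-legacy" variant
    if it is installed, otherwise keep using `base`.
    """
    if not is_nft_backend(version_output):
        return base
    prefix, _, suffix = base.partition("-")   # "ip6tables", "-", "restore"
    legacy = f"{prefix}-legacy-{suffix}"
    return legacy if which(legacy) else base


if __name__ == "__main__":
    banner = "iptables-restore v1.8.10 (nf_tables)"
    print(pick_tool("iptables-restore", banner))
```

On the Alpine v3.19 image from the log, where `iptables-legacy-restore` is not installed, this sketch would keep `iptables-restore`; whether that is the right behaviour for CRIU is exactly what the pull request has to decide.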

rst0git commented 6 months ago

In addition to the error above, nft_run_cmd_from_buffer(nft, buf) fails during restore when running zdtm/static/socket-tcp-nfconntrack with nftables v1.0.9:

2023-12-30T08:09:07.8411066Z ================= Run zdtm/static/socket-tcp-nfconntrack in h ==================
2023-12-30T08:09:07.8411749Z Start test
2023-12-30T08:09:07.8412065Z Test is SUID
2023-12-30T08:09:07.8412988Z ./socket-tcp-nfconntrack --pidfile=socket-tcp-nfconntrack.pid --outfile=socket-tcp-nfconntrack.out
2023-12-30T08:09:07.8414197Z # Warning: table ip filter is managed by iptables-nft, do not touch!
2023-12-30T08:09:07.8414899Z Run criu dump
2023-12-30T08:09:07.8415243Z Run criu restore
2023-12-30T08:09:07.8415893Z =[log]=> dump/zdtm/static/socket-tcp-nfconntrack/55/1/restore.log
2023-12-30T08:09:07.8416745Z ------------------------ grep Error ------------------------
2023-12-30T08:09:07.8417743Z b'(00.008978)     55: iptables has nft backend: ip6tables-restore v1.8.10 (nf_tables)'
2023-12-30T08:09:07.8418542Z b''
2023-12-30T08:09:07.8419357Z b'(00.009508)     55: net: \tRunning iptables-legacy-restore -w for iptables-legacy-restore -w'
2023-12-30T08:09:07.8427088Z b'(00.010251)     55: net: \tRunning ip6tables-legacy-restore -w for ip6tables-legacy-restore -w'
2023-12-30T08:09:07.8428443Z b"(00.011882)     55: Error (criu/util.c:1495): Can't wait or bad status: errno=0, status=65280"
2023-12-30T08:09:07.8429471Z b'(00.011925) Error (criu/cr-restore.c:2557): Restoring FAILED.'
2023-12-30T08:09:07.8430322Z ------------------------ ERROR OVER ------------------------
2023-12-30T08:09:07.8431250Z ######### Test zdtm/static/socket-tcp-nfconntrack FAIL at CRIU restore #########
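As a side note on the `status=65280` value in the log: assuming it is a raw Linux wait status (which the "Can't wait or bad status" message suggests, but the log does not confirm), it decodes to a normal exit with code 255. A minimal Python sketch of that decoding follows; `decode_wait_status` is a hypothetical helper name.

```python
def decode_wait_status(status: int):
    """Decode a raw Linux wait status into (kind, value).

    Layout assumed here: the low 7 bits hold the terminating signal
    (0 means a normal exit), and bits 8-15 hold the exit code or stop signal.
    """
    if status & 0x7F == 0:
        return ("exited", (status >> 8) & 0xFF)
    if status & 0xFF == 0x7F:
        return ("stopped", (status >> 8) & 0xFF)
    return ("signaled", status & 0x7F)


if __name__ == "__main__":
    print(decode_wait_status(65280))  # 65280 == 0xFF00
```

On Linux with Python 3.9 or later, `os.waitstatus_to_exitcode(65280)` gives the same exit code.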

buf has the following value:

table ip filter {
        chain INPUT {
                type filter hook input priority filter; policy accept;
                counter packets 0 bytes 0 jump CRIU
                iifname "lo" ip protocol tcp xt match "conntrack" counter packets 3 bytes 172 accept
                counter packets 0 bytes 0 drop
        }

        chain CRIU {
                meta mark 0x0000c114 counter packets 0 bytes 0 accept
                counter packets 0 bytes 0 drop
        }

        chain OUTPUT {
                type filter hook output priority filter; policy accept;
                counter packets 0 bytes 0 jump CRIU
        }
}
table ip6 filter {
        chain CRIU {
                meta mark 0x0000c114 counter packets 0 bytes 0 accept
                counter packets 0 bytes 0 drop
        }

        chain INPUT {
                type filter hook input priority filter; policy accept;
                counter packets 0 bytes 0 jump CRIU
        }

        chain OUTPUT {
                type filter hook output priority filter; policy accept;
                counter packets 0 bytes 0 jump CRIU
        }
}

For this input, the nft tool shows the following error:

~ # nft -f test.txt 
test.txt:5:46-65: Error: unsupported xtables compat expression, use iptables-nft with this ruleset
                iifname "lo" ip protocol tcp xt match "conntrack" counter packets 3 bytes 172 accept
                                             ^^^^^^^^^^^^^^^^^^^^

This test works with nftables v1.0.7, where the value of buf is the following:

table ip filter {
        chain INPUT {
                type filter hook input priority filter; policy accept;
                counter packets 0 bytes 0 jump CRIU
-                iifname "lo" ip protocol tcp xt match "conntrack" counter packets 3 bytes 172 accept
+                iifname "lo" meta l4proto tcp ct state new,established counter packets 3 bytes 172 accept
                counter packets 0 bytes 0 drop
        }

        chain CRIU {
                meta mark 0x0000c114 counter packets 0 bytes 0 accept   
                counter packets 0 bytes 0 drop
        }

        chain OUTPUT { 
                type filter hook output priority filter; policy accept;
                counter packets 0 bytes 0 jump CRIU
        }
}

The following steps can be used to replicate the problem in an Alpine container:

$ nft --version
nftables v1.0.9 (Old Doc Yak #3)
$ iptables-nft --version
iptables v1.8.10 (nf_tables)

$ iptables-translate -A INPUT -i lo -p tcp -m state --state NEW,ESTABLISHED -j ACCEPT
nft 'add rule ip filter INPUT iifname "lo" ip protocol tcp ct state new,established counter accept'
$ iptables-nft -A INPUT -i lo -p tcp -m state --state NEW,ESTABLISHED -j ACCEPT
$ nft list ruleset | tee dump.txt
# Warning: table ip filter is managed by iptables-nft, do not touch!
table ip filter {
    chain INPUT {
        type filter hook input priority filter; policy accept;
        iifname "lo" ip protocol tcp xt match "conntrack" counter packets 0 bytes 0 accept
    }
}

$ nft flush ruleset
$ nft -f dump.txt 
dump.txt:4:32-51: Error: unsupported xtables compat expression, use iptables-nft with this ruleset
        iifname "lo" ip protocol tcp xt match "conntrack" counter packets 0 bytes 0 accept
                                     ^^^^^^^^^^^^^^^^^^^^
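To spot this situation before handing a dump back to nft, the ruleset text can be scanned for xtables compat expressions. The Python sketch below is only an illustration with a hypothetical helper name (`find_xt_compat`). Only the `xt match "..."` form is confirmed by the output above; matching `xt target` and `xt watcher` as well is an assumption.

```python
import re

# `xt match "name"` appears in the dump above; target/watcher are assumed variants.
XT_COMPAT = re.compile(r'\bxt\s+(match|target|watcher)\s+"([^"]+)"')


def find_xt_compat(ruleset: str):
    """Return (line_number, kind, name) for each xtables compat expression."""
    hits = []
    for lineno, line in enumerate(ruleset.splitlines(), start=1):
        for kind, name in XT_COMPAT.findall(line):
            hits.append((lineno, kind, name))
    return hits


dump = '''table ip filter {
    chain INPUT {
        type filter hook input priority filter; policy accept;
        iifname "lo" ip protocol tcp xt match "conntrack" counter packets 0 bytes 0 accept
    }
}
'''

if __name__ == "__main__":
    print(find_xt_compat(dump))  # [(4, 'match', 'conntrack')]
```

The reported line number matches the `dump.txt:4` position in the nft error. A dump with no hits contains none of the expressions that `nft -f` rejected above.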