seccomp / libseccomp

The main libseccomp repository
GNU Lesser General Public License v2.1
791 stars 170 forks source link

BUG: test 29 is broken on aarch64 #418

Open pcmoore opened 10 months ago

pcmoore commented 10 months ago

It appears that commit 1852fe3d772914d848907f9d0656747776ed3f98 uncovered an issue on aarch64:

% ./regression -m c -T bpf-sim -b 29-sim-pseudo_syscall
=============== Mon Oct 23 04:51:37 PM EDT 2023 ===============
Regression Test Report ("regression -m c -T bpf-sim -b 29-sim-pseudo_syscall")
 batch name: 29-sim-pseudo_syscall
 test mode:  c
 test type:  bpf-sim
bpf-sim
ERROR
Test 29-sim-pseudo_syscall%%001-00001 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00002 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00003 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00004 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00005 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00006 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00007 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00008 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00009 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00010 result:   ERROR bpf_sim rc=8
ERROR
Test 29-sim-pseudo_syscall%%001-00011 result:   ERROR bpf_sim rc=8
bpf-sim
ERROR
Test 29-sim-pseudo_syscall%%002-00001 result:   ERROR bpf_sim rc=8
bpf-sim
ERROR
Test 29-sim-pseudo_syscall%%003-00001 result:   ERROR bpf_sim rc=8
 test mode:  c
 test type:  bpf-valgrind
Regression Test Summary
 tests run: 13
 tests skipped: 0
 tests passed: 0
 tests failed: 0
 tests errored: 13
============================================================
pcmoore commented 9 months ago

Unless we can resolve this rather quickly, I think we may need to revert commit 1852fe3d772914d848907f9d0656747776ed3f98 in the release-2.5 branch simply so the various packagers don't see a bunch of test failures when preparing distro packages from a release branch.

Thoughts @drakenclimber?

drakenclimber commented 9 months ago

I think I have an aarch64 box; I'll hack around with it this morning and see if I can uncover anything

drakenclimber commented 9 months ago

Test 29 seems to have been problematic from its inception. I went back to the commit https://github.com/seccomp/libseccomp/commit/51c46f80c1edee863bbc4eb21b03decc44e69a45 that initially added it, and the PFC looks to be incorrect from day 1.

$ ./tests/29-sim-pseudo_syscall
#
# pseudo filter code start
#
# filter for arch x86 (1073741827)
if ($arch == 1073741827)
  # default action
  action ALLOW;
# invalid architecture action
action KILL;
#
# pseudo filter code end
#

Stating the obvious here - these rule lines have never been added to the filter in any incarnation of libseccomp:

    rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(sysmips), 0);
    if (rc < 0)
        goto out;
    rc = seccomp_rule_add_exact(ctx, SCMP_ACT_KILL, SCMP_SYS(sysmips), 0);
    if (rc == 0)
        goto out;
    /* -10001 == 4294957295 (unsigned) */
    rc = seccomp_rule_add_exact(ctx, SCMP_ACT_KILL, -11001, 0);
    if (rc == 0)
        goto out;

Commit https://github.com/seccomp/libseccomp/commit/1852fe3d772914d848907f9d0656747776ed3f98 broke the PFC generation and resulted in no PFC being generated at all. I did a little hacking, and it looks like the lookup of -10001 in the gperf code mapped to arch_prctl() (syscall 384 on x86). When the test was using -11001, it was translated to sysmips() which is a PNR on x86.

drakenclimber commented 9 months ago

Based on the above findings, I think the safest and most prudent solution is:

drakenclimber commented 9 months ago

Reverted in the release-2.5 branch in commit 970c2b4b0c02. Since the test has been effectively broken since its inception, I've changed the milestone on this to v2.6.0. There seems to be significant issues with the test, and I doubt we will cherry-pick then into the release-2.5 branch

xry111 commented 9 months ago

On LoongArch 64-bit the same failures happen too.

lixing-star commented 1 week ago

hello, @drakenclimber . How about to change the condition from if (rc == 0) to if (rc == 0 && (seccomp_arch_native() == SCMP_ARCH_X86)) to filter out the native arch ? Only arch x86 x86_64 and x32 return rc < 0, the other arch return rc = 0.

pcmoore commented 19 hours ago

First off, thanks to everyone who spent time investigating this and thinking about solutions, that was helpful.

I took a look at this today and @drakenclimber's investigation seems to have hit at the core problem: the bogus negative syscall number chosen for test 29 isn't actually bogus on a number of systems due to the PNR mapping. On aarch64 the bogus syscall number maps to a valid pseudo-syscall/PNR, arch_prctl() and as @drakenclimber already stated it's a valid syscall on x86.

Considering that test 29 is intended to attempt loading of a bogus negative syscall, I think the easiest solution is to choose a negative syscall number well outside the range of the PNR values. Preliminary testing has shown this to resolve the problem, I'll post a PR shortly.