Open pcmoore opened 10 months ago
Unless we can resolve this rather quickly, I think we may need to revert commit 1852fe3d772914d848907f9d0656747776ed3f98 in the release-2.5 branch simply so the various packagers don't see a bunch of test failures when preparing distro packages from a release branch.
Thoughts @drakenclimber?
I think I have an aarch64 box; I'll hack around with it this morning and see if I can uncover anything
Test 29 seems to have been problematic from its inception. I went back to the commit https://github.com/seccomp/libseccomp/commit/51c46f80c1edee863bbc4eb21b03decc44e69a45 that initially added it, and the PFC looks to be incorrect from day 1.
$ ./tests/29-sim-pseudo_syscall
#
# pseudo filter code start
#
# filter for arch x86 (1073741827)
if ($arch == 1073741827)
  # default action
  action ALLOW;
# invalid architecture action
action KILL;
#
# pseudo filter code end
#
Stating the obvious here - these rule lines have never been added to the filter in any incarnation of libseccomp:
rc = seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(sysmips), 0);
if (rc < 0)
	goto out;
rc = seccomp_rule_add_exact(ctx, SCMP_ACT_KILL, SCMP_SYS(sysmips), 0);
if (rc == 0)
	goto out;
/* -10001 == 4294957295 (unsigned) */
rc = seccomp_rule_add_exact(ctx, SCMP_ACT_KILL, -11001, 0);
if (rc == 0)
	goto out;
Commit https://github.com/seccomp/libseccomp/commit/1852fe3d772914d848907f9d0656747776ed3f98 broke the PFC generation and resulted in no PFC being generated at all. I did a little hacking, and it looks like the lookup of -10001 in the gperf code mapped to arch_prctl() (syscall 384 on x86). When the test was using -11001, it was translated to sysmips(), which is a PNR on x86.
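The unsigned value quoted in the test's comment can be checked with a few lines of C. This is only a sketch for illustration: syscall_as_u32() and check_test29_values() are hypothetical helpers, not libseccomp functions. It shows that 4294957295 is the 32-bit unsigned form of -10001, while the -11001 literal passed to seccomp_rule_add_exact() corresponds to 4294956295.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: view a signed syscall number as a 32-bit unsigned value. */
uint32_t syscall_as_u32(int num)
{
	/* Converting to an unsigned type reduces the value modulo 2^32. */
	return (uint32_t)num;
}

/* Hypothetical self-check for the two numbers discussed in this thread. */
void check_test29_values(void)
{
	/* The value named in the test's comment. */
	assert(syscall_as_u32(-10001) == UINT32_C(4294957295));
	/* The value actually passed in the quoted rule line. */
	assert(syscall_as_u32(-11001) == UINT32_C(4294956295));
}
```

Calling check_test29_values() from a small test program should pass silently; the two numbers differ by exactly 1000 in either representation.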
Based on the above findings, I think the safest and most prudent solution is:
release-2.5 branch
Reverted in the release-2.5 branch in commit 970c2b4b0c02. Since the test has been effectively broken since its inception, I've changed the milestone on this to v2.6.0. There seem to be significant issues with the test, and I doubt we will cherry-pick them into the release-2.5 branch.
On LoongArch 64-bit the same failures happen too.
Hello @drakenclimber. How about changing the condition from if (rc == 0) to if (rc == 0 && (seccomp_arch_native() == SCMP_ARCH_X86)) so that the check depends on the native arch? Only the x86, x86_64, and x32 arches return rc < 0; the other arches return rc = 0.
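One way to read this suggestion is sketched below. It is not what was merged, and it makes two assumptions: the SKETCH_ARCH_* constants are local stand-ins for SCMP_ARCH_X86, SCMP_ARCH_X86_64, and SCMP_ARCH_X32 (the x86 value matches the 1073741827 shown in the PFC output; the other two are assumed from the Linux audit headers) so the code compiles without libseccomp, and the check is widened to all three x86-family arches named in the comment, whereas the proposed condition compares against SCMP_ARCH_X86 only. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-ins for the libseccomp SCMP_ARCH_* tokens. */
#define SKETCH_ARCH_X86    0x40000003u /* 1073741827, as in the PFC output */
#define SKETCH_ARCH_X86_64 0xC000003Eu /* assumption: AUDIT_ARCH_X86_64 */
#define SKETCH_ARCH_X32    0x4000003Eu /* assumption: x32 token */

/* Hypothetical helper: true for the arches reported to return rc < 0. */
bool is_x86_family(uint32_t arch)
{
	return arch == SKETCH_ARCH_X86 ||
	       arch == SKETCH_ARCH_X86_64 ||
	       arch == SKETCH_ARCH_X32;
}

/*
 * Hypothetical helper: decide whether rc == 0 from the "exact" rule add
 * should count as a test failure on the given native arch.
 */
bool exact_add_is_failure(int rc, uint32_t native_arch)
{
	return rc == 0 && is_x86_family(native_arch);
}
```

In the real test, native_arch would come from seccomp_arch_native(). The drawback of this approach is that the test stops checking anything on the other arches.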
First off, thanks to everyone who spent time investigating this and thinking about solutions; that was helpful.
I took a look at this today and @drakenclimber's investigation seems to have hit on the core problem: the bogus negative syscall number chosen for test 29 isn't actually bogus on a number of systems due to the PNR mapping. On aarch64 the bogus syscall number maps to a valid pseudo-syscall/PNR, arch_prctl(), and as @drakenclimber already stated, it's a valid syscall on x86.
Considering that test 29 is intended to attempt loading of a bogus negative syscall, I think the easiest solution is to choose a negative syscall number well outside the range of the PNR values. Preliminary testing has shown this to resolve the problem; I'll post a PR shortly.
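The selection rule described here can be sketched as a simple range check. The helper names are hypothetical, and the bounds are deliberately left as parameters: the actual extent of the PNR values comes from the library's own definitions and is not stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true when num lies inside [low, high], the range
 * the caller considers reserved for pseudo-syscall (PNR) numbers.
 */
bool in_pnr_range(int num, int low, int high)
{
	assert(low <= high);
	return num >= low && num <= high;
}

/*
 * Hypothetical helper: a candidate only works as a "bogus" syscall number
 * if it is negative and falls outside the reserved range.
 */
bool is_bogus_candidate(int num, int low, int high)
{
	return num < 0 && !in_pnr_range(num, low, high);
}
```

With placeholder bounds of, say, -20000 and -100, a value such as -10001 would be rejected as a candidate, while a value far below the lower bound would be accepted.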
It appears that commit 1852fe3d772914d848907f9d0656747776ed3f98 uncovered an issue on aarch64: