flux-framework / flux-core

core services for the Flux resource management framework
GNU Lesser General Public License v3.0

testsuite: add CI coverage for memlimit test #5366

Open garlick opened 1 year ago

garlick commented 1 year ago

Problem: #5359 adds a test t2410-sdexec-memlimit.t that is only run by the el8,system builder in CI, but it is currently skipped.

grondo commented 3 months ago

Update: After swapping the docker-run-systest.sh script to use podman --systemd=always ..., the el8,system builder no longer skips this sharness test. However, it generates the following errors from the first test (and all subsequent tests):

Jun 14 15:42:44.891567 UTC sdexec.debug[0]: watch 4ed92c07-f7e8.service
Jun 14 15:42:44.891593 UTC sdexec.debug[0]: start 4ed92c07-f7e8.service
Jun 14 15:42:44.911176 UTC sdexec.debug[0]: 4ed92c07-f7e8.service: unknown.unknown
Jun 14 15:42:44.911257 UTC sdexec.debug[0]: 4ed92c07-f7e8.service: activating.start
Jun 14 15:42:44.911976 UTC sdexec.debug[0]: 4ed92c07-f7e8.service: active.running
cat: /sys/fs/cgroup//user.slice/user-1001.slice/user@1001.service/4ed92c07-f7e8.service/memory.high: No such file or directory
Jun 14 15:42:44.917370 UTC sdexec.debug[0]: 4ed92c07-f7e8.service: active.running
Jun 14 15:42:44.917435 UTC sdexec.debug[0]: 4ed92c07-f7e8.service: failed.failed
Jun 14 15:42:44.917441 UTC sdexec.debug[0]: reset-failed 4ed92c07-f7e8.service
Jun 14 15:42:44.918027 UTC sdexec.debug[0]: 4ed92c07-f7e8.service: inactive.dead
Jun 14 15:42:44.918061 UTC sdexec.debug[0]: unwatch 4ed92c07-f7e8.service
0: Exit 1

This implies all the skip_all checks are passing, but the memory control files (e.g. memory.high) aren't showing up in the service cgroup in the GitHub CI environment.

(Capturing this here in case anyone ever wants to dig into it)
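
For anyone who does dig into it, one thing worth ruling out is whether the memory controller is enabled for the unit's cgroup at all. Under cgroup v2, a controller's interface files such as memory.high only appear in a child cgroup when that controller is listed in the parent's cgroup.subtree_control. The sketch below is a hypothetical helper (memcg_status is not part of flux-core or the test suite) that reports which case applies for a given cgroup directory. It is only a diagnostic aid under the assumption that missing delegation is the cause, which has not been confirmed here.

```shell
#!/bin/sh
# Hypothetical diagnostic helper, not part of flux-core.
# Usage: memcg_status <cgroup-v2-directory>
# Prints one of:
#   present        memory.high exists in the directory
#   not-delegated  parent's cgroup.subtree_control does not list "memory"
#   missing        parent lists "memory" but memory.high is still absent
#   unknown        parent's cgroup.subtree_control could not be read
memcg_status() {
    cg=$1
    if test -e "$cg/memory.high"; then
        echo present
        return 0
    fi
    parent=$(dirname "$cg")
    if ! test -r "$parent/cgroup.subtree_control"; then
        echo unknown
        return 1
    fi
    # subtree_control is a space-separated list of controller names
    controllers=$(cat "$parent/cgroup.subtree_control")
    case " $controllers " in
        *" memory "*) echo missing ;;
        *)            echo not-delegated ;;
    esac
    return 1
}
```

Because the sdexec units in the log above are transient, the helper would have to be run while a unit is still active, or pointed at the longer-lived user@1001.service directory to see whether "memory" is passed down from there.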