eclipse-bluechi / bluechi

Eclipse BlueChi is a systemd service controller intended for multi-node environments with a predefined number of nodes and with a focus on highly regulated ecosystems such as those requiring functional safety.
https://bluechi.readthedocs.io/en/latest/
GNU Lesser General Public License v2.1

Test `bluechi-agent-user-bus` fails in the testing farm #900

Open mkemel opened 3 months ago

mkemel commented 3 months ago

Describe the bug

The test bluechi-agent-user-bus fails in the testing farm but passes in other environments (GitHub, locally). The test has been disabled in the testing farm until this issue is resolved.

The test tries to start bluechi-agent.service as a user inside an agent container, and the start fails. The agent journal log contains the following entries, which do not appear when the test runs in other environments:

Apr 30 11:48:59 5812f46af2f4 systemd[1]: Finished User Runtime Directory /run/user/1000.
Apr 30 11:48:59 5812f46af2f4 systemd[1]: Starting User Manager for UID 1000...
Apr 30 11:48:59 5812f46af2f4 systemd[60]: pam_loginuid(systemd-user:session): Error writing /proc/self/loginuid: Operation not permitted
Apr 30 11:48:59 5812f46af2f4 systemd[60]: pam_loginuid(systemd-user:session): set_loginuid failed
Apr 30 11:48:59 5812f46af2f4 systemd[60]: pam_unix(systemd-user:session): session opened for user bluechiuser(uid=1000) by root(uid=0)
Apr 30 11:48:59 5812f46af2f4 systemd[60]: PAM failed: Cannot make/remove an entry for the specified session
Apr 30 11:48:59 5812f46af2f4 systemd[60]: user@1000.service: Failed to set up PAM session: Operation not permitted
Apr 30 11:48:59 5812f46af2f4 systemd[60]: user@1000.service: Failed at step PAM spawning /usr/lib/systemd/systemd: Operation not permitted
Apr 30 11:48:59 5812f46af2f4 systemd[1]: user@1000.service: Main process exited, code=exited, status=224/PAM
Apr 30 11:48:59 5812f46af2f4 systemd[1]: user@1000.service: Failed with result 'exit-code'.
Apr 30 11:48:59 5812f46af2f4 systemd[1]: Failed to start User Manager for UID 1000.
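To make it easier to tell whether a given run hit this same PAM failure, the journal can be scanned for the messages shown above. The following is only a minimal sketch: the names `JOURNAL_LINE`, `PAM_FAILURE_MARKERS`, and `find_pam_failures` are hypothetical and not part of the BlueChi test suite, and the marker strings are copied from the log excerpt in this report, so they may need adjusting for other systemd or PAM versions.

```python
import re

# Matches journal lines shaped like the excerpt above:
# "<month> <day> <time> <host> <identifier>[<pid>]: <message>"
JOURNAL_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<ident>[^\[\s]+)\[(?P<pid>\d+)\]:\s"
    r"(?P<msg>.*)$"
)

# Substrings taken verbatim from the log in this issue (assumed to be
# representative of the failure; other versions may word them differently).
PAM_FAILURE_MARKERS = (
    "pam_loginuid(systemd-user:session): set_loginuid failed",
    "Failed to set up PAM session",
    "status=224/PAM",
)


def find_pam_failures(journal_text):
    """Return (pid, message) pairs for journal lines matching a known marker."""
    hits = []
    for line in journal_text.splitlines():
        match = JOURNAL_LINE.match(line.strip())
        if match is None:
            # Skip lines that do not look like journal entries.
            continue
        msg = match.group("msg")
        if any(marker in msg for marker in PAM_FAILURE_MARKERS):
            hits.append((int(match.group("pid")), msg))
    return hits
```

Passing the agent journal text to `find_pam_failures` returns an empty list when none of the markers are present, which gives a quick way to check whether a failing run in another environment shows the same errors.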

To Reproduce

I could not reproduce the failure outside the testing farm. To reproduce it, re-enable the test for the testing farm by removing the tag: ... entry in tests/tests/tier0/bluechi-agent-user-bus/main.fmf and opening a PR.

Expected behavior

The test should pass in all environments.

mkemel commented 3 months ago

Once this issue is resolved, the testing-farm-container tag that was added in #899 can be removed.

mkemel commented 3 months ago

The test also fails when run on multihost, but I don't see the same auth errors in the node journal log. This needs more research.