linux-test-project / ltp

Linux Test Project (mailing list: https://lists.linux.it/listinfo/ltp)
https://linux-test-project.readthedocs.io/
GNU General Public License v2.0

mm/oom01: xorg exited and test failed #1024

Closed: yup-21 closed this issue 1 year ago

yup-21 commented 1 year ago

When I tested oom01, xorg exited, and the dmesg was as follows.

Mar 21 17:16:44 localhost kernel: [ 655.478626] [ 4609] 0 4609 3360 1 458752 25 0 runltp
Mar 21 17:16:44 localhost kernel: [ 655.478628] [ 4738] 0 4738 53 1 393216 10 0 ltp-pan
Mar 21 17:16:44 localhost kernel: [ 655.478629] [ 5116] 0 5116 3324 0 458752 8 0 tail
Mar 21 17:16:44 localhost kernel: [ 655.478631] [ 5285] 0 5285 7640 1 393216 371 0 bamfdaemon
Mar 21 17:16:44 localhost kernel: [ 655.478633] [ 120708] 0 120708 48 0 393216 10 -1000 oom01
Mar 21 17:16:44 localhost kernel: [ 655.478636] [ 120709] 0 120709 48 0 393216 12 -1000 oom01
Mar 21 17:16:44 localhost kernel: [ 655.478639] [ 121006] 0 121006 51349 0 786432 883 0 ukui-screensave
Mar 21 17:16:44 localhost kernel: [ 655.478641] [ 121050] 0 121050 4681743 3912723 36896768 106 0 oom01
Mar 21 17:16:44 localhost kernel: [ 655.478644] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-3,global_oom,task_memcg=/user.slice/user-0.slice/session-2.scope,task=oom01,pid=121050,uid=0
Mar 21 17:16:44 localhost kernel: [ 655.478655] Out of memory: Kill process 121050 (oom01) score 925 or sacrifice child
Mar 21 17:16:44 localhost kernel: [ 655.486898] Killed process 121050 (oom01) total-vm:299631552kB, anon-rss:250414208kB, file-rss:64kB, shmem-rss:0kB
Mar 21 17:16:44 localhost kernel: [ 655.497965] oom_reaper: reaped process 121050 (oom01), now anon-rss:250518720kB, file-rss:0kB, shmem-rss:0kB
...
Mar 21 17:16:44 localhost kernel: [ 655.499312] [ 4524] 0 4524 3519 1 393216 46 0 bash
Mar 21 17:16:44 localhost kernel: [ 655.499314] [ 4609] 0 4609 3360 1 458752 25 0 runltp
Mar 21 17:16:44 localhost kernel: [ 655.499315] [ 4738] 0 4738 53 1 393216 10 0 ltp-pan
Mar 21 17:16:44 localhost kernel: [ 655.499316] [ 5116] 0 5116 3324 0 458752 8 0 tail
Mar 21 17:16:44 localhost kernel: [ 655.499318] [ 5285] 0 5285 7640 1 393216 371 0 bamfdaemon
Mar 21 17:16:44 localhost kernel: [ 655.499319] [ 120708] 0 120708 48 0 393216 10 -1000 oom01
Mar 21 17:16:44 localhost kernel: [ 655.499321] [ 120709] 0 120709 48 0 393216 12 -1000 oom01
Mar 21 17:16:44 localhost kernel: [ 655.499323] [ 121006] 0 121006 51349 0 786432 883 0 ukui-screensave
Mar 21 17:16:44 localhost kernel: [ 655.499325] [ 121050] 0 121050 4681743 3914355 36896768 0 0 oom01
Mar 21 17:16:44 localhost kernel: [ 655.499326] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-3,global_oom,task_memcg=/system.slice/lightdm.service,task=Xorg,pid=2982,uid=0
Mar 21 17:16:44 localhost kernel: [ 655.499333] Out of memory: Kill process 2982 (Xorg) score 0 or sacrifice child
Mar 21 17:16:44 localhost kernel: [ 655.507065] Killed process 2982 (Xorg) total-vm:2637504kB, anon-rss:0kB, file-rss:64kB, shmem-rss:0kB
Mar 21 17:16:44 localhost kernel: [ 655.517274] oom_reaper: reaped process 2982 (Xorg), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
...
Mar 21 17:16:45 localhost systemd[1]: session-2.scope: Killing process 3413 (lightdm) with signal SIGTERM.
Mar 21 17:16:45 localhost systemd[1]: session-2.scope: Killing process 3560 (mate-session) with signal SIGTERM.
Mar 21 17:16:45 localhost systemd[1]: session-2.scope: Killing process 3770 (ssh-agent) with signal SIGTERM.
Mar 21 17:16:45 localhost systemd[1]: session-2.scope: Killing process 4518 (mate-terminal) with signal SIGTERM.
Mar 21 17:16:45 localhost systemd[1]: session-2.scope: Killing process 4609 (runltp) with signal SIGTERM.
Mar 21 17:16:45 localhost systemd[1]: session-2.scope: Killing process 4738 (ltp-pan) with signal SIGTERM.
Mar 21 17:16:45 localhost systemd[1]: session-2.scope: Killing process 120708 (oom01) with signal SIGTERM.
Mar 21 17:16:45 localhost systemd[1]: session-2.scope: Killing process 120709 (oom01) with signal SIGTERM.
Mar 21 17:16:45 localhost systemd[1]: session-2.scope: Killing process 121152 (oom01) with signal SIGTERM.
Mar 21 17:16:45 localhost systemd[1]: Stopping Session 2 of user root.

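The log shows that the OOM killer is triggered while the mlock part of the test is running. The kernel selects the oom01 process (pid 121050) and reports it as killed, but the kill does not seem to free any memory: oom_reaper still reports anon-rss:250518720kB for that process, and it is still listed in the next task dump. The OOM killer then kills Xorg (pid 2982), the desktop session restarts, systemd sends SIGTERM to runltp and the remaining oom01 processes, and the test case fails. How can I resolve this?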
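
The reading that the first kill freed nothing can be checked by comparing the anon-rss value on each "Killed process" line with the value on the matching "oom_reaper" line. The following is a minimal Python sketch of that comparison. The four sample lines are copied from the log with the syslog prefix trimmed; the `summarize` helper and the regular expressions are hypothetical names for illustration, not part of LTP or of any kernel tooling.

```python
import re

# Sample lines copied from the dmesg output in this report (syslog prefix trimmed).
LOG = """\
Killed process 121050 (oom01) total-vm:299631552kB, anon-rss:250414208kB, file-rss:64kB, shmem-rss:0kB
oom_reaper: reaped process 121050 (oom01), now anon-rss:250518720kB, file-rss:0kB, shmem-rss:0kB
Killed process 2982 (Xorg) total-vm:2637504kB, anon-rss:0kB, file-rss:64kB, shmem-rss:0kB
oom_reaper: reaped process 2982 (Xorg), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
"""

# Hypothetical patterns matching the two message formats seen in this log.
KILLED = re.compile(r"Killed process (\d+) \((\S+)\).*?anon-rss:(\d+)kB")
REAPED = re.compile(r"oom_reaper: reaped process (\d+) \((\S+)\), now anon-rss:(\d+)kB")


def summarize(log):
    """Return {pid: (name, anon_rss_kb_at_kill, anon_rss_kb_after_reap)}."""
    killed = {}
    result = {}
    for line in log.splitlines():
        m = KILLED.search(line)
        if m:
            killed[int(m.group(1))] = (m.group(2), int(m.group(3)))
            continue
        m = REAPED.search(line)
        if m and int(m.group(1)) in killed:
            name, before = killed[int(m.group(1))]
            result[int(m.group(1))] = (name, before, int(m.group(3)))
    return result


if __name__ == "__main__":
    for pid, (name, before, after) in summarize(LOG).items():
        print(f"{pid} {name}: anon-rss {before} kB at kill, {after} kB after reap")
```

For pid 121050 the value after reaping is not lower than the value at kill time, which matches the observation that the OOM killer moved on to another victim immediately afterwards.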

metan-ucw commented 1 year ago

Generally it's not a good idea to run stress tests from a desktop. These tests are usually run over ssh on a dedicated testing machine.

coolgw commented 1 year ago

Maybe running the test in text mode would help.