ColinIanKing / stress-ng

This is the stress-ng upstream project git repository. stress-ng will stress test a computer system in various selectable ways. It was designed to exercise various physical subsystems of a computer as well as the various operating system kernel interfaces.
https://github.com/ColinIanKing/stress-ng
GNU General Public License v2.0

sigsegv test failed with Ubuntu T-3.13 #356

Closed Cypresslin closed 10 months ago

Cypresslin commented 10 months ago

Hello Colin, during recent SRU testing for Trusty 3.13, we found that the sigsegv test fails on Ubuntu Trusty 3.13.0-195, on both bare metal and VMs. It can also be reproduced with 3.13.0-170-generic.

Test log:

 Free memory: 7209 MB
 Memory used: 6488 MB
 Using cgroup version 1
 /home/ubuntu/autotest/client/tests/ubuntu_stress_smoke_test/ubuntu_stress_single_smoke_test.sh: line 32: [: too many arguments

 Machine Configuration
 Physical Pages:  2020977
 Pages available: 1845513
 Page Size:       4096
 Zswap enabled:   N

 Free memory:
              total       used       free     shared    buffers     cached
 Mem:       8083908     701980    7381928        444      23280     473656
 -/+ buffers/cache:     205044    7878864
 Swap:      5242872        880    5241992

 Number of CPUs: 4
 Number of CPUs Online: 4

 Maximum bogo ops: 3000

 sigsegv STARTING
 sigsegv RETURNED 2
 sigsegv FAILED
 stress-ng: debug: [18923] invoked with './stress-ng -v -t 5 --sigsegv 4 --sigsegv-ops 3000 --ignite-cpu --syslog --verbose --verify --oomable' by user 0 'root'
 stress-ng: debug: [18923] stress-ng 0.17.04 g955e7a758572
 stress-ng: debug: [18923] system: Linux durin 3.13.0-195-generic #246-Ubuntu SMP Mon Jan 15 15:10:06 UTC 2024 x86_64, gcc 4.8.4, glibc 2.19
 stress-ng: debug: [18923] RAM total: 7.7G, RAM free: 7.0G, swap free: 5.0G
 stress-ng: debug: [18923] temporary file path: '/home/ubuntu/autotest/client/tmp/ubuntu_stress_smoke_test/src/stress-ng', filesystem type: ext2 (226160480 blocks available)
 stress-ng: debug: [18923] CPUs have 3 idle states: C1, C2, POLL
 stress-ng: debug: [18923] 4 processors online, 4 processors configured
 stress-ng: info:  [18923] setting to a 5 secs run per stressor
 stress-ng: debug: [18923] CPU data cache: L1: 32K, L2: 256K, L3: 8192K
 stress-ng: debug: [18923] cache allocate: shared cache buffer size: 8192K
 stress-ng: info:  [18923] dispatching hogs: 4 sigsegv
 stress-ng: debug: [18923] starting stressors
 stress-ng: debug: [18923] 4 stressors started
 stress-ng: debug: [18924] sigsegv: [18924] started (instance 0 on CPU 3)
 stress-ng: debug: [18925] sigsegv: [18925] started (instance 1 on CPU 0)
 stress-ng: debug: [18926] sigsegv: [18926] started (instance 2 on CPU 0)
 stress-ng: debug: [18927] sigsegv: [18927] started (instance 3 on CPU 0)
 stress-ng: fail:  [18924] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18924] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18925] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18925] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18925] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18925] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18924] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18925] sigsegv: expecting fault address 0x8, got 0x10 instead
 info: 5 failures reached, aborting stress process
 stress-ng: fail:  [18924] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: debug: [18925] sigsegv: [18925] exited (instance 1 on CPU 1)
 stress-ng: fail:  [18924] sigsegv: expecting fault address 0x8, got 0x10 instead
 info: 5 failures reached, aborting stress process
 stress-ng: debug: [18924] sigsegv: [18924] exited (instance 0 on CPU 3)
 stress-ng: fail:  [18926] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: error: [18923] sigsegv: [18924] terminated with an error, exit status=2 (stressor failed)
 stress-ng: debug: [18923] sigsegv: [18924] terminated (stressor failed)
 stress-ng: error: [18923] sigsegv: [18925] terminated with an error, exit status=2 (stressor failed)
 stress-ng: fail:  [18926] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18926] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18926] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: debug: [18923] sigsegv: [18925] terminated (stressor failed)
 stress-ng: fail:  [18926] sigsegv: expecting fault address 0x8, got 0x10 instead
 info: 5 failures reached, aborting stress process
 stress-ng: fail:  [18927] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: debug: [18926] sigsegv: [18926] exited (instance 2 on CPU 2)
 stress-ng: fail:  [18927] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18927] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: error: [18923] sigsegv: [18926] terminated with an error, exit status=2 (stressor failed)
 stress-ng: debug: [18923] sigsegv: [18926] terminated (stressor failed)
 stress-ng: fail:  [18927] sigsegv: expecting fault address 0x8, got 0x10 instead
 stress-ng: fail:  [18927] sigsegv: expecting fault address 0x8, got 0x10 instead
 info: 5 failures reached, aborting stress process
 stress-ng: debug: [18927] sigsegv: [18927] exited (instance 3 on CPU 1)
 stress-ng: error: [18923] sigsegv: [18927] terminated with an error, exit status=2 (stressor failed)
 stress-ng: debug: [18923] sigsegv: [18927] terminated (stressor failed)
 stress-ng: debug: [18923] metrics-check: all stressor metrics validated and sane
 stress-ng: info:  [18923] skipped: 0
 stress-ng: info:  [18923] passed: 0
 stress-ng: info:  [18923] failed: 4: sigsegv (4)
 stress-ng: info:  [18923] metrics untrustworthy: 0
 stress-ng: info:  [18923] unsuccessful run completed in 0.01 secs

A bisect shows that commit e9ddb1cc6009ac830f1e2daf66cc007ad0af2ae2 is the first bad commit. Sorry for not catching this earlier; we have seldom tested Trusty 3.13 in recent years.
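
For readers unfamiliar with the kind of check behind the "expecting fault address ..., got ... instead" messages in the log above, here is a minimal sketch of the general technique: install a `SIGSEGV` handler with `SA_SIGINFO`, trigger a fault at a known address, and compare `si_addr` with the address that was accessed. This is not the stress-ng implementation. The names `probe_fault_address`, `segv_handler`, `jmp_env` and `reported_addr` are hypothetical, and the sketch assumes Linux with a `PROT_NONE` mapping as the fault source. It does not attempt to reproduce the 0x10 result seen on the Trusty kernel, whose cause is not stated in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper state, not taken from stress-ng. */
static sigjmp_buf jmp_env;            /* recovery point for the handler */
static void *volatile reported_addr;  /* si_addr seen by the handler */

/* Record the faulting address reported by the kernel, then jump back. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    reported_addr = info->si_addr;
    siglongjmp(jmp_env, 1);
}

/*
 * Returns 1 if the reported fault address matches the accessed address,
 * 0 if it does not (or if no fault occurred), and -1 on setup failure.
 */
int probe_fault_address(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    assert(page_size > 8);  /* the probed offset must lie inside the page */

    /* An inaccessible page: any read from it raises SIGSEGV on Linux. */
    uint8_t *page = mmap(NULL, (size_t)page_size, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    struct sigaction sa, old_sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;  /* required to receive siginfo_t */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old_sa) < 0) {
        (void)munmap(page, (size_t)page_size);
        return -1;
    }

    uint8_t *expected = page + 8;  /* address we deliberately touch */
    int result;

    reported_addr = NULL;
    if (sigsetjmp(jmp_env, 1) == 0) {
        (void)*(volatile uint8_t *)expected;  /* faults here */
        result = 0;                           /* no fault: unexpected */
    } else {
        /* Back from the handler: compare reported and expected. */
        result = (reported_addr == (void *)expected);
    }

    /* Restore the previous handler and release the page. */
    (void)sigaction(SIGSEGV, &old_sa, NULL);
    (void)munmap(page, (size_t)page_size);
    return result;
}
```

Calling `probe_fault_address()` from a small `main` and printing the return value is enough to try it out. A return of 0 would correspond to the kind of mismatch reported in this issue, where the kernel's reported address differs from the one the test expected.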

ColinIanKing commented 10 months ago

Good catch, thanks for the bisect information. I'll try to see what's happening.

ColinIanKing commented 10 months ago

I've fixed this and tested on 3.13.0-170-generic. Do you mind giving this a check to see if it fully addresses the problem you are seeing?

Cypresslin commented 10 months ago

Fix verified with both 3.13.0-170 and -195, thank you!