Closed sporksmith closed 3 years ago
The full test output is over a gig. Filtered to just the failing output with:
tac LastTest.log | sed -n '/Test Failed./,/Testing: /p' | tac > failing_tests.txt
From failing_tests.txt, above, only two of the tests had output.
Tail of file-shadow-preload:
00:00:00.014087 [thread-0] 00:00:01.000000000 [debug] [testnode~11.0.0.1] [syscall_handler.c:118] [_syscallhandler_post_syscall] SYSCALL_HANDLER_POST(testfile,pid=1000): syscall 228 clock_gettime result: state=DONE code=0
00:00:00.014234 [thread-0] 00:00:01.000000000 [debug] [testnode~11.0.0.1] [thread_preload.c:239] [_threadpreload_waitForNextEvent] received shim_event 3
00:00:00.014246 [thread-0] 00:00:01.000000000 [debug] [testnode~11.0.0.1] [syscall_handler.c:103] [_syscallhandler_pre_syscall] SYSCALL_HANDLER_PRE(testfile,pid=1000): handling syscall 228 clock_gettime
00:00:00.014252 [thread-0] 00:00:01.000000000 [debug] [testnode~11.0.0.1] [time.c:84] [syscallhandler_clock_gettime] syscallhandler_clock_gettime with 1 0x7ffd09ebc140
00:00:00.014258 [thread-0] 00:00:01.000000000 [debug] [testnode~11.0.0.1] [syscall_handler.c:118] [_syscallhandler_post_syscall] SYSCALL_HANDLER_POST(testfile,pid=1000): syscall 228 clock_gettime result: state=DONE code=0
Tail of cpp-shadow-preload:
00:00:00.011583 [thread-0] 00:00:02.000000000 [debug] [testnode~11.0.0.1] [thread_preload.c:239] [_threadpreload_waitForNextEvent] received shim_event 3
00:00:00.011594 [thread-0] 00:00:02.000000000 [debug] [testnode~11.0.0.1] [syscall_handler.c:103] [_syscallhandler_pre_syscall] SYSCALL_HANDLER_PRE(testcpp,pid=1000): handling syscall 228 clock_gettime
00:00:00.011607 [thread-0] 00:00:02.000000000 [debug] [testnode~11.0.0.1] [time.c:84] [syscallhandler_clock_gettime] syscallhandler_clock_gettime with 0 0x7ffe86f62e30
00:00:00.011614 [thread-0] 00:00:02.000000000 [debug] [testnode~11.0.0.1] [syscall_handler.c:118] [_syscallhandler_post_syscall] SYSCALL_HANDLER_POST(testcpp,pid=1000): syscall 228 clock_gettime result: state=DONE code=0
Tail of file.shadow-preload.dastnode/testnode.testfile.1000.stderr:
00:00:00.014303 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.014337 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 6 on 7
00:00:00.014347 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.014354 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.014363 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 4 on 7
file.shadow-preload.data/hosts/testnode/testnode.testfile.1000.stdout:
/file/newfile: OK
/file/open_close: OK
/file/write: OK
/file/read: OK
/file/fwrite: OK
/file/fread: OK
/file/iov:
The cpp test looks like the test program finished successfully, but then shadow hung.
$ cat cpp.shadow-preload.data/hosts/testnode/testnode.testcpp.1000.stderr
00:00:00.010395 [00:00:00.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/shim.c:119] [_shim_load] waiting for event on 7
00:00:00.010446 [00:00:00.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.010455 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/shim.c:122] [_shim_load] starting main
00:00:00.010558 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:159] [clock_gettime] Making interposed syscall clock_gettime
00:00:00.010565 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:45] [_shadow_syscall_event] sending syscall event 228 on 7
00:00:00.010570 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:184] [send] Making real syscall sendto
00:00:00.010581 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.010586 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.010775 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 7 on 7
00:00:00.010801 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:176] [open_explicit] Making real syscall open
00:00:00.010820 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:160] [close] Making real syscall close
00:00:00.010835 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:184] [send] Making real syscall sendto
00:00:00.010845 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.010850 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.010856 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 6 on 7
00:00:00.010860 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.010864 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.010869 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 4 on 7
00:00:00.010874 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:174] [nanosleep] Making interposed syscall nanosleep
00:00:00.010894 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:45] [_shadow_syscall_event] sending syscall event 35 on 7
00:00:00.011160 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:184] [send] Making real syscall sendto
00:00:00.011174 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.011182 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.011208 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 5 on 7
00:00:00.011218 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:184] [send] Making real syscall sendto
00:00:00.011230 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.011241 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.011369 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 5 on 7
00:00:00.011380 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:184] [send] Making real syscall sendto
00:00:00.011394 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.011402 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.011516 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 6 on 7
00:00:00.011526 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.011533 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.011541 [00:00:01.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 4 on 7
00:00:00.011548 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:159] [clock_gettime] Making interposed syscall clock_gettime
00:00:00.011555 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:45] [_shadow_syscall_event] sending syscall event 228 on 7
00:00:00.011561 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:184] [send] Making real syscall sendto
00:00:00.011576 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.011583 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.011637 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 7 on 7
00:00:00.011649 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:184] [send] Making real syscall sendto
00:00:00.011659 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.011666 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.011675 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 6 on 7
00:00:00.011680 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:50] [_shadow_syscall_event] waiting for event on 7
00:00:00.011684 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:182] [recv] Making real syscall recvfrom
00:00:00.011700 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:53] [_shadow_syscall_event] got response of type 4 on 7
00:00:04.973441 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/shim.c:135] [_shim_unload] sending stop event on 7
00:00:04.973462 [00:00:02.000000000] [shd-shim] [debug] [/home/jnewsome/projects/shadow/src/shim/preload_syscalls.c:184] [send] Making real syscall sendto
$ cat cpp.shadow-preload.data/hosts/testnode/testnode.testcpp.1000.stdout
########## cpp test starting ##########
########## running test: _test_init()
hey 5
hey 0x562b626e2020 0x562b626e2020
test 2 4
test 0x562b626e2068 0x562b626e2068
local test 2 4
local test 0x7ffe86f62e80 0x7ffe86f62e80
########## running test: _test_endl()
Line 1
Line 2
Line 3
Line 1
Line 2
Line 3
Line 1
Line 2
Line 3
########## running test: _test_time()
########## cpp test passed! ##########
Unfortunately it seems whatever ctest uses to kill timed out tests doesn't generate core files, (despite running with ulimit -c unlimited
).
One thing I noticed when trying to reproduce is that I could only do so while running multiple tests in parallel. e.g. I was able to run file-shadow-preload by itself 1000x successfully.
Hmm, actually I was able to reproduce in ptrace tests as well. (A previous attempt running all shadow tests didn't repro).
ctest -R shadow-ptrace -j8 --repeat-until-fail 1000 --output-on-failure --timeout 1
56% tests passed, 4 tests failed out of 9
Total Test time (real) = 151.35 sec
The following tests FAILED:
14 - file-shadow-ptrace (Timeout)
20 - shutdown-shadow-ptrace (Timeout)
29 - udp-shadow-ptrace (Timeout)
32 - unistd-shadow-ptrace (Timeout)
Errors while running CTest
ptrace test log looks similar:
file-shadow-ptrace:
00:00:00.017771 [thread-0] 00:00:01.000000000 [debug] [testnode~11.0.0.1] [syscall_handler.c:118] [_syscallhandler_post_syscall] SYSCALL_HANDLER_POST(testfile,pid=1000): syscall 3 close result: state=NATIVE code=0
00:00:00.017780 [thread-0] 00:00:01.000000000 [debug] [testnode~11.0.0.1] [thread_ptrace.c:462] [threadptrace_resume] THREAD_PTRACE_CHILD_STATE_SYSCALL_POST
00:00:00.017790 [thread-0] 00:00:01.000000000 [debug] [testnode~11.0.0.1] [thread_ptrace.c:438] [threadptrace_resume] THREAD_PTRACE_CHILD_STATE_SYSCALL_PRE
00:00:00.017793 [thread-0] 00:00:01.000000000 [debug] [testnode~11.0.0.1] [syscall_handler.c:183] [syscallhandler_make_syscall] native syscall 257 openat
<end of output>
Test time = 1.01 sec
udp-shadow-ptrace
00:00:00.030622 [thread-0] 00:00:01.000000000 [message] [localnode~11.0.0.1] [process.c:249] [_process_start] process 'localnode.testudp.1000' started in 0.017668 seconds
00:00:00.030630 [thread-0] 00:00:01.000000000 [message] [localnode~11.0.0.1] [process.c:175] [_process_check] process 'localnode.testudp.1000' has completed or is otherwise no longer running
00:00:00.030636 [thread-0] 00:00:01.000000000 [message] [localnode~11.0.0.1] [process.c:147] [_process_logReturnCode] main success code '0' for process 'localnode.testudp.1000'
00:00:00.030644 [thread-0] 00:00:01.000000000 [message] [localnode~11.0.0.1] [process.c:183] [_process_check] total runtime for process 'localnode.testudp.1000' was 0.017668 seconds
<end of output>
Test time = 4.60 sec
This is a bit stale; I think some of the related issues have been fixed. Closing in favor of #1252
The shadow-preload tests are a bit flaky. This is masked a bit in our CI because we automatically retry failed tests once, and allow the workflow to succeed if they pass on the second try.
I've seen these failures occasionally in the CI, but don't have a pointer handy to one. I was able to reproduce locally with:
Output:
Description: Ubuntu 18.04.4 LTS
Linux vusiri 5.3.0-53-generic #47~18.04.1-Ubuntu SMP Thu May 7 13:10:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Shadow v1.13.2-883-ged221d07-dirty 2020-06-05 (built 2020-06-05) running GLib v2.56.4 and IGraph v0.7.1