DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform

Fix flaky tests on ARM #2416

Open fhahn opened 7 years ago

fhahn commented 7 years ago

The following tests are currently flaky on some ARM/AArch32 hardware:

code_api|tool.histogram.offline 
code_api|linux.sigaction_nosignals 
code_api|linux.signal_race 
code_api|tool.drcacheoff.simple 
code_api|tool.histogram.gzip

fhahn commented 7 years ago

xref #2075

fhahn commented 6 years ago

I am not sure if histogram.offline fails because it is flaky. The problem seems to be a cmake error:

153/153 Test #146: code_api|tool.histogram.offline .................***Failed   90.12 sec
Running cmd |/home/jenkins-agent/workspace/DynamoRIO-AArch32-Precommit/test-run/build_debug-internal-32/bin32/drrun;-s;90;-quiet;-debug;-killpg;-stderr_mask;0xC;-dumpcore_mask;0;-code_api;-t;drcachesim;-offline;--;/home/jenkins-agent/workspace/DynamoRIO-AArch32-Precommit/test-run/build_debug-internal-32/suite/tests/bin/pthreads.ptsig|
CMake Error at /home/jenkins-agent/workspace/DynamoRIO-AArch32-Precommit/suite/tests/runmulti.cmake:106 (message):
  *** cmd failed (9): ***

Call Stack (most recent call first):
  /home/jenkins-agent/workspace/DynamoRIO-AArch32-Precommit/suite/tests/runmulti.cmake:115 (process_cmdline)

@derekbruening do you have any idea what the problem could be? Or how to go about debugging this? Maybe a package is missing or something?

derekbruening commented 6 years ago

We use CMake scripts for more than just configuring the build: we also use them to run some tests. runmulti.cmake is invoked at test time for this test, and it runs several command lines. It printed the command line that failed with exit code 9, which is strange, and nothing on stderr (which is printed after the colon), which is also strange.

The simplest way to debug these CMake scripts in general is "printf debugging" by adding message("foo"). Here, though, I would try running that exact command line directly (s/;/ /): can you reproduce the failure with exit code 9?
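For anyone who wants to script the s/;/ / step rather than edit the command by hand, here is a minimal Python sketch. It assumes the logged command is a plain ;-separated CMake list wrapped in | delimiters, as in the log above, and that no argument contains an escaped semicolon. The helper names cmake_list_to_argv and run_and_report are made up for illustration and are not part of DynamoRIO.

```python
import subprocess
import sys
from typing import List


def cmake_list_to_argv(cmd: str) -> List[str]:
    """Turn a ';'-separated command string from the test log into an argv list.

    Assumptions: the string may be wrapped in '|' delimiters (as printed in
    the log) and no argument contains an escaped semicolon.
    """
    return cmd.strip().strip("|").split(";")


def run_and_report(argv: List[str], timeout: float = 120.0) -> int:
    """Run argv directly (no shell) and print its exit status and stderr."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    # subprocess reports termination by signal N as a return code of -N.
    print("exit code:", proc.returncode)
    print("stderr:", proc.stderr if proc.stderr else "<empty>")
    return proc.returncode
```

Pasting the text between the | characters from the "Running cmd" line into cmake_list_to_argv and handing the result to run_and_report should show whether the exit code 9 reproduces outside of the CMake script, and whether anything is written to stderr.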

fhahn commented 6 years ago

code_api|pthreads.ptsig is flaky on ARM too.

derekbruening commented 6 years ago

linux.reset failed for #2938:

http://jenkins.dynamorio.org:8080/job/DynamoRIO-AArch32-Precommit/84/console

78/154 Test  #77: code_api|linux.reset ............................***Failed  Required regular expression not found.Regex=[^starting
done
$
]  2.50 sec
CMake Error at /home/jenkins-agent/workspace/DynamoRIO-AArch32-Precommit/suite/tests/runall.cmake:177 (message):
  *** kill failed (1): kill: (1868): No such process

  ***