need a way to resume from the TEST that crashed

GoogleCodeExporter commented 9 years ago

Requested by Timur Iskhodzhanov <timurrrr@google.com>:

Consider the case where we have a large number of tests in one executable
(e.g. Chromium unit_tests or ui_tests).

Use case:
One's local change has introduced a failure in a test X.Y which is
executed somewhere in the middle of the test run
(e.g. the whole binary takes 10 minutes, the failure happens after 5
minutes of running).

If he/she is sure the tests that are executed before X.Y are not
affected by the fix - he/she should be able to skip them and run only
X.Y and the following tests.

This feature will be especially useful for those who develop Valgrind-
like tools and test them using binaries based on googletest.
In this case it's quite common that some particular test X.Y fails
when run under the tool, and once it's fixed we should run only those
tests that follow X.Y.

By "failure" I meant a crash or non-googletest assertion.
So the execution stops when X.Y fails.

Example: Chromium code uses CHECKs inside its code and as soon as they
fail, it prints out the stack trace and aborts the execution.
Another example: if there is a bug in Valgrind-like tool, it may crash
in an internal assertion when running X.Y and stop the test execution.
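
A runner script first has to tell this kind of "failure" apart from an ordinary googletest failure. The sketch below is one way to do that from the exit status, not part of googletest. It assumes Python 3, a POSIX system (where `subprocess` reports death by signal N as return code -N), and a test binary whose `main()` returns the result of `RUN_ALL_TESTS()`; the function name `classify_exit` is made up for illustration.

```python
import signal


def classify_exit(returncode):
    """Map a subprocess return code to a coarse outcome label."""
    if returncode == 0:
        return "passed"
    if returncode == 1:
        # Conventional result when main() returns RUN_ALL_TESTS()
        # and at least one test failed a googletest assertion.
        return "test failures"
    if returncode < 0:
        # POSIX only: the process was killed by a signal (e.g. abort()).
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "crashed (%s)" % name
    # Anything else (custom exit codes, Windows crash codes, ...).
    return "abnormal exit (%d)" % returncode
```

A runner would call this with `subprocess.run([...]).returncode` and only look for a test to resume from when the label starts with "crashed".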

Original issue reported on code.google.com by zhanyong...@gmail.com on 7 Sep 2010 at 7:07

GoogleCodeExporter commented 9 years ago

I think I know another related feature which would be really cool to get.
In Chromium (and many more projects) we have ASSERT_* and CHECK/DCHECK macros.

ASSERT_* is from googletest - it stops the current test if the condition is not 
met but the next test continues.

CHECK is situated inside the Chromium code itself and when its condition is not 
met - things are really bad and usually the only correct solution is to crash 
and abort all the tests (e.g. memory may already have been corrupted).
We can tweak googletest so it automatically continues from the next test if 
some test crashes on a CHECK.

Original comment by timurrrr@google.com on 17 Sep 2010 at 2:53

GoogleCodeExporter commented 9 years ago

Bumping up the priority due to the level of interest.

Original comment by w...@google.com on 27 Sep 2010 at 6:19

Added labels: Priority-High
Removed labels: Priority-Medium

GoogleCodeExporter commented 9 years ago

Eric Fellheimer had a patch for catching premature exit in tests.  We can 
extend the idea to record the crashing test method name somewhere, such that a 
runner script can automatically use --gtest_start=FooTest.Bar+ to resume a 
crashed test program.  (The + sign after FooTest.Bar means to start *after*, 
instead of *at*, the specified test method.)
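
Note that --gtest_start is only the flag proposed here. Until something like it exists, a runner script could approximate it with the existing --gtest_filter flag. The sketch below assumes Python 3, that the ordered list of test names is already known (for example from --gtest_list_tests), and that tests run in their default order; `resume_filter` and `include_crashed` are illustrative names.

```python
def resume_filter(all_tests, crashed, include_crashed=False):
    """Build a --gtest_filter argument for the tests at or after `crashed`.

    include_crashed=False mimics the proposed "FooTest.Bar+" (start *after*),
    include_crashed=True mimics "FooTest.Bar" (start *at*).
    """
    index = all_tests.index(crashed)  # ValueError if the name is unknown
    start = index if include_crashed else index + 1
    remaining = all_tests[start:]
    if not remaining:
        return None  # the crashed test was the last one
    # --gtest_filter accepts a ':'-separated list of patterns.
    return "--gtest_filter=" + ":".join(remaining)
```

For a very large binary the resulting argument can get long, which is one reason a dedicated start flag would be more convenient.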

Original comment by w...@google.com on 6 Oct 2010 at 7:22

GoogleCodeExporter commented 9 years ago

hi, i believe this would be an excellent feature. 

if we are running all the unit tests and getting them to pass, it's annoying to 
fail near the end and have to start again from the beginning.

Original comment by mx2323@gmail.com on 2 Dec 2010 at 10:23

GoogleCodeExporter commented 9 years ago

Issue 342 has been merged into this issue.

Original comment by w...@google.com on 14 Dec 2010 at 11:41

GoogleCodeExporter commented 9 years ago

Check out this script that runs tests in subprocesses and resumes execution 
after a test crashes. It does modify the output slightly, but not that much. 
Let me know if you find it useful.
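
The attached script is not reproduced in this thread, so the following is not its code, only a minimal sketch of the general idea: run a batch, see how far it got, record the first test that did not complete as crashed, and continue with the rest. It assumes Python 3 and a hypothetical callback `run_batch(tests)` that launches the binary on the given tests and returns the ones that completed, in order.

```python
def run_crash_safe(all_tests, run_batch):
    """Run all tests, restarting after each crash; return the crashed tests.

    run_batch(tests) is a caller-supplied (hypothetical) function that runs
    the test binary on `tests` and returns the prefix that ran to completion.
    """
    remaining = list(all_tests)
    crashed = []
    while remaining:
        completed = run_batch(remaining)
        if len(completed) == len(remaining):
            break  # the whole batch finished without a crash
        # The first test that did not complete is taken to be the culprit.
        culprit = remaining[len(completed)]
        crashed.append(culprit)
        # Resume with everything after the crashed test.
        remaining = remaining[len(completed) + 1:]
    return crashed
```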

Original comment by vladlosev on 22 Dec 2010 at 10:07

Attachments:

crashsafe.py

GoogleCodeExporter commented 9 years ago

Another attempt at file upload.

Original comment by vladlosev on 22 Dec 2010 at 10:21

Attachments:

crashsafe.py

GoogleCodeExporter commented 9 years ago

The script looks good, but what exactly is done with the XML output in case of 
a crash?

Original comment by johannes...@googlemail.com on 3 Jan 2011 at 7:41

GoogleCodeExporter commented 9 years ago

XML output is not supported yet. The script will have to parse XML files 
produced by each run and combine them. In the case of a crash, the runner will 
need to re-run the affected batch of tests to get the correct results. If 
people find that useful, I will go ahead and implement that.
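
As a rough idea of what combining the per-run XML files could look like, here is a sketch in Python 3 using only the standard library. It assumes each report has a `<testsuites>` root with `tests`, `failures`, `disabled` and `errors` counters and `<testsuite>` children, which matches the googletest reports I have seen but should be checked against your version; `merge_reports` is an illustrative name.

```python
import xml.etree.ElementTree as ET

COUNTERS = ("tests", "failures", "disabled", "errors")


def merge_reports(xml_strings):
    """Merge several googletest-style XML reports into one <testsuites> element."""
    merged = ET.Element("testsuites", name="AllTests")
    totals = dict.fromkeys(COUNTERS, 0)
    for text in xml_strings:
        root = ET.fromstring(text)
        for key in COUNTERS:
            totals[key] += int(root.get(key, "0"))
        # Suites are appended as-is; a suite split across two runs will
        # appear twice and would need extra merging logic.
        for suite in root.findall("testsuite"):
            merged.append(suite)
    for key in COUNTERS:
        merged.set(key, str(totals[key]))
    return merged
```

The merged element can be written out with `ET.ElementTree(merged).write(path)`. A run that crashed may leave no XML file or an incomplete one, so the affected batch still has to be re-run as described above.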

Original comment by vladlosev on 13 Jan 2011 at 11:11

GoogleCodeExporter commented 9 years ago

I would be interested. But wouldn't my solution given in issue 342 be much easier 
to implement? ;)

Original comment by johannes...@googlemail.com on 13 Jan 2011 at 11:14

GoogleCodeExporter commented 9 years ago

Hi,
Can anyone please confirm that the script posted by vladlosev is the 
recommended way to handle crashes in cases like those mentioned above? Is it a part of 
the GTest project?
I am a newbie in using gtest as well as in Python. Can anyone please direct me 
to some resources which can help me figure out how to use this script?
Thanks for your help.

Original comment by himanshu...@gmail.com on 21 Jul 2011 at 10:52

GoogleCodeExporter commented 9 years ago

Please send issues or requests to
http://groups.google.com/group/googletestframework *instead of here*.
This issue tracker is not regularly monitored.

That said: Yes, the script is currently the best way to handle crashes and 
resume execution of the later unit tests. You can use it via "crashsafe.py 
binary_name [google test flags]". I personally would not consider it as part of 
the googletest framework as it is limited in use (no xml output supported) and 
not rigorously tested.

Original comment by dirk....@gmail.com on 22 Jul 2011 at 9:05

GoogleCodeExporter commented 9 years ago

I've created this to try out the approach. If there is enough interest from 
people in this script, it can be made part of the distribution.

Original comment by vladlosev on 24 Jul 2011 at 8:02

GoogleCodeExporter commented 9 years ago

For me it is still important to have XML output. I would prefer a solution that 
is directly integrated into the core and does not add a layer around it that 
can become another potential cause of errors.

Original comment by johannes...@googlemail.com on 24 Jul 2011 at 8:59

GoogleCodeExporter commented 9 years ago

For me, a good solution for this issue and issue 348 (timeout) would be really 
useful. I agree with johannes' comment: XML support is crucial, especially for 
everyone using googletest together with Hudson/Jenkins.

A solution in the core has two advantages:
1) It is easier to use. It might be a simple flag like --gtest_crash_safe. 
2) With an external solution, a lot of logic is duplicated. E.g. the output logic 
has to be kept in sync between the core and the external script. I think a solution in 
the core may be more complex in the first place but easier to maintain in the 
long term. A solution in the core might build on the infrastructure already in 
place for the death tests (forking into subprocesses, ...).

However, I understand the advantage of the script solution, too. The core is 
not touched at all, so there is no risk to break any existing system. 
Considering how important googletest is for a lot of people, such a risk-minimizing 
solution has its benefits.

I mention the issue 348 here because a solution for this issue probably 
directly leads to an easy solution for #348.

Original comment by dirk....@gmail.com on 24 Jul 2011 at 9:34

GoogleCodeExporter commented 9 years ago

I would prefer to have a core solution too, but the XML output is not a 
priority for me now. Perhaps integrating it with XML could be postponed? IMO 
implementing the core solution for the no-XML case should be pretty 
straightforward.

Original comment by timurrrr@google.com on 25 Jul 2011 at 8:58

GoogleCodeExporter commented 9 years ago

I wanted to chime in with some experience using an external python script as 
well. I do not want to say that an internal solution isn't needed but some 
information I might have could be useful.

We are using gtest to write unit tests (and unit-like tests) for an internal 
API which has a host or state that is set up once and torn down once per run. 
There is the possibility that the internal host could be in a corrupt state 
either because of the tests or bugs in the host. Sometimes this causes 
cascading failures where everything after a given test will fail. To alleviate 
the problem, we control the process from a Python script similar to the one 
suggested by vladlosev. The script:

1. Runs the exe with --gtest_list_tests and pipes the output to a file.
2. Reruns the exe with all the tests and dumps the output to another file.
3. Compares the printed output to the list of tests to determine 
the passing tests, failing tests, crashing tests, and tests that did not run. 
4. Suggests exact command lines to run the specific problematic tests.
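
Steps 1 and 3 above can be sketched roughly as follows. This is not the internal script described here, only an illustration in Python 3. It assumes the usual --gtest_list_tests layout (a suite name ending in "." followed by indented test names) and the usual `[ RUN      ]`, `[       OK ]` and `[  FAILED  ]` markers in the console output; the function names are made up.

```python
import re


def parse_test_list(listing):
    """Turn --gtest_list_tests output into ['Suite.Test', ...]."""
    tests, suite = [], None
    for line in listing.splitlines():
        line = line.split("#", 1)[0].rstrip()  # drop trailing comments
        if not line:
            continue
        if not line[0].isspace():
            # Suite lines end with '.', anything else (banners) is ignored.
            suite = line if line.endswith(".") else None
        elif suite is not None:
            tests.append(suite + line.strip())
    return tests


MARKER = re.compile(r"\[\s*(RUN|OK|FAILED)\s*\]\s+(\S+\.\S+)")


def classify(all_tests, run_output):
    """Sort tests into passed / failed / crashed / not_run from console output."""
    started, passed, failed = [], set(), set()
    for line in run_output.splitlines():
        match = MARKER.match(line)
        if not match:
            continue
        kind, name = match.group(1), match.group(2).rstrip(",")
        if kind == "RUN":
            started.append(name)
        elif kind == "OK":
            passed.add(name)
        else:
            failed.add(name)
    finished = passed | failed
    return {
        "passed": sorted(passed),
        "failed": sorted(failed),
        # Started but never reported a result: the process died in this test.
        "crashed": [t for t in started if t not in finished],
        "not_run": [t for t in all_tests if t not in started],
    }
```

For step 4, each name in "failed" or "crashed" can be turned into a command line of the form `Tests.exe --gtest_filter=<name>`.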

By doing this we can verify if tests passed, failed, or if one crashed and 
never finished. Part of this might be a workaround for not being able to resume 
tests on a crash. However, in our environment, even if a test crashed and gtest 
was able to go on, there is always the cascading failure problem, which may or 
may not be because the internal state of the host was screwed up. In that case we 
can rerun the tests, using --gtest_filter to skip the ones we already completed 
as well as the first one that failed. Having a solution in gtest most 
likely will not fix that. Then again, you could say that in our environment the 
tests are not independent, in that they all use the same internal 
state of the host. That is true, but we are using gtest for more than just 
unit tests.  

Another thing we do is leak detection. We have an internal tool that patches 
all new and malloc calls so that they are redirected, and keeps a stack trace of each one. Then 
we can dump those listings based on number of occurrences or size. One 
interesting thing that came out of using the python script to control 
everything was the ability to control the tests in a way to run a test multiple 
times and look at what was potentially leaked. The strategy then turns into:

1. Run the exe with --gtest_list_tests and pipe to a file to get list of 
available tests.
2. Iterate over the test names and call the leak detection tool - basically 
launch the process once per test:

    Tests.exe --gtest_filter=Group1.TestName1 --gtest_repeat=5 --our_internal_memleak_flag > Group1.TestName1.leakresults.txt

3. Iterate over output files and examine for memleaks, again suggest command 
lines for developers to reproduce leaks.
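
Step 2 of this strategy amounts to generating one command line per test. A minimal Python 3 sketch, under the assumption that the test names are already in a list: `--our_internal_memleak_flag` is the placeholder used in the example above, and `leak_check_commands` is an illustrative name.

```python
def leak_check_commands(binary, tests, repeat=5):
    """Return (argv, output_file) pairs, one process launch per test."""
    commands = []
    for name in tests:
        argv = [
            binary,
            "--gtest_filter=" + name,
            "--gtest_repeat=%d" % repeat,
            "--our_internal_memleak_flag",  # placeholder flag from the example
        ]
        # '/' appears in parameterized test names and is not file-name safe.
        out_file = name.replace("/", "_") + ".leakresults.txt"
        commands.append((argv, out_file))
    return commands
```

Each pair can then be run with `subprocess.call(argv, stdout=open(out_file, "w"))`, which keeps one results file per test as in the example.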

One artifact of this is caching: if a test causes something internally to 
be cached on the first run of the test, it will show up as leaked once (in theory) 
per N runs (N = 5 in the example above). To handle this, we reset the leak 
detection after the first pass of the tests and dump the output after running 
the remaining N-1 iterations. Since, as I mentioned, we do not reset the global 
state during a run, we are currently forced to end the 
process and restart it for the next test. Another reason is that you cannot 
currently call ::testing::InitGoogleTest multiple times per process (at least 
in 1.4.0), i.e. pass in a different --gtest_filter each time to run a specific 
test N times, then reset gtest and rerun without killing the process. 
This is rather time consuming but has produced consistent results per test 
without flagging caching that is working correctly.

Some of this is in response to the question of what you can really do after 
something fails. If it is a crashing test and you dumped out a listing of what 
passed and failed, doing this externally in a Python script gives you more 
control over how you handle it. The XML output is another thing. I think we started examining 
the printf() output because the XML results would not always be valid if the 
process crashed. 

So I guess what I am trying to say: I do not know if we could do what we are 
doing now without controlling the process externally. This may or may not be 
due to:

1. xml output provided by gtest was unreliable depending on what happened.
2. We could not call InitGoogleTest multiple times in the same process.
3. Internally there was no safe way to recover from a crashing test.
4. Internally we have no way of knowing "this test cannot run because of a 
preexisting invalid state- need to restart the exe and clear state".

I hope this is helpful to some devs. I think controlling the exe externally 
gives you the most flexibility but takes some massaging to get exactly what you 
and your team need.

Original comment by kevin.ny...@gmail.com on 25 Jul 2011 at 2:10

GoogleCodeExporter commented 9 years ago

Hi everyone,
I tried to run the crashsafe.py file available above on my system, but I get 
the following error:

---------------------------------------------------------------------------
F:\Tree\stream_sdk\opencl\internal\OVDecodeTest>crashsafe.py
  File "F:\Tree\stream_sdk\opencl\internal\OVDecodeTest\crashsafe.py", line 184
    current_test_case = None
                           ^
TabError: inconsistent use of tabs and spaces in indentation
---------------------------------------------------------------------------
This is my first encounter with Python and my timeline doesn't permit me to 
dive into learning Python. So I would be very obliged if anyone can suggest a 
way to fix this. Thanks for your support.

Original comment by himanshu...@gmail.com on 12 Aug 2011 at 7:49

GoogleCodeExporter commented 9 years ago

The fixed .py file is attached.

Next time you see such an error - just replace tabs in the .py file with "  " 
(= 2 spaces) or "    " (= 4 spaces)
// depends on how many spaces per indent are used in the file
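
If doing this by hand is error-prone, the replacement can be automated. The sketch below (Python 3, standard library only) expands tabs in the leading indentation of each line and leaves tabs elsewhere untouched. The right `width` depends on the file, as noted above; `expand_leading_tabs` is an illustrative name.

```python
def expand_leading_tabs(source, width=4):
    """Replace tabs in each line's leading whitespace with spaces."""
    fixed = []
    for line in source.splitlines(True):  # keep line endings
        body = line.lstrip(" \t")
        indent = line[:len(line) - len(body)]
        # expandtabs() pads each tab up to the next multiple of `width`.
        fixed.append(indent.expandtabs(width) + body)
    return "".join(fixed)
```

To apply it, read the script into a string, pass it through the function, and write the result back to the file.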

Original comment by timurrrr@google.com on 12 Aug 2011 at 8:23

Attachments:

crashsafe.py

GoogleCodeExporter commented 9 years ago

I am getting the same error (see below) even after replacing tabs with 2 or 4 spaces.  
Really waiting for a solution.

--------------------------------------------------------------
  File "F:\users\Naganna\trees\streamsdk\opencl\internal\OVDecodeTest\OVDTest\build\debug\x86\crashsafe.py", line 184
    current_test_case = None
                           ^
TabError: inconsistent use of tabs and spaces in indentation
--------------------------------------------------------------

Original comment by naganna...@gmail.com on 29 Sep 2011 at 10:55

GoogleCodeExporter commented 9 years ago

Here is a version with the correct whitespace.

Original comment by vladlosev on 2 Oct 2011 at 6:09

Attachments:

crashsafe.py

GoogleCodeExporter commented 9 years ago

vladlosev,

The latest one fixes the whitespace, but I am facing two more issues.

I got the following error:

-------------------------------------------------------------
F:\users\Naganna\trees\streamsdk\dist\dashboard\samples\opencl\bin>"crashsafe 
.py"
  File "F:\users\Naganna\trees\streamsdk\dist\dashboard\samples\opencl\bin\crashsafe .py", line 309
    print '%s: Unable to obtain test list from %s' % (argv[0], binary_name)
                                                 ^
SyntaxError: invalid syntax
------------------------------------------------------------------------

After commenting out line 309, running my tests gives the following error:

---------------------------------------------------------------------

F:\users\Naganna\trees\streamsdk\dist\dashboard\samples\opencl\bin>crashsafe.py 
Test.exe
Traceback (most recent call last):
  File "F:\users\Naganna\trees\streamsdk\dist\dashboard\samples\opencl\bin\crashsafe.py", line 314, in <module>
    main(sys.argv)
  File "F:\users\Naganna\trees\streamsdk\dist\dashboard\samples\opencl\bin\crashsafe.py", line 303, in main
    tests = GetTestList(binary_name, gtest_filter)
  File "F:\users\Naganna\trees\streamsdk\dist\dashboard\samples\opencl\bin\crashsafe.py", line 116, in GetTestList
    for line in output.split('\n'):
TypeError: Type str doesn't support the buffer API
----------------------------------------------------------------------

Please help me to fix the problem.

Original comment by naganna...@gmail.com on 5 Oct 2011 at 5:37

GoogleCodeExporter commented 9 years ago

My Python version is as follows:

Python 3.2.2 (default, Sep  4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)] on 
win32

Original comment by naganna...@gmail.com on 5 Oct 2011 at 5:39

GoogleCodeExporter commented 9 years ago

Please run this script with Python 2.4 - 2.7.
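
For background: both errors reported above are typical of running Python 2 code under Python 3, where `print` is a function rather than a statement and a subprocess pipe yields `bytes` instead of `str`. The script itself is not shown here, so the following is not a patch for it, only a sketch of how the test-list step could be written to work under Python 3; `run_and_get_lines` is an illustrative name.

```python
import subprocess


def run_and_get_lines(argv):
    """Run a command and return its stdout as a list of text lines."""
    output = subprocess.Popen(argv, stdout=subprocess.PIPE).communicate()[0]
    if isinstance(output, bytes):
        # Python 3 returns bytes from the pipe; decode before splitting,
        # since bytes.split('\n') raises TypeError.
        output = output.decode("utf-8", "replace")
    return output.splitlines()
```

A call such as `run_and_get_lines(["Test.exe", "--gtest_list_tests"])` would then return text lines that can be split and compared as ordinary strings.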

Original comment by vladlosev on 5 Oct 2011 at 6:12

GoogleCodeExporter commented 9 years ago

Thank you for giving the version details. I am facing a new problem now after 
installing 2.6.6. 

The error is as follows:

crashsafe.py: Unable to obtain test list from Test.exe

It works fine if Test.exe is run independently.

Could you please help me to fix the problem?

Original comment by naganna...@gmail.com on 5 Oct 2011 at 6:53

GoogleCodeExporter commented 9 years ago

You are most likely trying to use this script with a binary that doesn't 
contain Google Test based tests.

Original comment by vladlosev on 5 Oct 2011 at 6:01

GoogleCodeExporter commented 9 years ago

I have a few Google Test based tests.

I am able to run Test.exe with different options without any problem.

Original comment by naganna...@gmail.com on 6 Oct 2011 at 6:24

fovecifer / googletest

need a way to resume from the TEST that crashed #311