BurntSushi / quickcheck

Automated property based testing for Rust (with shrinking).
The Unlicense
2.4k stars 149 forks source link

Infinite loop detection #176

Closed Michael-F-Bryan closed 6 years ago

Michael-F-Bryan commented 7 years ago

Quickcheck seems to have found an infinite loop somewhere in my program but instead of detecting that a particular run takes significantly longer than it should, the test just times out.

For now I've worked around this by putting a print statement at the top and running tests with cargo test -- --nocapture, but it'd be nice if quickcheck had a mechanism for detecting possible infinite loops...

BurntSushi commented 7 years ago

had a mechanism for detecting possible infinite loops...

I don't see how that's possible.

Michael-F-Bryan commented 7 years ago

Is there a way to add a timeout and then fail a test (maybe by throwing a panic) if it takes longer than a "reasonable" amount of time? e.g. 5 minutes.

You might then be able to make it configurable by adding some sort of timeout = 5m entry to the quickcheck!() macro.

BurntSushi commented 7 years ago

Is there a way to add a timeout and then fail a test (maybe by throwing a panic) if it takes longer than a "reasonable" amount of time? e.g. 5 minutes.

Again: I don't see how that's possible. Can you describe the implementation path you're thinking?

The only thing I can think of is if quickcheck ran every single test in a distinct process, then it could kill the process, but there are so many issues with that path that I can easily reject it out of hand.

Michael-F-Bryan commented 7 years ago

I'm not 100% sure. One way I can think of might be to use a pair of futures where one runs the test on a background thread and the other is a simple timeout. Then you could select on the two and if the first future to finish is the timeout you know the test ran for too long and can say the test failed.

Or is that going to get a bit too complicated?

BurntSushi commented 7 years ago

That doesn't stop the other thread from running. And it implies running every test in a separate thread, which is not the case today.

I think you should search for other solutions. I actually think printf debugging is a fine solution. What don't you like about it?

Michael-F-Bryan commented 7 years ago

Debugging with print statements is fine when you've got a human at the keyboard, but I imagine something like travis or circle-ci would blindly keep running the test until some global timeout occurs. That said, when I was running the tests on my laptop it was easy to see I'd found an infinite loop and then add the print statements to figure out what the problem was. Likewise if it was running on some ci service you'd see the failed build due to a timeout and be able to investigate. So I guess in that sense quickcheck did its job of finding poorly handled edge cases perfectly.

I know there's a way to hook up signals and alarms in python (example), however implementing that in Rust in a cross-platform, multithreaded way would probably be extremely hacky, if it's possible at all.

BurntSushi commented 7 years ago

The only way to interrupt running code is to quit the process. Technically, there may be platform specific ways to kill individual threads forcefully, but they are pretty universally shunned as bad juju.

Popping up a level, why even target quickcheck for this? Why not unit tests in general? You could have an infinite loop anywhere, and that unit test won't time out even if quickcheck added this functionality.

AFAIK, there are no sufficiently general solutions to this problem that I would be happy with adding to quickcheck. The simplest thing would be something like you said (although you don't need futures, channels would work), but it would necessarily have to quit the process and would probably entail some overhead for each test that runs. (This morally equivalent to the Python code you linked.)

If your CI will time out, then how is that any different than what quickcheck could do?

Michael-F-Bryan commented 7 years ago

I was thinking quickcheck might be able to tell you which set of inputs can lead to significantly longer run times than usual, and then be able to narrow down those inputs to the smallest example. So it could extend the existing error checking to also check for cases with pathological run times.

In my case, quickcheck managed to find a set of inputs which I never would have thought to try when writing unit tests. It found a funny bug in my parser where it would peek at a token instead of popping, so the parser would continually think there was 1 token left in the stream and never finish.

The more I search on the internet for ways of detecting long run times and then killing the test prematurely (then doing the usual reducing to the minimal case), the less it sounds like such a feature would be possible. I know Linux has [pthread_kill()][pt] but like you said, using something like that is probably frowned upon and not something you'd be happy including as a feature in quickcheck. If it even works at all.

It sounds like the best way of finding edge conditions which seem to go into infinite loops would be to wait until quickcheck discovers them and then investigate manually. As such, adding a timeout feature or detecting unusually long run times for a particular set of inputs is probably outside the scope of this crate.