Regression Test Selection in Ruby

Introduction

For a serious project, before we merge code into the main branch, we need to run all the tests in the project.

As the project grows, it will have more and more tests and the testing time will increase from seconds to minutes, even hours.

Most of the time, people integrate the test run into CI and execute the tests in parallel.

On my team, we use 45 instances to run the tests in a reasonable amount of time.

Because we have too many test cases to run during development, we have to submit code and rely on the slow CI.

Such a process is slow and disruptive.

Most pull requests affect only some of the test cases, and regression test selection (RTS) techniques can help us select those test cases.

In this article, I will briefly analyze our codebase and introduce some RTS methods. Then I will propose a possible solution for our use case.

Do we need to run all tests?

We have so many tests that running them all locally takes too much time, and I just want to get a rough picture.

So I chose to run only the tests in the spec/requests directory.

Ruby provides Coverage.peek_result to collect coverage info. We can call it before and after a test case; the difference between the two results is the code that the test case exercised. The details are described in @tenderlove's article[1], and the code is here[2], which is based on crystalball[3].
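As a rough sketch of the before/after idea (this is not the actual rts_rb or crystalball code), the snippet below takes two Coverage.peek_result snapshots around a block and keeps the lines whose hit count grew. The helper names coverage_diff and lines_touched_by, and the greet/unused methods written to a temp file, are made up for illustration; a real setup would presumably wrap each RSpec example rather than a plain block.

```ruby
require 'coverage'
require 'tempfile'

# Returns, per file, the 1-based line numbers whose hit count grew
# between two Coverage.peek_result snapshots (legacy array format).
def coverage_diff(before, after)
  after.each_with_object({}) do |(path, after_counts), diff|
    before_counts = before[path] || []
    grown = after_counts.each_index.select do |i|
      # nil marks lines that are not executable (blank lines, `end`, ...)
      after_counts[i] && after_counts[i] > (before_counts[i] || 0)
    end
    diff[path] = grown.map { |i| i + 1 } unless grown.empty?
  end
end

# Runs the block (a stand-in for one test case) and returns what it touched.
def lines_touched_by
  before = Coverage.peek_result
  yield
  coverage_diff(before, Coverage.peek_result)
end

# Coverage only tracks files loaded after it has been started.
Coverage.start unless Coverage.running?

# Hypothetical code under test, written to a temp file so it can be loaded.
source = Tempfile.new(['greeter', '.rb'])
source.write(<<~RUBY)
  def greet(name)
    'hello, ' + name
  end

  def unused
    'never ' + 'called'
  end
RUBY
source.close
load source.path

touched = lines_touched_by { greet('rts') }
# touched maps the temp file's path to [2]: only the body of greet ran.
```

Storing one such diff per test case gives the test-to-code map that the rest of this section relies on.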

After collecting the coverage info, I checked some recent PRs to see how many test cases they really affect.

The results are as follows:

From the results, we can see that most PRs do not edit many files, and most PRs do not affect many tests.

But there are some really popular files, such as authorize_helper.rb, which affects 1011 specs, or 21.9% of all request test cases.

Across all files, 90% of files affect less than 1% of the request tests; the p95 is 3% and the p99 is 10.1%.

So most PRs affect only a small subset of all the test cases.
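To make the measurement concrete, here is a minimal sketch of how the affected tests for a PR could be looked up once the per-test coverage is recorded. The coverage map, the file paths, and the helper names tests_by_file and affected_tests are all hypothetical; the real data comes from the coverage run described above.

```ruby
require 'set'

# Inverts a spec => files map into a file => specs index.
def tests_by_file(coverage_map)
  index = Hash.new { |hash, file| hash[file] = Set.new }
  coverage_map.each do |spec, files|
    files.each { |file| index[file] << spec }
  end
  index
end

# Specs that executed at least one of the changed files.
def affected_tests(index, changed_files)
  # fetch avoids creating empty entries for files no spec has touched
  changed_files.flat_map { |file| index.fetch(file, []).to_a }.uniq.sort
end

# Hypothetical coverage data: which files each request spec executed.
coverage_map = {
  'spec/requests/users_spec.rb'    => ['app/helpers/authorize_helper.rb', 'app/models/user.rb'],
  'spec/requests/orders_spec.rb'   => ['app/helpers/authorize_helper.rb', 'app/models/order.rb'],
  'spec/requests/health_spec.rb'   => ['app/controllers/health_controller.rb'],
  'spec/requests/products_spec.rb' => ['app/models/product.rb']
}

index = tests_by_file(coverage_map)

# A PR that only touches the shared helper affects two of the four specs.
affected = affected_tests(index, ['app/helpers/authorize_helper.rb'])
share = affected.size.fdiv(coverage_map.size) # => 0.5
```

Computing this share for every file in the index is one way to arrive at the kind of per-file distribution quoted above.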

RTS Techniques

Slicing Technique

Agrawal et al. introduced a family of test case selection techniques based on different program slicing techniques[4].

They are based on a simple observation:

If a statement is not executed under a test case, it cannot affect the program output for that test case.

@tenderlove's Predicting Test Failures[1] uses this technique.

But if the control flow graph of the program is altered, the technique is not safe.

For example, suppose we have a bar function,

def bar
  'bar'
end

and then we change it to

def bar
  'bar'
  'bar2'
end

This obviously affects the result, but it will not be detected, because the added statement was never executed when the coverage was recorded. Agrawal et al. provide additional methods to handle such cases, such as detecting the variables affected by the added statement or finding control dependences for a predicate statement[4][6].
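The blind spot can be reproduced with a simplified, line-level stand-in for the selection rule. This is only a sketch, not the algorithm from the paper: the recorded data, the spec names, and the select_tests helper are hypothetical, and changes are expressed as line numbers of the old file.

```ruby
require 'set'

# Selects the tests that executed at least one changed line of the old code.
def select_tests(executed, changed_old_lines)
  selected = executed.select do |_test, lines_by_file|
    changed_old_lines.any? do |file, lines|
      lines_by_file.fetch(file, Set.new).intersect?(lines)
    end
  end
  selected.keys
end

# Hypothetical recorded data for the old bar.rb:
# line 1 `def bar`, line 2 `'bar'`, line 3 `end`.
executed = {
  'bar_spec'   => { 'bar.rb' => Set[2] },
  'other_spec' => { 'other.rb' => Set[1, 2] }
}

# Editing line 2 of the old file is detected.
modified = { 'bar.rb' => Set[2] }
select_tests(executed, modified) # => ["bar_spec"]

# Inserting `'bar2'` after line 2 modifies no old line, so the set of
# changed old lines is empty and nothing is selected, although the
# return value of bar has changed.
inserted = { 'bar.rb' => Set[] }
select_tests(executed, inserted) # => []
```

A pure insertion leaves nothing for the rule to match against, which is why the extra analysis of the added statement is needed.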

An Unsafe Technique from Google

The Firewall Technique

Future Work and Additional Things

Find a fast way to get execution info

Report affected API/features to QA team

The QA team runs regression tests regularly.

If we can provide the affected APIs to QA, that should be helpful.

Recently, our team has been working on adding feature tags to APIs.

After we finish the tagging, we can map APIs to features and provide the affected features to QA too.

References

[1] Predicting Test Failures https://tenderlovemaking.com/2015/02/13/predicting-test-failues.html
[2] rts_rb https://github.com/yfractal/rts_rb
[3] Crystalball https://github.com/toptal/crystalball
[4] Hiralal Agrawal. Incremental Regression Testing, 1993
[5] Gregg Rothermel. Analyzing Regression Test Selection Techniques, 1996
[6] S. Yoo, M. Harman. Regression Testing Minimisation, Selection and Prioritisation: A Survey, 2007
