rjust / defects4j

A Database of Real Faults and an Experimental Infrastructure to Enable Controlled Experiments in Software Engineering Research
MIT License

Is it possible to include more than one or all the active bugs in one version of a project #413

Open ekincanufuktepe opened 3 years ago

ekincanufuktepe commented 3 years ago

I am not sure whether my attempt makes sense, but I want to include all the active bugs in a single version of a project. The reason is that I want to do regression test case prioritization based on the changes made in the project.

However, from my past experience with mutation testing, when mutants are isolated, some test cases kill them. But when the mutants are applied all at once, some of the previously killed mutants remain alive. I am afraid that I might experience the same thing here, which I believe is very likely.

So, I wanted to check whether I can introduce all the bugs into one project and, if so, how to do it, because from my understanding all the bugs are isolated. If this is possible, my follow-up question is: what would the base project be (the previous version before performing regression testing, in other words the version without the active bugs)?

Any guide would be helpful, and any suggestion is more than welcome!

rjust commented 3 years ago

Hi @ekincanufuktepe,

You probably can inject multiple active bugs in a single version of the project, but probably not all of them. I suspect that injecting multiple bugs into the most recent version is your best bet (e.g., injecting bugs Lang-2, Lang-3, etc. into the source code of Lang-1; make sure to check the timestamps of the bugs to identify the newest version). Since the bugs span multiple years of development, you may need to account for changes in the code base when applying a bug-inducing patch to a version other than the one it was generated from.
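A minimal sketch of how one might check which bugs can be injected into the newest version, assuming a local Defects4J checkout. The `defects4j checkout` invocation and `git apply --check` are standard commands. The patch location (`framework/projects/<project>/patches/<bug>.src.patch`) and the assumption that applying a patch reintroduces the bug are things to verify against your Defects4J version; if the patch direction is the opposite, `git apply -R` would be needed instead. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def plan_injection(d4j_home, project, base_bug, other_bugs, work_dir):
    """Build the commands for checking out the buggy version of base_bug
    and for dry-running every other bug's patch against that checkout."""
    # Assumed layout of the Defects4J repository; verify for your version.
    patches = Path(d4j_home) / "framework" / "projects" / project / "patches"
    checkout = ["defects4j", "checkout", "-p", project,
                "-v", f"{base_bug}b", "-w", str(work_dir)]
    # --check only reports whether the patch would apply; it changes no files.
    checks = {
        bug: ["git", "apply", "--check", str(patches / f"{bug}.src.patch")]
        for bug in other_bugs
    }
    return checkout, checks


def applicable_bugs(checkout, checks, work_dir):
    """Run the planned commands and return the bugs whose patch applies cleanly."""
    subprocess.run(checkout, check=True)
    return [bug for bug, cmd in checks.items()
            if subprocess.run(cmd, cwd=work_dir).returncode == 0]
```

For example, `plan_injection("/opt/d4j", "Lang", 1, [2, 3], "/tmp/lang")` plans a checkout of Lang-1 (buggy) and dry runs for the Lang-2 and Lang-3 patches. A patch that applies cleanly is only syntactically compatible; the combined version still has to compile, and each bug's triggering tests should still fail.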

Your question about the base project version is a good one. Since the injection of multiple bugs into a single version will create an artificial version of the program, I don't see how you could easily create a meaningful base version that somehow mimics the actual evolution of the project.

Depending on your research question and setup, it may be sufficient to evaluate your test-case prioritization technique on each bug in isolation. Maybe you can measure success as the probability of detecting each bug and then aggregate the results.
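One possible reading of this suggestion, as a sketch: evaluate the prioritized order for each bug separately, record whether a triggering test appears within a fixed test budget, and report the fraction of bugs detected. The data layout, the `budget` parameter, and the function names are assumptions made for illustration, not part of Defects4J.

```python
def first_failing_rank(order, triggering):
    """Return the 1-based position of the first triggering test in the
    prioritized order, or None if no triggering test is in the order."""
    for rank, test in enumerate(order, start=1):
        if test in triggering:
            return rank
    return None


def detection_rate(results, budget):
    """results maps a bug id to (prioritized order, set of triggering tests).
    Returns the fraction of bugs detected within the first `budget` tests."""
    hits = 0
    for order, triggering in results.values():
        rank = first_failing_rank(order, triggering)
        if rank is not None and rank <= budget:
            hits += 1
    return hits / len(results)


# Made-up example data: two bugs, each evaluated in isolation.
results = {
    "Lang-1": (["a", "b", "c"], {"b"}),  # detected by the 2nd test
    "Lang-2": (["x", "y", "z"], {"z"}),  # detected by the 3rd test
}
print(detection_rate(results, budget=2))  # 0.5
```

Because every bug is scored on its own buggy version, no artificial multi-bug version or shared base version is needed.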

Let me know if I misunderstood your question.

Best, René

jose commented 3 years ago

Hi @ekincanufuktepe,

Quick heads up, @djpaterson developed (in here) a combine command which allows one to "combine faults in a particular project version". Here is the set of faults that can be syntactically combined in a single version.

The combine command was developed in the context of the paper Using controlled numbers of real faults and mutants to empirically evaluate coverage-based test case prioritization, which sounds related to what you're studying.

-- Best, Jose

ekincanufuktepe commented 2 years ago

Hi @rjust and @jose

Thank you for the valuable information!

@rjust,

I think I found a paper titled An Empirical Study on the Use of Defect Prediction for Test Case Prioritization, of which I believe @jose is one of the authors. They also mention the difficulties of using APFD (Average Percentage of Faults Detected, a metric for measuring the effectiveness of a test case prioritization approach) with Defects4J. Therefore, they did something similar to what you suggested about isolating bugs, and measured the effectiveness of the test order on each bug independently. However, I have a follow-up question: would it make sense to use the parent commit as the base project when running my change-based test case prioritization technique? That would definitely not be test case prioritization for regression testing, though. What if I want to do it for regression testing? The best approach I can think of is to select two stable versions of the project by their timestamps and make sure the buggy commit date falls between those two timestamps. Is there an easier or smarter way to do this, or is this information already provided in Defects4J?
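For reference, a small sketch of the standard APFD formula, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of tests, m the number of faults, and TF_i the position of the first test that detects fault i. The function name and data layout are my own choices for illustration, and this is not necessarily how the paper computes it.

```python
def apfd(test_order, detects):
    """Compute APFD for a prioritized test order.

    test_order: list of test names, highest priority first.
    detects: dict mapping each fault to the set of tests that detect it.
    Every fault must be detected by at least one test in test_order,
    otherwise min() raises ValueError.
    """
    n = len(test_order)
    m = len(detects)
    position = {test: i for i, test in enumerate(test_order, start=1)}
    # Position of the first detecting test for each fault (TF_i).
    first = [min(position[t] for t in tests if t in position)
             for tests in detects.values()]
    return 1 - sum(first) / (n * m) + 1 / (2 * n)


# Single isolated bug whose only triggering test is t3.
print(apfd(["t1", "t2", "t3", "t4"], {"bug": {"t3"}}))  # 0.375
print(apfd(["t3", "t1", "t2", "t4"], {"bug": {"t3"}}))  # 0.875
```

With a single isolated bug (m = 1), the value depends only on where the first triggering test lands in the order, which is why per-bug evaluation reduces to looking at that position.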

@jose

This is definitely useful, thank you for sharing the paper and the repo! It is great that multiple faults can be combined in a single version. The only thing I couldn't find in the repo is which commits or versions are used as the base project (the bug-free version) into which the faults are combined. I am pretty sure they have such a base project.

Thanks a lot for the advice and guide!

Ekin