Implement function(s) to find out commits leading to a significant performance decrease

analyticalmonk commented 8 years ago

Follows up on #13. Idea similar to what asv implements.

Reference: asv: finding a commit that produces a large regression

analyticalmonk commented 8 years ago

Below are two plots obtained using the function plot_bottlenecks().

Rperform::plot_bottlenecks(test_path = "tests/testthat/test-dup.r", num_commits = 10, benchmark = 2, metric = "time", threshold = 0.1)

[Plot: rperform_plotbneck1]

Rperform::plot_bottlenecks(test_path = "tests/testthat/test-interp.r", num_commits = 5, benchmark = 2, metric = "memory", threshold = 0.1)

[Plot: rperform_bneck2]

The idea is to take a test file's metric (runtime or memory) measured on one version (commit) as a baseline and compare the other versions against it.

Mechanism for setting a benchmark value:

The benchmark parameter decides which commit sets the benchmark: 1 for the first, 2 for the second, and so on.
The mean of the three metric values obtained for the benchmark commit is computed for each test.
The threshold parameter decides how much deviation from the benchmark we allow. For instance, if threshold is 0.1, a metric measurement is classified as a bottleneck only if it exceeds 1.1 times the benchmark value.
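The three steps above can be sketched in base R as follows. This is only an illustration of the rule, not the code inside plot_bottlenecks(): the function name flag_exceeding, the column names and the measurement values are all made up for the example.

```r
# Hypothetical data: three runtime measurements (seconds) per commit.
# Column names and values are invented for illustration.
measurements <- data.frame(
  commit = rep(c("c1", "c2", "c3"), each = 3),
  value  = c(1.00, 1.10, 0.90,   # c1
             1.00, 1.05, 1.00,   # c2
             1.30, 1.05, 1.25),  # c3
  stringsAsFactors = FALSE
)

# Hypothetical helper: flag every measurement that exceeds
# (1 + threshold) * mean(benchmark commit's measurements).
flag_exceeding <- function(df, benchmark = 1, threshold = 0.1) {
  commits <- unique(df$commit)                    # commits in order of appearance
  benchmark_commit <- commits[benchmark]          # 1 = first, 2 = second, ...
  benchmark_value <- mean(df$value[df$commit == benchmark_commit])
  cutoff <- (1 + threshold) * benchmark_value
  df$exceeds <- df$value > cutoff
  attr(df, "cutoff") <- cutoff                    # keep the cutoff for plotting
  df
}

flagged <- flag_exceeding(measurements, benchmark = 2, threshold = 0.1)
```

Here the second commit is the benchmark, so the cutoff is 1.1 times the mean of its three values, and only the two largest measurements of c3 are flagged.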

How to interpret the above plots: The horizontal line represents the threshold value, i.e. (1+threshold)*benchmark_value, for each respective test. The blue dots represent the values which exceeded this threshold, and the red ones those which didn't. If, say, just one of the three values for a particular commit exceeds the threshold, we can choose to neglect it. But if two or all three values exceed the threshold for a particular commit, then that might indeed be a performance bottleneck!
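The "two out of three" reading of the plot could also be applied programmatically. Below is a minimal base R sketch, assuming a data frame with one logical flag per measurement; the names flags, min_hits and suspect_commits are hypothetical and not part of Rperform.

```r
# Hypothetical per-measurement flags: TRUE where a value exceeded the threshold.
flags <- data.frame(
  commit  = rep(c("c1", "c2", "c3"), each = 3),
  exceeds = c(FALSE, FALSE, FALSE,   # c1: no run above the threshold
              TRUE,  FALSE, FALSE,   # c2: a single run, likely noise
              TRUE,  TRUE,  FALSE),  # c3: two of three runs
  stringsAsFactors = FALSE
)

# Count flagged measurements per commit, keeping the original commit order.
commit_order <- factor(flags$commit, levels = unique(flags$commit))
n_exceeding <- tapply(flags$exceeds, commit_order, sum)

# Keep only commits where at least `min_hits` runs exceeded the threshold.
min_hits <- 2
suspect_commits <- names(n_exceeding)[n_exceeding >= min_hits]
```

With these made-up flags only c3 is reported, since c2 has a single value above the threshold.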

@tdhock @joshuaulrich Any thoughts?

tdhock commented 8 years ago

It's a neat idea, but I would suggest using the term "outlier" rather than "bottleneck".

Also, there are many different ways to do outlier detection. I would definitely trust my visual interpretation of the plot more than a classification based on some arbitrary outlier detection parameters. But if I had to choose an outlier detection method, I would use an optimal change-point detection method such as changepoint::cpt.mean. That being said, I think it is a bit out of scope for this GSoC project, and I would encourage you to focus on the visualizations.
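As a rough illustration of the change-point idea, here is a base R sketch of the simplest case: a single change in mean, located by minimising the total within-segment squared error. It is not the changepoint package's implementation; the function name single_changepoint and the runtimes are hypothetical. Unlike a penalised method, it always returns a split, even when the series has no real change.

```r
# Hypothetical helper: find the split that minimises the total
# within-segment squared error, i.e. the best single change in mean.
single_changepoint <- function(x) {
  n <- length(x)
  stopifnot(n >= 2)
  cost <- function(seg) sum((seg - mean(seg))^2)
  split_costs <- vapply(
    seq_len(n - 1),
    function(k) cost(x[1:k]) + cost(x[(k + 1):n]),
    numeric(1)
  )
  which.min(split_costs)  # index of the last point before the change
}

# Hypothetical mean runtimes (seconds) for eight consecutive commits.
runtimes <- c(1.00, 1.02, 0.98, 1.01, 1.30, 1.28, 1.31, 1.29)
change_at <- single_changepoint(runtimes)

# With the changepoint package installed, changepoint::cpt.mean(runtimes)
# would be the packaged alternative (not run here).
```

For these made-up runtimes the split falls after the fourth commit, so the fifth commit would be the one to inspect.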

analyticalmonk commented 8 years ago

That does make sense. I was trying to implement the 'finding commits which cause regression' functionality, similar to what asv has. But I realize now that it would be better to let users perform the visual analysis themselves. Will close this issue for now.

Next, I will concentrate on making the plots interactive and on letting the user be more selective about the commits they want to analyze, since the functions currently only allow for sequential testing.

analyticalmonk / Rperform

Implement function(s) to find out commits leading to a significant performance decrease #18