We currently provide t-test analysis tools for computing differences between runs. However, the t-test relies on statistical assumptions that must be satisfied. In particular, the distribution of the sample means for the sets of samples drawn must be normal.
A primary use case for Sailfish is detecting network response time regressions, and response times are not appropriately modelled by a normal distribution (a Gamma or Erlang distribution is probably a better model).
Therefore, a test that doesn't impose any assumption of normality, such as the MannWhitneyWilcoxonTest, may be a better choice. This is implemented in Accord, so adding it (and making it the default test) should be a straightforward change.
Docs on the Wilcoxon test: http://accord-framework.net/docs/html/T_Accord_Statistics_Testing_MannWhitneyWilcoxonTest.htm
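
To illustrate why a rank-based test sidesteps the normality assumption, here is a minimal, standalone Python sketch of a two-sided Mann-Whitney U test. It is not Accord's implementation and not Sailfish code: the function names (`average_ranks`, `mann_whitney_u`) and the response-time numbers are made up for illustration. It also uses the plain normal approximation, with no tie or continuity correction, whereas a library implementation may compute an exact p-value for small samples.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(baseline, candidate):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (u, p_value). Only the ordering of the pooled samples matters,
    so no distributional shape is assumed for the timings themselves.
    """
    n1, n2 = len(baseline), len(candidate)
    ranks = average_ranks(list(baseline) + list(candidate))
    rank_sum_1 = sum(ranks[:n1])
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p_value


# Hypothetical response times in milliseconds for two runs.
baseline = [101, 98, 105, 110, 97, 102, 99, 104]
candidate = [150, 142, 160, 155, 149, 158, 147, 162]

u, p = mann_whitney_u(baseline, candidate)
print(f"U = {u}, p = {p:.5f}")
```

Because the statistic is computed from ranks alone, a handful of very slow responses in a skewed sample shifts the result far less than it would shift a comparison of means.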