Closed DizietAsahi closed 4 years ago
Hi @DizietAsahi , thanks for the suggestion. I'll see what refactoring needs to be done, otherwise happy to accept a PR from you.
Ideally, the Lq-RT tests should utilise the efficient bootstrapping already used here; this performance deficit suggests they don't. I wonder if it might be worth refactoring the original lqrt package such that the test results are derived from the bootstraps already generated as part of other calculations.
Looping @adam2392 in.
For v0.2.9, I will implement approximate permutation tests (aka Monte Carlo permutation tests) as the default reported test when dabest_object.mean_diff is called.
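For reference, a Monte Carlo permutation test for a difference of means can be sketched in a few lines of NumPy. This is a minimal illustration of the technique, not dabest's actual implementation:

```python
import numpy as np

def permutation_test_mean_diff(x, y, n_resamples=5000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference of means.

    Pools the two samples, reshuffles the group labels n_resamples times,
    and reports the fraction of shuffles whose absolute mean difference is
    at least as large as the observed one.
    """
    rng = np.random.default_rng(seed)
    observed = np.mean(x) - np.mean(y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_resamples):
        perm = rng.permutation(pooled)
        diff = np.mean(perm[:len(x)]) - np.mean(perm[len(x):])
        if abs(diff) >= abs(observed):
            count += 1
    # add-one smoothing so the reported p-value is never exactly zero
    return (count + 1) / (n_resamples + 1)
```

Because only shuffles and means are involved, the cost scales linearly with the number of resamples, which is easy to vectorise further if needed.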
I agree with @DizietAsahi that with high Ns, or with several pairwise comparisons, the Lq-RT Python implementation bogs down the computation time unnecessarily. (As seen in the example above, there is a 2x increase in computation time; with 6 pairwise comparisons, a 10x increase.)
Therefore, I am strongly inclined to remove the Lq-RT feature in v0.2.9 because:

- I don't fully comprehend the algorithm behind the lqrt code;
- Lq-RT tests can still be used together with the dabest package, with a little bit of custom code, to output the desired statistics in parallel. @adam2392, feel free to DM me if you need help with that.
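That custom code could look something like the sketch below. It assumes the lqrt package exposes a two-sample test named `lqrtest_ind` (hypothetical if the actual API differs), and falls back to a plain Welch t-statistic when lqrt is not installed, so the sketch still runs either way:

```python
import numpy as np

def lq_rt_alongside(x, y):
    """Run an Lq-RT test on the same arrays dabest analyses.

    Assumes the `lqrt` package provides `lqrtest_ind` returning a
    result with .statistic and .pvalue; if lqrt is unavailable, a
    Welch t-statistic is returned instead (with no p-value claimed).
    """
    try:
        import lqrt  # assumed API: lqrt.lqrtest_ind(x, y)
        res = lqrt.lqrtest_ind(x, y)
        return res.statistic, res.pvalue
    except ImportError:
        # fallback: Welch t-statistic computed directly in NumPy
        nx, ny = len(x), len(y)
        t = (np.mean(x) - np.mean(y)) / np.sqrt(
            np.var(x, ddof=1) / nx + np.var(y, ddof=1) / ny)
        return t, None
```

You would call this on the same columns passed to `dabest.load`, keeping the Lq-RT result out of dabest's own (now faster) computation.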
Okay, that sounds fine, especially given that there are most likely some optimizations to be done.
> Lq-RT tests can still be used together with the dabest package, with a little bit of custom code, to output the desired statistics in parallel. @adam2392, feel free to DM me if you need help on that.
Yes this would be helpful. Can you show me how to do that?
I would imagine we could even build an API for adding any additional "hypothesis test" via a lambda function?
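One hypothetical shape for that lambda-based API (every name below is invented for illustration; dabest has no such hook at the time of this thread): the analysis step accepts a mapping from a label to any callable taking the control and test arrays:

```python
import numpy as np

def run_custom_tests(data, idx, tests):
    """Hypothetical hook: apply user-supplied two-sample tests.

    `data` maps group names to arrays, `idx` names the (control, test)
    pair, and `tests` maps a label to any callable of two arrays.
    """
    control, treatment = data[idx[0]], data[idx[1]]
    return {name: func(control, treatment) for name, func in tests.items()}

# usage: plug in any two-sample statistic as a lambda
data = {
    "Group1": np.random.default_rng(1234).normal(loc=0, size=1000),
    "Group2": np.random.default_rng(4321).normal(loc=1, size=1000),
}
results = run_custom_tests(
    data,
    idx=("Group1", "Group2"),
    tests={"mean_diff": lambda x, y: np.mean(y) - np.mean(x)},
)
```

The appeal of this design is that dabest would not need to maintain any particular test implementation (lqrt included); users supply whatever callable they trust.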
I can help improve the lqrt testing, perhaps in version 0.3.x+? I am a bit strapped for time right now.
I do think more robust statistics down the line would be helpful to everyone since the package is so nice to use :)
@adam2392 , I think a utility function can be bundled in for v0.2.9, stay tuned! Closing this for now.
I updated to v0.2.8 today, and I noticed my code is much slower than before. This seems to be related to the inclusion of the lqrt test in the results.

```python
import numpy as np
import pandas as pd
import dabest

np.random.seed(1234)
df = pd.DataFrame({'Group1': np.random.normal(loc=0, size=(1000,)),
                   'Group2': np.random.normal(loc=1, size=(1000,))})
test = dabest.load(df, idx=['Group1', 'Group2'])
%time print(test.mean_diff)
```