google-research / torchsde

Differentiable SDE solvers with GPU support and efficient sensitivity analysis.
Apache License 2.0

Fixed BInterval test flakiness #76

Closed patrick-kidger closed 3 years ago

patrick-kidger commented 3 years ago

Fixes the flakiness by decreasing alpha.

Counting the number of times a p-value check is made, alpha=1e-4 gives a roughly 25% probability of at least one failure (under the null hypothesis). This is pretty high - it's no wonder we're seeing flakiness.
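As a rough sanity check on that figure, here is a minimal sketch of the calculation, assuming the checks are independent (which the tests may not strictly satisfy). The function names are made up for illustration and are not part of torchsde. The number of checks is not stated here, so the sketch backs it out from the 25% figure instead.

```python
import math


def failure_probability(alpha: float, num_checks: float) -> float:
    """P(at least one check fails under the null), assuming independent checks.

    Equals 1 - (1 - alpha) ** num_checks, computed via log1p/expm1 for accuracy
    when alpha is tiny.
    """
    return -math.expm1(num_checks * math.log1p(-alpha))


def implied_num_checks(alpha: float, failure_prob: float) -> float:
    """Number of independent checks that would produce the given failure probability."""
    return math.log1p(-failure_prob) / math.log1p(-alpha)


# Assumption: the 25% quoted for alpha=1e-4 comes from independent checks.
num_checks = implied_num_checks(alpha=1e-4, failure_prob=0.25)
print(f"implied number of checks: {num_checks:.0f}")

# The same number of checks at alpha=1e-5.
print(f"failure probability at alpha=1e-5: {failure_probability(1e-5, num_checks):.3f}")
```

Under this independence assumption the implied count is a few thousand checks, and the same count at alpha=1e-5 gives a failure probability of just under 3%.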

I haven't worked out the exact Bonferroni-type corrections, but reducing this to alpha=1e-5 brings the probability of getting a failure down to about 3% (under the null hypothesis), which is still pretty high but probably tolerable. Genuine failures have usually given pvals of 1e-180 or similar, so I don't think false negatives should be a serious issue.
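For reference, a Bonferroni-type correction could be sketched as below. This is an illustration, not what this PR does: `NUM_CHECKS` and `TARGET_FAILURE_PROB` are hypothetical values (the check count is a guess of a few thousand, and the 1% target is arbitrary), and the Šidák variant additionally assumes independent checks.

```python
import math

# Hypothetical values, chosen only for illustration.
NUM_CHECKS = 3000           # assumed number of p-value checks per test run
TARGET_FAILURE_PROB = 0.01  # desired P(at least one false failure) per run


def bonferroni_alpha(target: float, num_checks: int) -> float:
    """Per-check alpha from the union bound; valid without independence."""
    return target / num_checks


def sidak_alpha(target: float, num_checks: int) -> float:
    """Per-check alpha solving 1 - (1 - alpha) ** num_checks == target.

    Exact only if the checks are independent.
    """
    return -math.expm1(math.log1p(-target) / num_checks)


alpha_bonf = bonferroni_alpha(TARGET_FAILURE_PROB, NUM_CHECKS)
alpha_sidak = sidak_alpha(TARGET_FAILURE_PROB, NUM_CHECKS)
print(f"Bonferroni per-check alpha: {alpha_bonf:.3e}")
print(f"Sidak per-check alpha:      {alpha_sidak:.3e}")
```

Both corrections give nearly the same per-check threshold at these sizes. Since genuine failures reportedly produce pvals around 1e-180, a threshold this small would still catch them.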

I've also removed setting the pool size, as I think that's unnecessarily slowing things down.