raphaelvallat / pingouin

Statistical package in Python based on Pandas
https://pingouin-stats.org/
GNU General Public License v3.0
1.65k stars, 140 forks

FloatingPointError: divide by zero encountered in double_scalars when calculating t test #412

Closed jankaWIS closed 6 months ago

jankaWIS commented 8 months ago

Hi,

I'm encountering an issue when performing a t-test with pingouin. In some cases the Bayes factor cannot be computed and the whole call crashes, so none of the other results are returned. Would it be possible to emit a warning, or return NaN for the Bayes factor, when it cannot be calculated? I'm attaching the code together with what scipy returns.

import scipy.stats  # import the stats submodule explicitly
import pingouin as pg
import numpy as np

# define data
random_group1 = np.random.default_rng(0).normal(0.618, 0.042258, 1000)
random_group2 = np.random.default_rng(1).normal(0.528, 0.04243779, 1000)

# perform tests
print(scipy.stats.ttest_ind(random_group1, random_group2, nan_policy='omit'))

print(pg.ttest(random_group1, random_group2, paired=False))

This produces the following output and error:

Ttest_indResult(statistic=48.538286790006495, pvalue=0.0)

---------------------------------------------------------------------------
FloatingPointError                        Traceback (most recent call last)
/var/folders/bx/tb4883l53hdd3zp2y0nyy_4m0000gp/T/ipykernel_93020/4147835011.py in <module>
      4 print(scipy.stats.ttest_ind(random_group1, random_group2, nan_policy='omit'))
      5 
----> 6 pg.ttest(random_group1, random_group2, paired=False)

~/anaconda3/lib/python3.8/site-packages/pingouin/parametric.py in ttest(x, y, paired, alternative, correction, r, confidence)
    294 
    295     # Bayes factor
--> 296     bf = bayesfactor_ttest(tval, nx, ny, paired=paired, alternative=alternative, r=r)
    297 
    298     # Create output dictionnary

~/anaconda3/lib/python3.8/site-packages/pingouin/bayesian.py in bayesfactor_ttest(t, nx, ny, paired, alternative, r)
    144     # JZS Bayes factor calculation: eq. 1 in Rouder et al. (2009)
    145     integr = quad(fun, 0, np.inf, args=(t, n, r, df))[0]
--> 146     bf10 = 1 / ((1 + t**2 / df)**(-(df + 1) / 2) / integr)
    147 
    148     # Tail

FloatingPointError: divide by zero encountered in double_scalars

Thanks.
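
For readers following the traceback, one plausible reading is that the first factor on line 146 underflows to zero in double precision, so the final `1 / (...)` divides by zero. The sketch below is a hedged illustration, not pingouin's code: it takes `t` from the scipy output above, assumes `df = nx + ny - 2` for two groups of 1000, and leaves out the integral `integr` entirely. The `guarded_reciprocal` helper is hypothetical and only shows the kind of NaN fallback asked for here.

```python
import math

# t is copied from the scipy output in the report.
# df = nx + ny - 2 is an assumption (two groups of 1000, no Welch correction).
t = 48.538286790006495
df = 1000 + 1000 - 2

# First factor of the expression on line 146 of bayesian.py.
numerator = (1 + t**2 / df) ** (-(df + 1) / 2)

# The same quantity on the log10 scale, which does not underflow.
log10_numerator = -(df + 1) / 2 * math.log10(1 + t**2 / df)


def guarded_reciprocal(value):
    """Hypothetical helper: return 1 / value, or NaN when value is zero."""
    return float("nan") if value == 0 else 1 / value


print(numerator)                      # the underflowed factor
print(log10_numerator)                # its magnitude on the log10 scale
print(guarded_reciprocal(numerator))  # NaN instead of a division error
```

The smallest positive double is about 5e-324, so any factor whose log10 falls well below -324 is rounded to zero. Working on the log scale, or checking for zero before dividing, avoids the crash.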

raphaelvallat commented 8 months ago

Hi,

Thanks for opening the issue. When I try to reproduce the example I only get a RuntimeWarning, not an error. Either way, this should be fixed by https://github.com/raphaelvallat/pingouin/pull/415
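
A note on why the same division can show up as a RuntimeWarning on one machine and a FloatingPointError on another: NumPy lets the caller choose how floating-point errors are handled. The thread does not say which settings the reporter used, so the sketch below is only an illustration of the mechanism, using NumPy's `np.errstate` context manager. The `reciprocal` function is a hypothetical stand-in for the failing expression.

```python
import warnings
import numpy as np


def reciprocal(x):
    """Return 1 / x using NumPy float64 scalar arithmetic."""
    return np.float64(1.0) / np.float64(x)


# divide="warn" (NumPy's default): a RuntimeWarning is emitted and the result is inf.
with np.errstate(divide="warn"), warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warned_result = reciprocal(0.0)

# divide="raise": the same division raises FloatingPointError instead.
with np.errstate(divide="raise"):
    try:
        reciprocal(0.0)
        raised = False
    except FloatingPointError:
        raised = True

print(warned_result, [w.category.__name__ for w in caught], raised)
```

If a script or notebook calls `np.seterr(all="raise")` somewhere earlier, every later divide-by-zero in NumPy scalar arithmetic becomes an exception, which would match the behaviour in the original report.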