The Information Dynamics Toolkit xl (IDTxl) is a comprehensive software package for efficient inference of networks and their node dynamics from multivariate time series data using information theory.
The previous implementation of stats._perform_fdr_correction produces results that differ significantly from statsmodels.stats.multitest.fdrcorrection.
Upon closer inspection of the code, I've identified the problem: it's lines 309ff. in stats.py:
if np.invert(sign).any():
    first_false = np.where(np.invert(sign))[0][0]
    sign[first_false:] = False  # avoids false positives due to equal pvals
Here, IDTxl finds the index of the first non-significant p-value in the sorted array of p-values and sets everything after it to non-significant, e.g., 110010 -> 110000 for an ordered array of significance flags where 1 is significant and 0 is not.
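The effect of the old rule can be reproduced with a minimal numpy sketch, using the 110010 example flags from above:

```python
import numpy as np

# Significance flags for p-values sorted in ascending order (the 110010 example).
sign = np.array([True, True, False, False, True, False])

# Old IDTxl rule: find the first non-significant entry and reject everything after it.
if np.invert(sign).any():
    first_false = np.where(np.invert(sign))[0][0]
    sign[first_false:] = False

print(sign.astype(int))  # [1 1 0 0 0 0] -- the significant test at index 4 is lost
```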
This is at odds with how the Benjamini-Hochberg and Benjamini-Yekutieli correction procedures are designed (see https://en.wikipedia.org/wiki/False_discovery_rate): instead, one should find the last significant p-value and set every test before it to significant, i.e.
if sign.any():
    signmax = max(np.nonzero(sign)[0])
    sign[:signmax] = True
where this code is taken from statsmodels (with variables slightly renamed). For the same example as above this yields 110010 -> 111110.
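Applied to the same flags, the statsmodels-style rule back-fills instead of truncating:

```python
import numpy as np

# Significance flags for p-values sorted in ascending order (the 110010 example).
sign = np.array([True, True, False, False, True, False])

# Correct rule: everything up to the last significant entry is significant.
if sign.any():
    signmax = max(np.nonzero(sign)[0])
    sign[:signmax] = True

print(sign.astype(int))  # [1 1 1 1 1 0]
```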
This branch solves the issue by using the correct implementation from statsmodels directly.
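For reference, this is how the statsmodels routine is called. The p-values here are hypothetical, chosen so that the raw per-test Benjamini-Hochberg comparison (p_i <= alpha * i / m) reproduces the 110010 pattern from the example above:

```python
import numpy as np
from statsmodels.stats.multitest import fdrcorrection

# Hypothetical p-values (already sorted): the per-test BH thresholds for
# alpha = 0.05 and m = 6 are [0.0083, 0.0167, 0.025, 0.0333, 0.0417, 0.05],
# so the raw comparison gives the pattern 110010.
pvals = np.array([0.005, 0.009, 0.030, 0.034, 0.041, 0.9])

rejected, pvals_corrected = fdrcorrection(pvals, alpha=0.05, method='indep')
print(rejected.astype(int))  # [1 1 1 1 1 0], i.e. 110010 -> 111110
```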