mgbckr / corals-lib-python


Mapping p-values and corrected p-values to correlation matrix and permutation analyses #4

Open SantosRAC opened 6 months ago

SantosRAC commented 6 months ago

Dear @mgbckr,

I used corals to compute correlations from a pandas DataFrame, which currently yields a 19k × 19k correlation matrix. The cor_full method was fast and worked well. Next, I wanted to compute p-values and corrected p-values.

The example on your GitHub works well with the Spearman test (it generates the two NumPy arrays):

from corals.correlation.topk.default import cor_topk
cor_topk_values, cor_topk_coo = cor_topk(concatenated_transposed, correlation_type="spearman", k=0.001, n_jobs=8)

from corals.correlation.utils import derive_pvalues, multiple_test_correction
n_samples = concatenated_transposed.shape[0]
n_features = concatenated_transposed.shape[1]

# calculate p-values
pvalues = derive_pvalues(cor_topk_values, n_samples)

# multiple hypothesis correction
pvalues_corrected = multiple_test_correction(pvalues, n_features, method="fdr_bh")

Regarding the arrays returned by derive_pvalues and multiple_test_correction, how should one map their values back to entries of the correlation matrix?
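To make the question concrete, here is a minimal sketch of the mapping I have in mind. It assumes (I have not confirmed this) that cor_topk_coo is a pair of integer index arrays (row indices, column indices) aligned element-wise with cor_topk_values, and that the p-value arrays keep the same order. The helper names and the toy arrays are hypothetical stand-ins, not part of corals.

```python
import numpy as np

def topk_to_long_table(values, coo, pvalues, pvalues_corrected, feature_names):
    """Pair each top-k correlation with its feature names and p-values.

    ASSUMPTION: `coo` is a (row_indices, col_indices) pair aligned
    element-wise with `values`, `pvalues`, and `pvalues_corrected`.
    """
    rows, cols = (np.asarray(ix) for ix in coo)
    names = np.asarray(feature_names)
    return {
        "feature_a": names[rows],
        "feature_b": names[cols],
        "correlation": np.asarray(values),
        "pvalue": np.asarray(pvalues),
        "pvalue_corrected": np.asarray(pvalues_corrected),
    }

def scatter_to_dense(entries, coo, n_features):
    """Write per-pair entries into an n_features x n_features matrix.

    Pairs outside the top-k selection stay NaN. Only position (row, col)
    is filled; the mirrored position (col, row) is left untouched.
    """
    rows, cols = (np.asarray(ix) for ix in coo)
    dense = np.full((n_features, n_features), np.nan)
    dense[rows, cols] = entries
    return dense

# Toy stand-ins for the arrays from the snippet above (values are made up).
feature_names = ["g1", "g2", "g3", "g4"]
toy_values = np.array([0.9, -0.8, 0.7])
toy_coo = (np.array([0, 1, 2]), np.array([1, 3, 3]))
toy_pvalues = np.array([0.001, 0.004, 0.02])
toy_pvalues_corrected = np.array([0.006, 0.012, 0.04])

table = topk_to_long_table(
    toy_values, toy_coo, toy_pvalues, toy_pvalues_corrected, feature_names
)
dense_p = scatter_to_dense(toy_pvalues_corrected, toy_coo, len(feature_names))
```

With real data, feature_names would be concatenated_transposed.columns. A dense 19k × 19k float matrix takes roughly 2.9 GB, so the long table is probably the more practical form.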

Finally, I was wondering whether permutation analyses would help, and would be computationally feasible, for computing the significance of correlations in corals.
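By permutation analysis I mean something like the following sketch for a single pair of features, written with plain NumPy rather than corals. The function name is hypothetical, ranks are computed without tie averaging (so it assumes continuous data), and the p-value uses the common (extreme + 1) / (permutations + 1) estimate.

```python
import numpy as np

def _ranks(a):
    # Ordinal ranks via double argsort; ties are NOT averaged.
    return np.argsort(np.argsort(a)).astype(float)

def spearman_permutation_pvalue(x, y, n_permutations=999, seed=0):
    """Two-sided permutation p-value for the Spearman correlation of x and y."""
    rng = np.random.default_rng(seed)
    rx = _ranks(x)
    ry = _ranks(y)
    rx -= rx.mean()
    ry -= ry.mean()
    # Permuting ry changes neither its mean nor its norm, so one
    # denominator serves the observed and all permuted correlations.
    denom = np.sqrt((rx ** 2).sum() * (ry ** 2).sum())
    r_obs = float(rx @ ry / denom)

    # Null distribution: shuffle the sample order of y relative to x.
    perms = np.array([rng.permutation(ry) for _ in range(n_permutations)])
    r_null = perms @ rx / denom

    # Small tolerance guards against floating-point ties with r_obs.
    n_extreme = int(np.sum(np.abs(r_null) >= abs(r_obs) - 1e-12))
    p_value = (n_extreme + 1) / (n_permutations + 1)
    return r_obs, p_value

# Toy data: y is a monotone function of x, so the rank correlation is 1.
demo_rng = np.random.default_rng(42)
x_demo = demo_rng.normal(size=30)
y_demo = x_demo ** 3
r_demo, p_demo = spearman_permutation_pvalue(x_demo, y_demo)
```

My concern about feasibility is that the smallest attainable p-value is 1 / (n_permutations + 1), and each permutation would presumably mean recomputing the correlations, so surviving correction over a 19k × 19k matrix may require a very large number of permutations.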

Thanks a lot.