owkin / PyDESeq2

A Python implementation of the DESeq2 pipeline for bulk RNA-seq DEA.
https://pydeseq2.readthedocs.io/en/latest/
MIT License

Setting different n_cpus in DefaultInference gives results in the same time #288

Closed: yihming closed this issue 3 months ago

yihming commented 3 months ago

Describe the bug

I tried setting different n_cpus values for DefaultInference (e.g. 1 and 8):

from pydeseq2.default_inference import DefaultInference
inference = DefaultInference(n_cpus=n_cpus)

Then dds.deseq2() and DeseqStats.summary() take about the same time (~32s) to finish on the same data, regardless of the n_cpus value.

Expected behavior

I wonder whether I set the parameter incorrectly, and that is why parallel computing did not speed up the computation. Or, if I did set it correctly, why did the setting have no effect? Thanks!
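
For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is only an illustration: `time_runs` and `dummy_run` are hypothetical names, not part of PyDESeq2, and the workload is a stand-in so that the snippet runs on its own.

```python
import time


def time_runs(run, cpu_settings=(1, 8), repeats=1):
    """Return {n_cpus: best wall-clock seconds} for calling run(n_cpus)."""
    timings = {}
    for n_cpus in cpu_settings:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run(n_cpus)
            best = min(best, time.perf_counter() - start)
        timings[n_cpus] = best
    return timings


def dummy_run(n_cpus):
    # Stand-in workload (assumption) that ignores n_cpus on purpose.
    # In the setting of this issue, the body would instead build
    # DefaultInference(n_cpus=n_cpus), pass it to the dataset and stats
    # objects, and call dds.deseq2() and DeseqStats.summary().
    sum(i * i for i in range(100_000))


timings = time_runs(dummy_run, cpu_settings=(1, 8))
for n_cpus, seconds in timings.items():
    print(f"n_cpus={n_cpus}: {seconds:.3f}s")
```

If the timings are nearly identical across settings on a large enough dataset, that suggests the n_cpus value is not reaching the parallel backend.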

abearab commented 3 months ago

I can confirm I also experienced the same issue.

umarteauowkin commented 3 months ago

Hi! Thanks for reporting this. Could you tell me how many genes are in the dataset on which you see this issue? Could you also set the joblib_verbosity parameter to 10 instead of the default value of 0, to see how many processes are launched?

yihming commented 3 months ago

Hi @umarteauowkin,

Thanks for checking this issue. I tried your suggestion, and found that the Inference object's n_cpus is always overwritten to use all available vCPUs.

I tried to fix the issue in PR #293. It works on my side. Please let me know if I need to do anything further. Thanks!
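
To make the symptom concrete, here is a purely illustrative sketch of how a requested worker count can end up ignored. The classes below are hypothetical. They are not PyDESeq2's code and do not describe what PR #293 actually changed. The `total_cpus` argument is also an assumption, added only so the example behaves the same on any machine.

```python
import os


class OverwritingInference:
    """Hypothetical failure mode: the requested n_cpus is discarded."""

    def __init__(self, n_cpus=None, total_cpus=None):
        total = total_cpus or os.cpu_count() or 1
        self.requested = n_cpus
        # The user's value is never consulted, so all cores are always used.
        self.n_cpus = total


class RespectingInference:
    """Hypothetical alternative: fall back to all cores only when unset."""

    def __init__(self, n_cpus=None, total_cpus=None):
        total = total_cpus or os.cpu_count() or 1
        self.requested = n_cpus
        # Use the requested value, capped at the number of available cores.
        self.n_cpus = total if n_cpus is None else min(n_cpus, total)


def effective_cpus(inference_cls, n_cpus, total_cpus=8):
    """Read back the worker count an object ends up with."""
    return inference_cls(n_cpus=n_cpus, total_cpus=total_cpus).n_cpus


print(effective_cpus(OverwritingInference, n_cpus=1))  # prints 8
print(effective_cpus(RespectingInference, n_cpus=1))   # prints 1
```

Reading the effective value back after construction, as `effective_cpus` does here, is a quick way to check whether a setting survived.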

umarteauowkin commented 3 months ago

Solved by #293. Thanks @yihming!