scverse / pertpy

Perturbation Analysis in the scverse ecosystem.
https://pertpy.readthedocs.io/en/latest/
MIT License

Parallelize DE methods that support it #613

Open grst opened 1 month ago

Description of feature

Some of the DE methods are embarrassingly parallel, e.g. the statsmodels-based one and the Wilcoxon test.

I suggest using the following snippet from scirpy:

https://github.com/scverse/scirpy/blob/443e59e6245b917e87972f87df350ae4f429d011/src/scirpy/util/__init__.py#L567-L579

# Imports needed by the snippet (not part of the linked lines)
import logging

from joblib import Parallel
from tqdm.auto import tqdm


def _parallelize_with_joblib(delayed_objects, *, total=None, **kwargs):
    """Wrapper around joblib.Parallel that shows a progressbar if the backend supports it.

    Progressbar solution from https://stackoverflow.com/a/76726101/2340703
    """
    try:
        # Generator output lets tqdm advance as each task finishes
        return tqdm(Parallel(return_as="generator", **kwargs)(delayed_objects), total=total)
    except ValueError:
        logging.info(
            "Backend doesn't support return_as='generator'. No progress bar will be shown. "
            "Consider setting verbosity in joblib.parallel_config"
        )
        return Parallel(return_as="list", **kwargs)(delayed_objects)

https://github.com/scverse/scirpy/blob/443e59e6245b917e87972f87df350ae4f429d011/src/scirpy/ir_dist/metrics.py#L231-L233

block_results = _parallelize_with_joblib(
    (joblib.delayed(self._compute_block)(*block) for block in blocks),
    total=len(blocks),
    n_jobs=self.n_jobs,
)
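
For the Wilcoxon case, a rough sketch of what per-gene parallelization could look like is given below. It is not pertpy's actual implementation: `_test_gene`, `wilcoxon_per_gene`, the dense `X` matrix and the boolean `is_treated` mask are hypothetical names chosen for illustration, and plain `joblib.Parallel` is used instead of the progress-bar wrapper to keep the example short.

```python
import numpy as np
from joblib import Parallel, delayed
from scipy.stats import mannwhitneyu


def _test_gene(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test for a single gene."""
    res = mannwhitneyu(x, y, alternative="two-sided")
    return res.statistic, res.pvalue


def wilcoxon_per_gene(X, is_treated, n_jobs=2):
    """Hypothetical helper: test every column (gene) of X independently.

    X is a dense (n_cells, n_genes) array, is_treated a boolean mask over cells.
    """
    treated, control = X[is_treated], X[~is_treated]
    # Each gene is an independent task, so the loop is embarrassingly parallel
    results = Parallel(n_jobs=n_jobs)(
        delayed(_test_gene)(treated[:, j], control[:, j]) for j in range(X.shape[1])
    )
    return np.array(results)  # shape (n_genes, 2): statistic, p-value


# Toy data (made up for the example): 60 cells, 20 genes, gene 0 shifted in treated cells
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
is_treated = np.arange(60) < 30
X[is_treated, 0] += 3.0

out = wilcoxon_per_gene(X, is_treated, n_jobs=2)
```

Passing the same generator of `delayed(...)` calls to `_parallelize_with_joblib` with `total=X.shape[1]` should add the progress bar. For very cheap per-gene tests it may pay off to dispatch blocks of genes rather than single genes, to limit scheduling overhead.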

Migrated from https://github.com/scverse/multi-condition-comparisions/issues/16