encore.ces not running on multiple CPU cores

des2037 commented 6 months ago

Expected behavior

I'm trying to run Cluster Ensemble Similarity on my systems on 4 CPU cores, exactly as described in the documentation (https://userguide.mdanalysis.org/stable/examples/analysis/trajectory_similarity/clustering_ensemble_similarity.html): ces2, details2 = encore.ces([u1, u2, u3], select='name CA', ncores=4).

I expect the calculation to run on 4 CPU cores. I've also tried just ncores=2, but I run into the same problem.

Actual behavior

The calculation only runs on 1 CPU core. In top/htop, I see 5 python processes related to my python script (provided below). 1 process has 100% CPU utilization, whereas the other 4 have 0% utilization and a state of "S". Overall, only 1 CPU core is used.

I've tried running this within a Slurm interactive session, as well as outside of a job scheduler, directly on the node. This does not make any difference. The compute node I am running on has 32 CPU cores.

Code to reproduce the behavior

import numpy as np
import MDAnalysis as mda
from MDAnalysis.analysis import encore
from MDAnalysis.analysis.encore.clustering import ClusteringMethod as clm

# Load the three ensembles (topology + trajectory) to be compared
u1 = mda.Universe('traj1.psf', 'traj1.dcd')
u2 = mda.Universe('traj2.psf', 'traj2.dcd')
u3 = mda.Universe('traj3.psf', 'traj3.dcd')

# Single clustering on the C-alpha atoms, requesting 4 cores
ces2, details2 = encore.ces([u1, u2, u3], select='name CA', ncores=4)

Current version of MDAnalysis

Which version are you using? 2.7.0
Which version of Python: 3.11.8
Which operating system? Rocky Linux release 8.5 (Green Obsidian), also tried on Red Hat Enterprise Linux release 8.7 (Ootpa)

IAlibay commented 6 months ago

Thanks for raising this issue, @des2037. Development of encore has moved to the MDAKit mdaencore, so I will migrate this issue there so that the encore developers can see it.

orbeckst commented 5 months ago

I would first recommend installing the ENCORE MDAKit separately, because all development and bug fixes will be done in mdaencore.

orbeckst commented 5 months ago

@MDAnalysis/encore team, do you have any insight into what might be happening here?

enoee commented 5 months ago

Is this answer relevant? https://github.com/MDAnalysis/mdaencore/issues/27#issuecomment-1776893690

orbeckst commented 5 months ago

Ah, well, according to #27, parallelization only applies when multiple parameters are supplied (i.e., across clusterings, not within one), and hence the described behavior is normal.

Perhaps the docs could be improved to make this clear, given that this is a recurring misunderstanding.
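For readers landing here with the same question, the behavior can be pictured with a toy model. The sketch below is only an illustration under stated assumptions: it treats each clustering parameter set as one indivisible task and hands tasks to workers round-robin. The function names and the `preference` values are hypothetical, and this is not mdaencore's actual scheduling code.

```python
# Toy model (an assumption for illustration, not mdaencore's actual scheduler):
# each clustering parameter set is one indivisible task, and tasks are handed
# out to workers round-robin.

def assign_tasks(parameter_sets, ncores):
    """Return one list per worker, holding the tasks that worker receives."""
    if ncores < 1:
        raise ValueError("ncores must be at least 1")
    workers = [[] for _ in range(ncores)]
    for i, params in enumerate(parameter_sets):
        workers[i % ncores].append(params)
    return workers


def busy_workers(parameter_sets, ncores):
    """Count the workers that receive at least one task."""
    return sum(1 for tasks in assign_tasks(parameter_sets, ncores) if tasks)


# One clustering (a single hypothetical preference value) on 4 workers.
single = busy_workers([{"preference": -1.0}], ncores=4)

# Four clusterings with different hypothetical preference values on 4 workers.
sweep = busy_workers(
    [{"preference": p} for p in (-1.0, -2.0, -5.0, -10.0)], ncores=4
)

print(single, sweep)
```

Under this model a single clustering keeps one worker busy no matter how many are started, which matches the one active process and four sleeping processes reported above. As far as the encore documentation describes it, the `clustering_method` argument of `encore.ces` accepts a list of clustering method instances, and that is the case where `ncores` would be expected to help.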

kmorisa commented 5 months ago

Hi, please excuse me for leaving a comment here; I also mistook this behavior for an “issue.” Thank you for the clarification. I now see that the parallelization only works across multiple clusterings, not within a single clustering. Is there any plan to parallelize a single clustering? Especially for replica exchange output, I don't think the analysis can be completed within a day or two. I will double-check, but I think I tried to run it with 24 replicas (simulations) of 2000 frames each, with at most 600 atoms.

enoee commented 5 months ago

Hi again. Good that the situation is now clearer. I'm afraid we currently don't have any plans to add new features. I hope you can find a way to make use of the software as it is, e.g., by downsampling time-correlated frames (if present).
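To give a rough sense of what downsampling buys, here is a small arithmetic sketch using the 24 replicas of 2000 frames mentioned earlier in the thread. It assumes the cost is dominated by a pairwise distance matrix over all conformations, which grows quadratically with the number of frames; the stride of 10 and the helper names are hypothetical choices for illustration.

```python
def strided_frames(n_frames, stride):
    """Indices of the frames kept when taking every `stride`-th frame."""
    return list(range(0, n_frames, stride))


def n_pairs(n_conformations):
    """Number of unique conformation pairs in a symmetric distance matrix."""
    return n_conformations * (n_conformations - 1) // 2


n_replicas = 24            # figure quoted earlier in the thread
frames_per_replica = 2000  # figure quoted earlier in the thread
stride = 10                # hypothetical downsampling stride

full = n_replicas * frames_per_replica
kept = n_replicas * len(strided_frames(frames_per_replica, stride))

print(full, n_pairs(full))  # 48000 frames -> 1151976000 pairs
print(kept, n_pairs(kept))  # 4800 frames  -> 11517600 pairs
```

With these numbers a tenfold stride cuts the pair count by roughly a factor of 100. In MDAnalysis, one way to apply such a stride is `u.transfer_to_memory(step=10)` before passing the universes to encore, although it is worth verifying on a small test case that the reduced trajectory is what ends up being analyzed.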

kmorisa commented 5 months ago

Thank you for your response, and I understand.

des2037 commented 5 months ago

Thanks everyone for investigating this--good to know it wasn't just user error!
