radical-collaboration / MDFF-EnTK

MDFF-EnTK: Scalable Adaptive Protein Ensemble Refinement Integrating Flexible Fitting

Adding Nodes to resource configuration file #12

Closed benjha closed 4 years ago

benjha commented 4 years ago

Q:

Does an argument exist to specify the number of nodes in

https://github.com/radical-collaboration/MDFF-Error/blob/master/simple_mdff_cfg.yml

?

A:

There is no argument for node counts, only for CPU counts. The file you pointed to describes resources at the task level, which are bounded by the resource description passed to the job scheduler, e.g. Slurm or LSF. See the other file below. Both files use CPU counts instead of node counts.

https://github.com/radical-collaboration/MDFF-Error/blob/master/resource_cfg.yml

For example, requesting 56 CPUs on Bridges gives 2 nodes (28 CPUs each). On Summit, 168 CPUs correspond to 1 node (2 sockets × 21 CPUs × 4 hardware threads).

To use 4, 8, 16, … nodes, multiply the CPU counts in those files by the number of nodes.
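
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the per-node figures are the ones quoted in this answer, while the dictionary keys and the function name `cpus_for_nodes` are hypothetical and do not appear in either YAML file.

```python
# CPUs per node, taken from the figures quoted in this thread.
# The keys are hypothetical labels, not names read from the config files.
CPUS_PER_NODE = {
    "bridges": 28,          # 28 CPUs per node
    "summit": 2 * 21 * 4,   # 2 sockets x 21 CPUs x 4 hardware threads = 168
}


def cpus_for_nodes(resource: str, nodes: int) -> int:
    """Return the CPU count to request so that `nodes` whole nodes are used."""
    if nodes < 1:
        raise ValueError("nodes must be a positive integer")
    return CPUS_PER_NODE[resource] * nodes


if __name__ == "__main__":
    # Print the CPU counts for a few node counts on both machines.
    for n in (1, 2, 4, 8, 16):
        print(n, cpus_for_nodes("bridges", n), cpus_for_nodes("summit", n))
```

The resulting number is what would go into the CPU count field of the configuration files in place of a node count.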

mturilli commented 4 years ago

I am not sure what this ticket is about. Should we add this information to our documentation? Was the answer satisfactory? Was the yml file updated, and is it usable?

benjha commented 4 years ago

@mturilli Yes, please, add it to the documentation.

@lee212 if 168 CPUs is equivalent to 1 node, then the smt4 flag is enabled somewhere, right? What about OMP_NUM_THREADS, is it also set?

For example:

https://jsrunvisualizer.olcf.ornl.gov/?s4f1o14n1c1g0r11d1b21l0=

benjha commented 4 years ago

I noticed there is a pre_exec label in the workflow configuration file for exporting environment variables.

lee212 commented 4 years ago

smt4 is the default, according to this: https://github.com/radical-cybertools/radical.saga/blob/db772b9feac2405de151490d22f8539ee4056ee9/src/radical/saga/adaptors/lsf/lsfjob.py#L38

OMP_NUM_THREADS is defined by the user at the EnTK Task level, e.g.: https://github.com/radical-collaboration/MDFF-Error/blob/736ebea7ade65705dc62f7a2bb84ddc49c73bb77/simple_mdff.summit.py#L74

Generally, we use 4 for both SMT and OMP_NUM_THREADS.

BTW, thanks for the visualizer link, it definitely helps to see what changing the parameters does.

lee212 commented 4 years ago

Yes, pre_exec is a special variable that runs a list of commands prior to the main executable, in this case NAMD for the simulation. As you can see, pre_exec is used for preparation steps such as export and cp/mv/cd.
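
As a rough sketch of what such a list might look like, the snippet below builds the preparation commands in plain Python. It deliberately does not import EnTK: the helper `build_pre_exec`, the dictionary layout, the working directory and the executable name `namd2` are all assumptions for illustration, and only the `pre_exec` name and the `OMP_NUM_THREADS` variable come from this thread.

```python
import shlex


def build_pre_exec(omp_threads: int, workdir: str) -> list:
    """Build the shell commands to run before the main executable.

    Hypothetical helper: it only assembles strings and does not call EnTK.
    """
    if omp_threads < 1:
        raise ValueError("omp_threads must be a positive integer")
    return [
        f"export OMP_NUM_THREADS={omp_threads}",  # environment setup
        f"cd {shlex.quote(workdir)}",             # move into the run directory
    ]


# Illustrative task description; the keys other than "pre_exec" are assumed.
task_description = {
    "pre_exec": build_pre_exec(4, "/tmp/mdff_run"),  # hypothetical path
    "executable": "namd2",                           # assumed NAMD binary name
}

if __name__ == "__main__":
    for command in task_description["pre_exec"]:
        print(command)
```

The commands run in list order, so the export has to come before anything that depends on the variable.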

lee212 commented 4 years ago

The README has been updated to explain how to specify the number of nodes in the YAML file. See here: https://github.com/radical-collaboration/MDFF-Error#faq This can be included in the main documentation later: https://radicalentk.readthedocs.io/

Is this resolved and can be closed?

benjha commented 4 years ago

Yes, thanks