NCAR / mpibind

MPI binding utilities
MIT License

mpibind.log files need to be unique per job #4

Closed jedwards4b closed 1 week ago

jedwards4b commented 9 months ago

The mpibind log written here and here needs to include the PBS_JOBID so that it is unique when a user is running multiple jobs. Also, the rm command here is too general.
Another solution might be to create a subdirectory of TMPDIR based on jobid.

benkirk commented 9 months ago

I've deployed a fix using mktemp and it is available on Derecho.

Will push the git commit here shortly.
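For readers following along, here is a minimal sketch of what a per-job log location built with mktemp could look like. It is not the deployed fix: the function names `make_job_log` and `cleanup_job_log` and the `mpibind.<jobid>.XXXXXX` naming are hypothetical, chosen only to illustrate a unique-per-job path and a cleanup that is scoped to that job.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names and layout are assumptions, not the actual mpibind code.

# Create a private directory for this job and print the log path inside it.
# Falls back to /tmp and a placeholder job id when the variables are unset.
make_job_log() {
    local base="${TMPDIR:-/tmp}"
    local jobid="${PBS_JOBID:-nojobid}"
    local dir
    # mktemp replaces the XXXXXX suffix with random characters,
    # so two calls never return the same directory.
    dir=$(mktemp -d "${base}/mpibind.${jobid}.XXXXXX") || return 1
    printf '%s\n' "${dir}/mpibind.log"
}

# Remove only the directory that holds the given log,
# rather than a broad pattern that could match other jobs' files.
cleanup_job_log() {
    local log="$1"
    rm -rf -- "$(dirname -- "$log")"
}
```

A wrapper could call `log=$(make_job_log)` once at startup and `cleanup_job_log "$log"` on exit, which leaves logs from other running jobs untouched.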

sjsprecious commented 9 months ago

Thanks Ben for working on this issue. According to Francis, the updated mpibind script now works well for multiple jobs that run simultaneously on Derecho. However, there is a failure coming from a job that uses only 36 CPU cores per node. Is this failure related to the mpibind script or other system configurations?

roryck commented 9 months ago

Hi Jian,

Thanks for letting us know that Ben's fix worked for the multiple jobs issue.

On the 36-core job failing: this is more of a PBS cgroup / system issue. When you select ncpus < 128, PBS creates a cgroup, but not in a way that is balanced across sockets, so it will oversubscribe one CPU and artificially limit your memory bandwidth. So all PBS jobs should set ncpus=128, even if they use fewer than 128 cores. The mpibind scripts will bind correctly and balance across sockets in this scenario. I had argued for ncpus=128 being the default, so that it would not have to be set in a PBS job at all, but was outvoted on that issue. I could have mpibind error out with a message when it detects ncpus < 128, but that's about all I could do in the wrapper. Would that be helpful?
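A rough sketch of the kind of check described above, for illustration. The function name `check_ncpus` and the message text are hypothetical, and how the wrapper would actually obtain the allocated CPU count is not stated in this thread, so the count is passed in as an argument here.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the actual mpibind implementation.

# Cores on a full node, as stated in this thread.
FULL_NODE_CPUS=128

# check_ncpus ALLOCATED
# Returns 0 when a full node was requested; otherwise prints a
# message to stderr and returns 1 so the caller can abort.
check_ncpus() {
    local allocated="$1"
    if [ "$allocated" -lt "$FULL_NODE_CPUS" ]; then
        printf 'mpibind: ncpus=%s is less than %s; request ncpus=%s so ranks can be balanced across sockets\n' \
            "$allocated" "$FULL_NODE_CPUS" "$FULL_NODE_CPUS" >&2
        return 1
    fi
    return 0
}
```

A wrapper might use it as `check_ncpus "$allocated" || exit 1`, where `allocated` comes from whatever source the wrapper trusts for the job's CPU count (an assumption left open here).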

jedwards4b commented 9 months ago

Hi Rory,

Yes, I think you should have it error out. I'll need to make a change in CIME so that cases run on less than a node are set up this way.

sjsprecious commented 9 months ago

Hi Rory and Jim,

Thanks for your quick and detailed replies. That makes sense to me.

If Jim is going to make changes in CIME so that a job with < 128 CPU cores works on Derecho, should mpibind issue a warning rather than an error so that the simulation can proceed? And I guess what we want is to always set ncpus=128 in the PBS resources but pass the actual requested number of CPU cores to the mpiexec command through mpibind?
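To make that concrete, here is a small sketch of the resource request being described: ncpus is pinned at 128 while the real rank count goes into mpiprocs. The helper name `pbs_select` is hypothetical, and this is not how CIME or mpibind build the request; it only shows the shape of the select string.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; helper name and structure are assumptions.

# pbs_select NODES RANKS_PER_NODE
# Prints a PBS select string that always reserves the full node
# (ncpus=128) while recording the actual number of MPI ranks per node.
pbs_select() {
    local nodes="$1" ranks_per_node="$2" full_node_cpus=128
    if [ "$ranks_per_node" -gt "$full_node_cpus" ]; then
        echo "ranks per node ($ranks_per_node) exceeds $full_node_cpus" >&2
        return 1
    fi
    printf 'select=%s:ncpus=%s:mpiprocs=%s\n' \
        "$nodes" "$full_node_cpus" "$ranks_per_node"
}
```

For the 36-core case this gives `select=1:ncpus=128:mpiprocs=36`, which could be passed as `qsub -l "$(pbs_select 1 36)" job.sh`. Inside the job, the launch line would then carry the real rank count (for example `mpiexec -n 36 ...`), with mpibind placed in that command line as the site documentation describes.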

roryck commented 1 week ago

Done