oar-team / batsim

Batsim: Infrastructure simulator for job and I/O scheduling
GNU Lesser General Public License v3.0

Sending `EXECUTE_JOB` with a wrong number of nodes #40

Closed adfaure closed 7 years ago

adfaure commented 7 years ago

If I send EXECUTE_JOB with fewer resources than the job needs, Batsim segfaults. Edit: if the number of resources is greater, Batsim doesn't care.

mpoquet commented 7 years ago

It's not a bug, it's a feature :D. A buggy / not-yet-implemented feature at the moment :(.

Previously, Batsim checked that the number of given resources exactly matched the job size. However, an SMPI job can now be run with fewer resources, which allows several ranks to be placed on the same machine. This code has not been tested with other profiles; I will add a test for it and debug ;).

mpoquet commented 7 years ago

I can reproduce the issue.

mpoquet commented 7 years ago

@adfaure: Can you confirm that 0659e43 solved the problem? A clear error message should now be displayed (hidden before a SimGrid backtrace).

mpoquet commented 7 years ago

Actually using fewer resources than a job requested should work after 6e1e44f. To do so, a mapping must be given with the EXECUTE_JOB message.
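For anyone landing here later, a rough sketch of what a scheduler could send when it allocates fewer resources than the job requested. This is not taken from the commits above: the helper `build_execute_job` is hypothetical, and the field names (`job_id`, `alloc`, `mapping`), the space-separated `alloc` string, and the "rank to index within the allocation" meaning of `mapping` are assumptions that should be checked against the protocol documentation of your Batsim version. The round-robin placement is just one possible choice.

```python
import json


def build_execute_job(job_id, alloc, requested, timestamp):
    """Build an EXECUTE_JOB event as a dict (hypothetical helper).

    Field names and formats are assumptions, not confirmed by this thread.
    """
    if not alloc:
        raise ValueError("alloc must contain at least one resource")

    data = {
        "job_id": job_id,
        # Assumed format: space-separated resource ids.
        "alloc": " ".join(str(r) for r in alloc),
    }

    if len(alloc) < requested:
        # Fewer resources than requested: give an explicit mapping.
        # Each rank is assigned, round-robin, to an index into alloc.
        data["mapping"] = {
            str(rank): str(rank % len(alloc)) for rank in range(requested)
        }

    return {"timestamp": timestamp, "type": "EXECUTE_JOB", "data": data}


# A job that requested 4 resources, executed on only 2 machines.
event = build_execute_job("w0!1", alloc=[2, 3], requested=4, timestamp=10.0)
print(json.dumps(event, indent=2))
```

With this sketch, ranks 0 and 2 share the first allocated machine and ranks 1 and 3 share the second. When the allocation is as large as the request, no mapping is added.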