mpoquet closed this issue 7 years ago
smpi_replay_run seems to be blocking.
We should just synchronise two Batsim processes: smpi_replay_process and its calling process (called through execute_profile).
The trouble comes from the calls to MSG_process_create_with_arguments, which call smpi_replay_process. A solution is to add a semaphore between the execute_profile MSG process and the one created by MSG_process_create_with_arguments. Added in https://github.com/oar-team/batsim/commit/4e8b3cef41619776225513d9a5d86a7f2f5fa4d6
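For readers unfamiliar with the pattern, here is a minimal sketch of the semaphore handshake described above. It is only an analogy written with Python threads, not the SimGrid MSG API and not the code from the commit: run_job and replay_worker are hypothetical stand-ins for execute_profile and smpi_replay_process.

```python
import threading


def run_job(events):
    """Launch a worker and block until it signals completion.

    Hypothetical stand-in for execute_profile; not Batsim code.
    """
    # The semaphore starts at 0, so acquire() blocks until release() is called.
    done = threading.Semaphore(0)

    def replay_worker():
        # Hypothetical stand-in for smpi_replay_process: do the work, then signal.
        events.append("replay finished")
        done.release()

    worker = threading.Thread(target=replay_worker)
    worker.start()

    # The caller waits here instead of reporting completion right away.
    done.acquire()
    events.append("job reported as completed")

    worker.join()
    return events


if __name__ == "__main__":
    print(run_job([]))
```

The point of the sketch is the ordering: because the caller blocks on the semaphore, it cannot report the job as completed until the replay side has signalled that it is done.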
Simulation time inconsistency fixed by 4e8b3ce.
Closing this issue, but SMPI still does NOT work in Batsim (see issue #13).
Abstract
Something is wrong with the management of SMPI jobs in Batsim. When long jobs are executed, Batsim reports them as finished long before they actually finish.
How to reproduce?
Versions
Batsim version: b10cb66371bc8dbfd (master branch)
SimGrid version: oar-team/simgrid-batsim, batsim-compatible, 42a5c2c5fa27026391c
Step 1 (necessary to generate some files)
Execute Batsim command (in one terminal)
Execute sched command (in another terminal)
Expected results
The makespan should be 15.00039, whereas the simulation finishes at 20621958276.156872!
Batsim output
Scheduler output