Bug description
Parallelisation of the runs in the parameter study fails because the ensemble_results of the individual ranks are all sent at once, causing an overflow of the int value bytes_size.
The maximum capacity is quickly reached if the flows are also to be saved. In my case, I use the Secir model with tmax = 250.
The flows alone have a size of 72 MB per run (400 [# counties] * 250 [# days] * 15 [# flows] * 6 [# age groups] * 8 [sizeof(double)]).
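The overflow can be checked with a quick back-of-the-envelope calculation (a sketch using the numbers from this report; INT_MAX here stands for the limit of a signed 32-bit int, which is what MPI's count argument is typically declared as):

```python
# Per-run flow size, using the dimensions stated in the bug description.
counties, days, flows, age_groups, sizeof_double = 400, 250, 15, 6, 8
bytes_per_run = counties * days * flows * age_groups * sizeof_double
print(bytes_per_run)  # 72000000 bytes = 72 MB per run

# Sending all 150 runs in a single MPI_Send exceeds the 32-bit count limit.
INT_MAX = 2**31 - 1
runs = 150
total_bytes = runs * bytes_per_run
print(total_bytes)            # 10800000000 bytes (~10.8 GB)
print(total_bytes > INT_MAX)  # True: the byte count no longer fits in an int
```

So roughly 30 runs of this size (about 2.16 GB) are already enough to overflow the count, which matches the MPI_ERR_COUNT error below.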
Version
Linux
To reproduce
Save the flows + results in the results processing function and do 150 runs with tmax=250.
Relevant log output
[sc-030233l:880257] * An error occurred in MPI_Send
[sc-030233l:880257] * reported by process [1905983489,9]
[sc-030233l:880257] * on communicator MPI_COMM_WORLD
[sc-030233l:880257] * MPI_ERR_COUNT: invalid count argument
[sc-030233l:880257] * MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[sc-030233l:880257] * and potentially your MPI job)
Add any relevant information, e.g. used compiler, screenshots.
No response
Checklist
[X] Attached labels, especially loc:: or model:: labels.