piernik-dev / piernik

Piernik MHD Code
GNU General Public License v3.0

timestep: override repeat_step to false in simulations without repetitive_steps #492

Closed. mogrodnik closed this 2 years ago

mogrodnik commented 2 years ago

This fixes problems (hanging up) occurring in some larger simulations in which repetitive_steps are not allowed. A possible CFL violation was triggering the 'repeat_step' flag. In setups without repetitive steps the flag was left unattended, which later caused the simulation to stall during an attempt to reduce chspeed (hdc:update_chspeed).

gawrysz commented 2 years ago

The problem with deadlocks in unrelated routines comes from repeat_step not being reduced globally before use. I guess that the check_cfl_violation routine also needs a fix. And the declaration of repeat_step needs an elaborate comment with a warning :)

mogrodnik commented 2 years ago

I suppose that deadlocks in this case might have more than one cause, and how repeat_step is handled is just one way to make them happen. Well, anyway - the first time the deadlocks started to occur on our machines was after 4e9aed1cbf07a25ba5f9c1080aa6c802eaff68f9; I found it later, when I started to look over the network.

gawrysz commented 2 years ago

In 4e9aed1 the call piernik_MPI_Bcast(repeat_step) became conditional, which allowed repeat_step to remain unreduced.

I just think that handling repeat_step requires a very defensive approach to avoid strange deadlocks in the future, even if it costs an extra synchronisation call per timestep.
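
To illustrate the failure mode discussed in this thread, here is a minimal sketch in plain Python, not Piernik's Fortran and not real MPI. It assumes a flag that each process sets locally and that later guards a collective call. All function names here are hypothetical, and `reduce_globally` is only a stand-in for a logical-OR reduction across processes.

```python
def ranks_entering_collective(flags):
    """Return the ranks that would call the collective guarded by the flag."""
    return [rank for rank, flag in enumerate(flags) if flag]


def collective_would_hang(flags):
    """A collective completes only if all ranks call it or none do."""
    entering = ranks_entering_collective(flags)
    return 0 < len(entering) < len(flags)


def reduce_globally(local_flags):
    """Stand-in for a logical-OR all-reduce: every rank gets the same value."""
    agreed = any(local_flags)
    return [agreed] * len(local_flags)


# Hypothetical situation: only rank 2 detected a CFL violation locally.
local_repeat_step = [False, False, True, False]

# Branching on the unreduced flag: only rank 2 enters the collective.
unreduced_hangs = collective_would_hang(local_repeat_step)

# Branching on the reduced flag: all ranks take the same branch.
reduced_hangs = collective_would_hang(reduce_globally(local_repeat_step))
```

With the unreduced flag the ranks disagree on whether to enter the guarded collective, which in a real MPI run would show up as a hang in whatever routine performs the next collective. Reducing the flag once per timestep costs one extra synchronisation but keeps every rank on the same branch.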