Closed climbfuji closed 7 years ago
@ldfowler58 Do you have any comments on these changes?
@climbfuji Out of curiosity, were these race conditions picked up when running the 'mesoscale_reference' suite? If so, is it easy enough to try with the 'convection_permitting' suite? I ask, because with the changes in this PR, I can get reproducible results with the former but not the latter (neither of which work before the changes proposed here); I'm also adding -fp-model precise -qno-opt-dynamic-align
to the compiler flags. Perhaps I should learn how to use the Intel Inspector.
@mgduda In my case, I got bfb identical results without these changes with the mesoscale_reference suite. By swapping parameterizations one by one in order to get from mesoscale_reference to convection_permitting, I got bfb identical results for all combinations except with the Thompson MP scheme. Thus, I used Intel Inspector with the convection_permitting suite.
I agree with the issue related to re-initializing the variables has_reqc, has_reqi, ... Because I do not use OMP, I cannot comment on the block around lines 720 or 880 and trust the changes made by Dom and Michael.
This PR fixes two (identical) race conditions in both mpas_atmphys_driver_radiation_lw.F andmpas_atmphys_driver_radiation_sw.F. These race conditions showed up when running Intel Inspector.
The first block around lines 430 is uncritical, since three variables are set (repeatedly) to zero in a scenario where they will never take any other value, at least in the current version of the code. However, they do pollute the output of Intel Inspector or other tools to detect race conditions, and they might lead to mistakes further down the road in case someone tries to implement new features and follows existing code.
The second block around lines 720 or 880 can be serious, since the calculation of starts with setting both declin and solcon to zero in radconst before assigning correct values. This means that a thread arriving later at the call to radconst might set those variables to zero, while another thread might have already progressed into rrtmg_swrad / rrtmg_lwrad. In the worst case, this thread might read a value of zero during its calculations. Wrapping the calls to radconst in OMP_SINGLE constructs (which have implicit barriers at the end, unlike OMP MASTER) avoids this race condition.