Orchestration attempts to inline everything by default, and in the case of scalars that means freezing some information (for example loop sizes or the n/k split). The grid is passed mostly via `self.X` attributes of the submodule, which will lead to it being frozen.
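A minimal sketch of the freezing behavior (class and attribute names here are illustrative, not the actual submodule API): attributes read through `self` inside a DaCe-orchestrated method are resolved from the closure when the program is parsed, so they get baked into the compiled code.

```python
import dace

N = dace.symbol("N")

class SubModule:
    def __init__(self, grid_spacing: float, n_split: int):
        self.grid_spacing = grid_spacing  # frozen at parse time
        self.n_split = n_split            # frozen at parse time

    @dace.method
    def step(self, field: dace.float64[N]):
        # self.n_split and self.grid_spacing are closure constants here:
        # a rank that reuses a build parsed on the top tile silently keeps
        # the top-tile values, whatever its own attributes say.
        for _ in range(self.n_split):
            field[:] = field[:] * self.grid_spacing
```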
Question
Is there any rank-specific data that could lead to errors in the model when we compile on the top tile and send the results across all ranks?
E.g., since we use the top tile to compile for all tiles, is the data reproduced properly because it is not lat/lon dependent by nature (e.g. `area`, `rarea`), or is there an actual dependency that could lead to small errors (`cubtolatlon`?)
The code that does distributed compilation for DaCe has been deactivated in the meantime; see `DEACTIVATE_DISTRIBUTED_DACE_COMPILE`.
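For reference, the rough shape of the deactivated path (a simplified sketch, assuming mpi4py, a shared-filesystem `.dacecache`, and identical SDFG hashes on every rank; `dyn` is a stand-in program): the top-tile rank compiles once, the others wait and then reuse the cached build.

```python
from mpi4py import MPI
import dace

comm = MPI.COMM_WORLD

@dace.program
def dyn(field: dace.float64[64]):
    field[:] = field[:] + 1.0

sdfg = dyn.to_sdfg()

if comm.Get_rank() == 0:
    # Top-tile rank builds the SDFG once; artifacts land in .dacecache.
    compiled = sdfg.compile()
comm.Barrier()
if comm.Get_rank() != 0:
    # Remaining ranks pick up the cached build (assumes DaCe's compiler
    # cache finds it). Any rank-specific data frozen into the program on
    # rank 0 -- the concern above -- travels with it.
    compiled = sdfg.compile()
```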
Solutions
- Move grid & init to be parameters on the `dyn` call and flag them `dace.compiletime` (see the first sketch after this list)
- Make sure that no frozen data is in fact different on non-top-tile ranks (see the second sketch below)
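A sketch of the first solution (the field names and the split parameter are hypothetical): annotating an argument with `dace.compiletime` keeps it a compile-time constant, but the freezing becomes explicit at the `dyn` call site instead of hidden in submodule state, while actual grid data stays a runtime input.

```python
import dace

N = dace.symbol("N")

@dace.program
def dyn(field: dace.float64[N], rarea: dace.float64[N],
        n_split: dace.compiletime):
    # n_split is still frozen into the generated code, but it is now an
    # explicit parameter of the call; rarea is ordinary runtime data and
    # is never baked in.
    for _ in range(n_split):
        field[:] = field[:] * rarea

# call site: dyn(field, rarea, n_split=4)
```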
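And a sketch for the second item (the helper name and the use of mpi4py are assumptions): before freezing anything, compare each candidate against the top-tile copy so that a silent divergence fails loudly.

```python
from mpi4py import MPI
import numpy as np

def assert_matches_top_tile(name: str, arr: np.ndarray,
                            comm: MPI.Comm = MPI.COMM_WORLD) -> None:
    """Raise if this rank's copy of `arr` differs from rank 0's copy."""
    reference = comm.bcast(arr if comm.Get_rank() == 0 else None, root=0)
    if not np.array_equal(reference, arr):
        raise ValueError(
            f"{name} on rank {comm.Get_rank()} differs from the top-tile "
            "copy; it is not safe to freeze it into a shared compiled program"
        )

# e.g. assert_matches_top_tile("rarea", grid.rarea) before compiling
```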
DOD
- `DEACTIVATE_DISTRIBUTED_DACE_COMPILE` has been removed and the underlying problem has been solved or documented