Open davidbenncsiro opened 3 years ago
Debugging eventually revealed the cudaMemcpy() point of failure in the code for the cases where flow=1 or swave=1. In the case where both are set to 1, the “Model crashed” exit still occurs when dt goes to zero, but for just flow=1 or swave=1, the "offending" cudaMemcpy() is on line 852 in Wave_gpu.cu:
CUDA_CHECK(cudaMemcpy(OutputVarMapCPU[Param.outvars[ivar]],
OutputVarMapGPU[Param.outvars[ivar]],
OutputVarMaplen[Param.outvars[ivar]] * sizeof(DECNUM), …
in a loop relating to output variables.
For one of the variables being output (E), OutputVarMapGPU[Param.outvars[ivar]] is zero.
I added a conditional check for this value (OutputVarMapGPU[Param.outvars[ivar]]) being zero, in which case I skip the cudaMemcpy(), allowing all other variables to be output and the simulation to complete.
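For reference, the kind of guard described above might look roughly like the sketch below. It is not the actual patch: the map types, the copyOutputs and copyToHost names, and the float definition of DECNUM are all assumptions for illustration, and copyToHost is a host-only stand-in for cudaMemcpy(..., cudaMemcpyDeviceToHost) so the sketch runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <vector>

typedef float DECNUM;  // assumption: the real DECNUM may be float or double

// Host-only stand-in for cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToHost).
static void copyToHost(DECNUM* dst, const DECNUM* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, n * sizeof(DECNUM));
}

// Copies every requested output variable whose source pointer is set.
// Returns the names of the variables skipped because the pointer was null.
std::vector<std::string> copyOutputs(
    const std::vector<std::string>& outvars,
    std::map<std::string, DECNUM*>& cpuMap,
    std::map<std::string, DECNUM*>& gpuMap,
    std::map<std::string, std::size_t>& lenMap) {
    std::vector<std::string> skipped;
    for (const std::string& name : outvars) {
        DECNUM* src = gpuMap[name];
        if (src == nullptr) {  // e.g. E when the wave step never allocated it
            skipped.push_back(name);
            continue;          // skip the copy, keep going with the others
        }
        copyToHost(cpuMap[name], src, lenMap[name]);
    }
    return skipped;
}
```

Returning the skipped names rather than silently continuing makes it easy to log which outputs were dropped. Note that a null check only catches pointers that were never set; it cannot detect a pointer that was freed but not reset to null.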
A run with flow=1 was quick to complete but the output didn't look great. I'm running with swave=1 and it seems more reasonable.
Even for a partial run, zb looks the same as XBeach, except for colours.
Will keep you posted on output results and further debug re: E.
We can talk more details next week as well if you like.
Hmm... the E variable is a bit of a special one because it is allocated and freed in the wave step, so it is not a valid pointer if the wave loop is not running, and it may be a dangling pointer if it is used after the wave step. It's a bit of an ancillary output and I'm not sure I ever output it (H = sqrt(8E/(rho g)), so I output H).
I might need to remove it from the output list or allocate it directly, once and for all. The per-step allocation was done when GPUs had 32 MB of RAM, but now memory is cheap and allocating once would remove the overhead of reallocating it every step.
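As a side note on why H can stand in for E as an output: assuming the usual linear-wave relation E = (1/8) rho g H^2, the wave height follows directly from the energy. A minimal sketch is below; the function name and the default values for rho and g are assumptions, not taken from the model code.

```cpp
#include <cassert>
#include <cmath>

// Wave height from wave energy, assuming E = (1/8) * rho * g * H^2.
// rho (water density, kg/m^3) and g (gravity, m/s^2) defaults are assumptions.
double waveHeightFromEnergy(double E, double rho = 1025.0, double g = 9.81) {
    assert(E >= 0.0 && rho > 0.0 && g > 0.0);
    return std::sqrt(8.0 * E / (rho * g));
}
```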
I have made a new branch called CheapMem where I moved all the memory allocation to the main function, and ran a quick test that seems to work.
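The allocate-once idea can be sketched as follows. This is only an illustration of the pattern, not the contents of the CheapMem branch: WaveBuffers, allocateWaveBuffers, waveStep and freeWaveBuffers are hypothetical names, DECNUM is assumed to be float, and std::calloc/std::free stand in for cudaMalloc/cudaFree so the sketch runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef float DECNUM;  // assumption: the real DECNUM may be float or double

// Hypothetical holder for a buffer such as E.
struct WaveBuffers {
    DECNUM* E;
    std::size_t n;
};

// Allocate once, before the time loop (stand-in for cudaMalloc).
WaveBuffers allocateWaveBuffers(std::size_t n) {
    WaveBuffers b;
    b.n = n;
    b.E = static_cast<DECNUM*>(std::calloc(n, sizeof(DECNUM)));
    assert(b.E != nullptr);
    return b;
}

// The wave step reuses the buffer instead of allocating and freeing it.
void waveStep(WaveBuffers& b, DECNUM value) {
    for (std::size_t i = 0; i < b.n; ++i) {
        b.E[i] = value;
    }
}

// Free once, after the time loop (stand-in for cudaFree), and reset the
// pointer so a later null check can tell the buffer is gone.
void freeWaveBuffers(WaveBuffers& b) {
    std::free(b.E);
    b.E = nullptr;
    b.n = 0;
}
```

With this layout the pointer stays valid for the whole run, so an output loop can copy E at any time between allocation and the final free, whether or not the wave step ran in that iteration.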
Thanks Cyp. Will try this out.
@CyprienBosserelle should I switch to the CheapMem branch yet? I seem to recall you saying on Wed that you were not convinced it had fixed the E problem.
Hi @CyprienBosserelle, as per today's chat, please find attached the params in question with nx and ny changed as per discussion and flow set to 1.