Closed by juliasloan25 6 months ago
The coupler output table shows very similar allocations between atmos-only and coupled simulations, as of 5/1.

On GPU:
- coupled simulation allocations: 3.361 GiB
- atmos-only simulation allocations: 3.255 GiB

On CPU:
- coupled CoupledSimulation object allocations: 0.196 GiB
- atmos-only CoupledSimulation object allocations: 0.195 GiB
When we try to run the DYAMOND configuration on central's P100 GPUs, it fails because there isn't enough memory available during the `atmos_init` call. The same run works fine on clima's A100 GPUs, but in `atmos_init` we see `Effective GPU memory usage: 87.32% (69.114 GiB/79.150 GiB)`. Roughly 70 GiB of memory usage is a lot, so we need to look into where these allocations are coming from.

We can do this by placing `CUDA.memory_status` calls throughout the code to see where the allocations jump.
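
A minimal sketch of what that instrumentation could look like, assuming CUDA.jl is loaded and a GPU is available. The `report_gpu_memory` helper and the dummy allocation are hypothetical stand-ins for illustration; in practice the calls would go before and after steps inside `atmos_init`.

```julia
using CUDA

# Hypothetical helper (not part of the coupler code): print a label followed by
# the current GPU memory status, so jumps can be matched to a code location.
function report_gpu_memory(label)
    CUDA.functional() || return nothing  # skip silently when no GPU is usable
    println("--- ", label, " ---")
    CUDA.memory_status()
    println()
    return nothing
end

report_gpu_memory("before allocation")

# Stand-in for one initialization step; allocates 1024 * 1024 * 64 Float32
# values, i.e. 256 MiB, on the device.
x = CUDA.zeros(Float32, 1024, 1024, 64)
report_gpu_memory("after allocation")

# Drop the reference and return cached memory to the driver, to check whether
# the usage reported above is live data or memory held by the pool.
x = nothing
GC.gc()
CUDA.reclaim()
report_gpu_memory("after reclaim")
```

Comparing the report after each step against the previous one shows which step is responsible for a jump. Checking again after `GC.gc()` and `CUDA.reclaim()` helps separate live allocations from memory that is only cached by the pool.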