AMReX-Codes / amrex

AMReX: Software Framework for Block Structured AMR
https://amrex-codes.github.io/amrex

Use the host memory instead of device memory inside VisMF::Write() #3900

Closed pkufourier closed 5 months ago

pkufourier commented 5 months ago

Recently I have been using AMReX to develop programs that run on GPUs with CUDA, with the mesh size chosen to be as large as possible (>95% of the GPU memory). Everything works perfectly. However, when the program calls VisMF::Write() to write a checkpoint (in fact through AmrCoreAdv::WriteCheckpointFile(), copied from the tutorial), it crashes with an out-of-memory error. Apparently VisMF::Write() allocated additional GPU memory and caused this problem.

My temporary workaround is to call the MultiFab's clear() function to free some MultiFab arrays before calling WriteCheckpointFile(), and to re-define those arrays after the write has finished. However, this feels rather clumsy. Is it possible to force VisMF::Write() to use only host memory instead of allocating GPU memory, since host memory is usually much larger? Because the data has to be copied to host memory anyway before it is written to disk, I think this improvement would be worth doing.
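The free, write, re-define pattern described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyDevicePool` and `run_with_scratch_released` are hypothetical names that stand in for a fixed-size GPU memory budget, not AMReX or CUDA types, and the "temporary" allocation represents whatever transient device memory is requested around checkpoint time (the thread does not establish which call makes it).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Toy stand-in for a fixed-size GPU memory budget.
// Hypothetical type for illustration; NOT part of AMReX or CUDA.
class ToyDevicePool {
public:
    explicit ToyDevicePool(std::size_t capacity) : capacity_(capacity) {}

    // Reserve bytes, throwing if the request does not fit in what is left.
    void allocate(std::size_t bytes) {
        if (bytes > capacity_ - used_) {
            throw std::runtime_error("out of device memory");
        }
        used_ += bytes;
    }

    // Return previously reserved bytes to the pool.
    void release(std::size_t bytes) {
        assert(bytes <= used_);
        used_ -= bytes;
    }

    std::size_t used() const { return used_; }
    std::size_t free_bytes() const { return capacity_ - used_; }

private:
    std::size_t capacity_;
    std::size_t used_ = 0;
};

// Hypothetical helper modeling the workaround: release scratch arrays,
// run a step that needs temp_bytes of transient memory, then re-allocate
// the scratch arrays. Returns true if the transient allocation fit.
inline bool run_with_scratch_released(ToyDevicePool& pool,
                                      std::size_t scratch_bytes,
                                      std::size_t temp_bytes) {
    pool.release(scratch_bytes);      // analogous to clearing some arrays
    bool fit = true;
    try {
        pool.allocate(temp_bytes);    // transient memory during the step
        pool.release(temp_bytes);
    } catch (const std::runtime_error&) {
        fit = false;
    }
    pool.allocate(scratch_bytes);     // analogous to re-defining the arrays
    return fit;
}
```

In this model, a pool that is 96% full rejects a request for 10% of its capacity, while the same request succeeds once 20% of scratch storage has been released first. The pool ends with the same usage as before the step.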

WeiqunZhang commented 5 months ago

VisMF::Write does not allocate additional device memory. https://github.com/AMReX-Codes/amrex/blob/b752027c1aebdfb4be339b1e30932b4108286a7a/Src/Base/AMReX_VisMF.cpp#L1054

Maybe the memory usage is simply so close to the maximum that the program would run out of memory even if VisMF::Write were not called.
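One way to test this hypothesis is to look at the remaining headroom just before the checkpoint. The following is a minimal sketch of that arithmetic only: `Headroom`, `device_headroom`, and `is_near_limit` are hypothetical helpers, and the 5% threshold is an arbitrary assumption echoing the ">95%" figure in the report. In a real CUDA program the free and total byte counts could be obtained from `cudaMemGetInfo`.

```cpp
#include <cstddef>

// Hypothetical summary of remaining device memory (not an AMReX type).
struct Headroom {
    std::size_t free_bytes;
    double free_fraction;  // free_bytes / total_bytes, in [0, 1]
};

// Compute the free fraction from raw byte counts, e.g. the two values
// reported by cudaMemGetInfo in a real program.
inline Headroom device_headroom(std::size_t free_bytes,
                                std::size_t total_bytes) {
    const double fraction =
        total_bytes == 0
            ? 0.0
            : static_cast<double>(free_bytes) /
                  static_cast<double>(total_bytes);
    return Headroom{free_bytes, fraction};
}

// Flag the situation where less than min_free_fraction of the device
// memory is left. The default of 5% is an assumed, arbitrary threshold.
inline bool is_near_limit(const Headroom& h,
                          double min_free_fraction = 0.05) {
    return h.free_fraction < min_free_fraction;
}
```

With 16 GiB of device memory and 0.5 GiB free, the free fraction is about 3%, so the check reports that the run is near the limit regardless of which call happens to trigger the failure.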