Open Rohit-Satyam opened 1 day ago
For simplicity, JAX has removed its internal frames from the traceback of the following exception. Set JAX_TRACEBACK_FILTERING=off to include these.
I am running SpeedPPI on GPU nodes, but some of the jobs run out of memory even with 250 GB of memory. The error says:
RESOURCE_EXHAUSTED: Out of memory while trying to allocate 16508718128 bytes.
which means it was requesting about 16.5 GB of memory. Even if I multiply that by 10 to account for each recycle, that would be about 165 GB, which still leaves roughly 85 GB spare. So I don't know what's happening!
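For reference, here is a quick sanity check of my arithmetic in Python. The 10x multiplier per recycle is just my own guess (I have not confirmed how SpeedPPI actually scales memory with recycles), and the 250 GB figure is the memory I requested for the job.

```python
# Rough sanity check of the numbers in this issue.
# Assumptions (mine, not confirmed by SpeedPPI): one allocation of this
# size per recycle, 10 recycles, and 250 GB of memory available to the job.
requested_bytes = 16_508_718_128  # taken from the RESOURCE_EXHAUSTED message
recycles = 10                     # assumed multiplier
job_memory_gb = 250               # memory requested for the job

requested_gb = requested_bytes / 1e9       # decimal gigabytes
requested_gib = requested_bytes / 2**30    # binary gibibytes
total_gb = requested_gb * recycles
headroom_gb = job_memory_gb - total_gb

print(f"single allocation: {requested_gb:.2f} GB ({requested_gib:.2f} GiB)")
print(f"times {recycles} recycles: {total_gb:.2f} GB")
print(f"headroom out of {job_memory_gb} GB: {headroom_gb:.2f} GB")
```

So even under that pessimistic assumption the job should fit with room to spare, which is why the failure confuses me.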