Hi,
I get an Out of GPU memory error when trying to train on 360 data such as Bonsai and Stump.
The command is "python -m train --gin_configs=configs/360.gin --gin_bindings="Config.data_dir = '${DATA_DIR}'" --gin_bindings="Config.checkpoint_dir = '${DATA_DIR}/checkpoints'" --logtostderr"
Running on WSL (Ubuntu 20.04).
The output error is below.
Any suggestions? Thanks!
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/ck/miniconda3/envs/multinerf/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/ck/miniconda3/envs/multinerf/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/mnt/c/gitcode/multinerf/train.py", line 288, in
app.run(main)
File "/home/ck/miniconda3/envs/multinerf/lib/python3.9/site-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/home/ck/miniconda3/envs/multinerf/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
File "/mnt/c/gitcode/multinerf/train.py", line 119, in main
state, stats, rngs = train_pstep(
jaxlib.xla_extension.XlaRuntimeError: RESOURCE_EXHAUSTED: Out of memory while trying to allocate 58796148776 bytes.
BufferAssignment OOM Debugging.
BufferAssignment stats:
parameter allocation: 104.27MiB
constant allocation: 128.5KiB
maybe_live_out allocation: 103.09MiB
preallocated temp allocation: 54.76GiB
preallocated temp fragmentation: 232B (0.00%)
total allocation: 54.86GiB
total fragmentation: 30.49MiB (0.05%)
I had the same issue. Reducing batch_size to match the available GPU memory helped in my case. See the similar ticket here. Also check the OOM Errors section of the README.
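For example, you can override the batch size from the command line with an extra gin binding. This is only a sketch: Config.batch_size is assumed to be the relevant gin parameter, and 4096 is an illustrative value to tune to your GPU:

python -m train --gin_configs=configs/360.gin \
  --gin_bindings="Config.data_dir = '${DATA_DIR}'" \
  --gin_bindings="Config.checkpoint_dir = '${DATA_DIR}/checkpoints'" \
  --gin_bindings="Config.batch_size = 4096" \
  --logtostderr

If you shrink the batch size, the README's OOM section also recommends increasing the number of training iterations and scaling down the learning rate by the same factor to preserve quality.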