HorizonRobotics / alf

Agent Learning Framework https://alf.readthedocs.io
Apache License 2.0

grocery_ground_goal_task training taking 3x more memory than expected? #183

Open le-horizon opened 5 years ago

le-horizon commented 5 years ago

4 bytes (float) × 80 × 80 (image size) × 3 (channels) × 100 (unroll length) × 12 (input + two conv layers + backprop + framestack) × 30 (parallel envs) / 1,000,000,000 ≈ 2.7 GB
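The back-of-envelope arithmetic above can be checked with a few lines of Python (the variable names here are just illustrative, not ALF config keys):

```python
# Rough estimate of unrolled-activation memory from the numbers above.
bytes_per_elem = 4        # float32
image_size = 80 * 80      # H x W
channels = 3
unroll_length = 100
activation_factor = 12    # input + two conv layers + backprop + framestack
parallel_envs = 30

total_bytes = (bytes_per_elem * image_size * channels
               * unroll_length * activation_factor * parallel_envs)
print(total_bytes / 1e9)  # ~2.76 GB
```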

Currently CUDA seems to be taking ~9 GB of GPU memory (rendering is taking another ~4 GB):

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:17:00.0  On |                  N/A |
| 18%   59C    P5    46W / 250W |   5274MiB / 10988MiB |     31%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:65:00.0 Off |                  N/A |
| 30%   66C    P2    81W / 250W |   8856MiB / 10989MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+
```

This seems to suggest it is taking ~3x the memory we expect.

Did we miss anything in the calculation?

Thanks,
Le

-----some details-----

conv_layer_params = ((16, 3, 2), (32, 3, 2)). The 1st conv layer output is 40×40×16 and the 2nd is 20×20×32, which together roughly add up to 2x the input layer; then ×2 for the actor and critic networks, and ×2 again for forward and backprop.
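The per-frame activation counts behind the "roughly 2x the input layer" claim can be spelled out (each stride-2 conv halves the 80×80 spatial size):

```python
# Per-frame element counts for the input and the two conv layer outputs,
# using conv_layer_params = ((16, 3, 2), (32, 3, 2)).
input_elems = 80 * 80 * 3    # 19200
conv1_elems = 40 * 40 * 16   # 25600 (stride 2: 80 -> 40)
conv2_elems = 20 * 20 * 32   # 12800 (stride 2: 40 -> 20)

# The two conv layers together are roughly 2x the input layer:
print((conv1_elems + conv2_elems) / input_elems)  # 2.0
```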

le-horizon commented 5 years ago

We can probably store the input in int8 instead of float, to reduce the size a lot. @emailweixu what do you think?
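A minimal sketch of the idea, assuming NumPy-style buffers (this is not ALF's actual replay-buffer API): keep raw frames as uint8 and convert to float only when a batch is consumed, cutting observation storage by 4x.

```python
import numpy as np

# Store the full unroll as uint8: (unroll_length, parallel_envs, C, H, W).
frames_uint8 = np.zeros((100, 30, 3, 80, 80), dtype=np.uint8)

# Convert lazily, one mini-batch at a time, when feeding the network.
batch_float = frames_uint8[:10].astype(np.float32) / 255.0

print(frames_uint8.nbytes / 1e6)  # 57.6 MB for all 100 steps
print(batch_float.nbytes / 1e6)   # 23.04 MB for just 10 steps as float32
```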