ashawkey / stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Apache License 2.0

Ain't able to train text to 3d in low memory! #239

Open wyiguanw opened 1 year ago

wyiguanw commented 1 year ago

Hi, I'm currently training your work on an RTX 3090 with 24 GB of memory. When I ran "python main.py --text "a hamburger" --workspace trial -O --vram_O --w 300 --h 300", CUDA ran out of memory. But according to your comments, you tested it on an 8 GB GPU.

I'm using the default SD version (2.1).

What could be causing this issue? Did I miss anything?

csawtelle commented 1 year ago

This can happen if this is your only GPU and something else is using part of its memory. I ran into the same issue, and switching to a second GPU that wasn't running my Ubuntu desktop fixed it. I suppose there is also some way to cap usage so it doesn't try to use 100% of the memory.
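A minimal sketch of the "use the other GPU" workaround: hide the desktop GPU from the process by setting `CUDA_VISIBLE_DEVICES` before CUDA is initialized (i.e., before `import torch` in the training script). The device index `1` here is an assumption; check `nvidia-smi` for yours.

```python
import os

# Expose only the second GPU to this process. This must happen before
# CUDA is initialized (before importing torch), or it has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # index is an assumption; check nvidia-smi

print(os.environ["CUDA_VISIBLE_DEVICES"])
```

For capping usage on a shared card, PyTorch also provides `torch.cuda.set_per_process_memory_fraction`, though that limits allocation rather than reducing what the model actually needs.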

wyiguanw commented 1 year ago

Thanks for the reply! I do have a second GPU, also an RTX 3090, which was only using 10 MB of memory before I ran this training. I still have the same issue. Currently I'm only able to run at 128x128 resolution, which costs 19 GB of memory, so the resolution can no longer be increased.

And if I run it without "--vram_O", it still costs 19 GB at 128x128 resolution.

So it seems "--vram_O" has no effect in my case.
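The resolution ceiling is expected to be hard: rendering memory grows roughly with the number of rays, i.e. quadratically in the image side length. A back-of-envelope sketch (ray count only; the bytes-per-ray constant depends on samples per ray and network width, which are not pinned down here):

```python
# NeRF rendering cost scales roughly with the number of rays cast,
# which is h * w -- quadratic in the image side length.
def num_rays(h: int, w: int) -> int:
    return h * w

# Going from 128x128 to 256x256 means 4x the rays, so a run that
# already uses ~19 GB at 128x128 cannot simply be doubled.
for side in (8, 64, 128, 256):
    print(f"{side}x{side}: {num_rays(side, side)} rays")
```

This matches the measurements reported later in the thread, where 256x256 OOMs on a 24 GB card.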

ashawkey commented 1 year ago

@csawtelle Hi, actually I changed some of the network structure after adding vram_O (reverting to finite differences for normal evaluation, which increases memory cost considerably but makes the current as_latent geometry initialization work), so the information in the README is no longer correct. I'll correct and improve it later.
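For context on why finite differences cost more: a central-difference normal needs six extra density queries per sample point, versus a single backward pass for an analytic (autograd) gradient. A toy sketch with a hypothetical stand-in density field (the real code queries the NeRF's density MLP instead):

```python
import math

# Toy density field (Gaussian blob); a stand-in for the NeRF density MLP.
def density(x: float, y: float, z: float) -> float:
    return math.exp(-(x * x + y * y + z * z))

def fd_normal(x: float, y: float, z: float, eps: float = 1e-3):
    # Central differences: 6 extra density evaluations per point,
    # which is why this mode costs more memory/compute per batch
    # than differentiating the network analytically.
    gx = (density(x + eps, y, z) - density(x - eps, y, z)) / (2 * eps)
    gy = (density(x, y + eps, z) - density(x, y - eps, z)) / (2 * eps)
    gz = (density(x, y, z + eps) - density(x, y, z - eps)) / (2 * eps)
    n = math.sqrt(gx * gx + gy * gy + gz * gz) or 1.0
    # Surface normals point against the density gradient (outward).
    return (-gx / n, -gy / n, -gz / n)
```

At a point on the +x axis of this blob, the normal comes out as approximately (1, 0, 0), pointing outward as expected.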

claforte commented 1 year ago

In case anyone else is interested, I also ran into the same issue:

python main.py --text "a hamburger" --workspace trial -O --vram_O --w 64 --h 64

| Condition (h x w) | Memory usage reported by nvidia-smi |
| ----------------- | ----------------------------------- |
| idle              | 3.2 GB                              |
| 8x8               | 6.4 GB                              |
| 64x64             | 10.6 GB                             |
| 128x128           | 18.4 GB                             |
| 256x256           | OOM on RTX 4090 (24 GB VRAM)        |