The intent of multi-GPU inference is both to reduce per-GPU memory usage and to speed up generation. However, (1) we haven't tested it with CPU offloading, and (2) it is based on Sequence Parallelism, so the full model must be loaded on each GPU. That puts a lower bound on per-GPU memory rather than directly halving the GPU usage.
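To make that lower bound concrete, here is a rough back-of-the-envelope calculation. The GB figures below are made up for illustration, not measurements of this model; the point is only that under sequence parallelism every GPU keeps a full copy of the weights and only the activation memory shrinks as GPUs are added:

```python
# Illustrative only: the numbers are assumptions, not measured values.
# Under sequence parallelism, every rank loads the full weights; only the
# activation memory is split across GPUs.
weights_gb = 8.0       # full model weights, replicated on every GPU
activations_gb = 6.0   # activations for the whole sequence on a single GPU

for num_gpus in (1, 2):
    per_gpu_gb = weights_gb + activations_gb / num_gpus
    print(f"{num_gpus} GPU(s): ~{per_gpu_gb:.1f} GB per GPU")

# 1 GPU(s): ~14.0 GB per GPU
# 2 GPU(s): ~11.0 GB per GPU  -- the replicated weights set a floor well above half of 14 GB
```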
For now, we suggest trying out CPU offloading with the single-GPU inference script, which should be able to run within 12GB of memory.
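As a starting point, something along these lines in the single-GPU script should enable offloading. This is only a sketch: the import, class name, checkpoint path, variant string, and the assumption that the single-GPU `generate()` accepts the same `cpu_offloading` flag as `inference_multigpu.py` all need to be checked against the repo:

```python
import torch
from pyramid_dit import PyramidDiTForVideoGeneration  # assumed import, mirroring the single-GPU example

# Placeholder checkpoint path and variant -- substitute the values from the README.
model = PyramidDiTForVideoGeneration(
    "PATH/TO/pyramid-flow-checkpoint",
    model_dtype="bf16",
    model_variant="diffusion_transformer_384p",
)

# With offloading enabled, submodules are moved to the GPU only while they are
# needed, so do NOT call model.to("cuda") beforehand.
with torch.no_grad(), torch.autocast("cuda", dtype=torch.bfloat16):
    frames = model.generate(
        prompt="a cat walking on grass",   # placeholder prompt
        cpu_offloading=True,               # the flag discussed in this thread
    )
```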
I have an RTX 3060 and an RTX 4070 in my system, both with 12GB. The X server runs on the RTX 4070, leaving only about 11GB of VRAM there, so with X running I can run the single-GPU script from the project page successfully only on the RTX 3060. If I switch to runlevel 3 (no X server), I can run that script on either GPU.

I updated my git repo to the current code as of today, Oct 15, and tried text-to-video with the scripts/inference_multigpu.sh script. I changed the inference_multigpu.py script to set cpu_offloading=True in both places, and that did not help. I tried adding model.enable_sequential_cpu_offload(), and that did not help. I also tried adding model.enable_sequential_cpu_offload() just before the model.vae.to(device) statement, and that did not help. I get the out-of-memory error for both the 384P and 768P models.

Is the intent of the multi-GPU support to cut memory usage on each GPU by roughly half by splitting a single frame's generation across both GPUs, or is it to shorten run time by generating separate frames on separate GPUs?