Hi,
I'm trying to understand device allocation here. My GPUs have different memory capacities, and the program stops with an OOM error during backward() even though free memory is still available.
In the code, I see two critical parts for GPU allocation:
In class StyleTransfer, you create a device plan to spread the load of the 27 or so layers of VGG over the GPUs, meaning you send the first 5 layers to GPU0 and all the others to GPU1.
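To spell out how I read that plan, here is a small sketch. It is not the repository's code: the `plan` dictionary, the `device_for_layer` helper, and the layer count of 27 are my own assumptions, used only to show what a "breakpoint" style plan would imply for each layer.

```python
def device_for_layer(plan, layer_index):
    """Return the device of the nearest breakpoint at or below layer_index."""
    start = max(k for k in plan if k <= layer_index)
    return plan[start]


# Hypothetical plan matching my reading: layers 0-4 on GPU0, the rest on GPU1.
plan = {0: "cuda:0", 5: "cuda:1"}
num_layers = 27  # approximate layer count, an assumption

assignment = [device_for_layer(plan, i) for i in range(num_layers)]
counts = {d: assignment.count(d) for d in sorted(set(assignment))}
print(counts)  # {'cuda:0': 5, 'cuda:1': 22}
```

If that reading is right, the split is fixed by layer index and does not take into account how much memory each GPU actually has.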
In the stylize main loop, you actually send all images and styles to GPU0, so I'm not sure whether the load is spread at all during the backward pass.
How would you approach a version where the load is spread in proportion to each GPU's capacity?
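To make the question concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption on my side: `proportional_plan` is a hypothetical helper, the plan format (start layer index mapped to a device) is my guess at a convenient shape, and the free-memory figures are made up. In practice they could come from something like `torch.cuda.mem_get_info`.

```python
def proportional_plan(num_layers, free_bytes):
    """Build {start_layer_index: device} with layer counts proportional to free memory.

    free_bytes maps a device name to its free memory in bytes (insertion order
    decides which device gets the earliest layers).
    """
    total = sum(free_bytes.values())
    plan = {}
    start = 0
    cumulative = 0
    for device, free in free_bytes.items():
        cumulative += free
        # Last layer index (exclusive) this device should cover.
        end = round(num_layers * cumulative / total)
        if end > start:  # skip devices that would receive no layers
            plan[start] = device
            start = end
    return plan


GIB = 2**30
# Made-up capacities: an 8 GiB card and a 24 GiB card.
plan = proportional_plan(27, {"cuda:0": 8 * GIB, "cuda:1": 24 * GIB})
print(plan)  # {0: 'cuda:0', 7: 'cuda:1'}
```

I realize this is crude: it treats every layer as costing the same amount of memory, which is probably not true since the early VGG layers work on the largest activation maps, and it ignores whatever the images and styles already occupy on GPU0. A better version would likely weight each layer by its activation size.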
Regards, J