Open carefree0910 opened 3 years ago
I'm also quite confused about how to do the backward pass correctly in this case. In the CycleGAN example, the `cycle_loss` depends on both `net_a2b` and `net_b2a`:

`cycle_loss = loss_fn(net_a2b, net_b2a)`

We want to do one backward pass instead of two, and we want to avoid `retain_graph=True`. However, if we stick to `engine.backward` in `deepspeed`, this seems impossible.
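For comparison, here is a minimal sketch of what the single backward pass looks like in plain PyTorch, without DeepSpeed. The two `nn.Linear` layers are hypothetical stand-ins for the real generators, and `cycle_loss_fn` with its L1 reconstruction terms is an assumed form of the `loss_fn` above, not the code from this thread.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the two CycleGAN generators.
net_a2b = nn.Linear(4, 4)
net_b2a = nn.Linear(4, 4)


def cycle_loss_fn(net_a2b, net_b2a, real_a, real_b):
    """Assumed L1 cycle-consistency loss: A -> B -> A and B -> A -> B."""
    rec_a = net_b2a(net_a2b(real_a))
    rec_b = net_a2b(net_b2a(real_b))
    return (rec_a - real_a).abs().mean() + (rec_b - real_b).abs().mean()


real_a = torch.randn(8, 4)
real_b = torch.randn(8, 4)

cycle_loss = cycle_loss_fn(net_a2b, net_b2a, real_a, real_b)

# One backward pass through the shared graph; retain_graph=True is not needed
# because backward is called only once on this graph.
cycle_loss.backward()

# Both networks now hold gradients from the same backward call.
grads_ok = all(
    p.grad is not None
    for net in (net_a2b, net_b2a)
    for p in net.parameters()
)
```

This only shows the behaviour being asked for; it does not show how to get the same result through `engine.backward`.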
Sometimes (e.g. CycleGAN) we need to optimize the parameters of two (or more) models together because it is more efficient (e.g. when optimizing the cycle loss we definitely don't want to use `retain_graph=True`).

I was just wondering whether this is the right way to initialize an `optimizer` that optimizes both `net_a2b`'s and `net_b2a`'s parameters:

Any help would be greatly appreciated, thanks in advance!
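For reference, two common ways to build one optimizer over both networks in plain PyTorch are sketched below: chaining the parameter iterators, or wrapping both networks in a single `nn.Module`. The `Generators` wrapper, the `nn.Linear` stand-ins, and the learning rate are all hypothetical, and this is not the snippet from the original post.

```python
import itertools

import torch
import torch.nn as nn

# Hypothetical stand-ins for the two CycleGAN generators.
net_a2b = nn.Linear(4, 4)
net_b2a = nn.Linear(4, 4)

# Option 1: chain both parameter iterators into one optimizer.
optimizer = torch.optim.Adam(
    itertools.chain(net_a2b.parameters(), net_b2a.parameters()),
    lr=2e-4,  # illustrative value only
)


# Option 2: wrap both networks in one module (hypothetical helper) so that
# a single .parameters() call covers both.
class Generators(nn.Module):
    def __init__(self, a2b: nn.Module, b2a: nn.Module) -> None:
        super().__init__()
        self.a2b = a2b
        self.b2a = b2a

    def forward(self, real_a: torch.Tensor, real_b: torch.Tensor):
        # Return the two cycle reconstructions.
        return self.b2a(self.a2b(real_a)), self.a2b(self.b2a(real_b))


generators = Generators(net_a2b, net_b2a)
wrapped_optimizer = torch.optim.Adam(generators.parameters(), lr=2e-4)

# Each optimizer tracks weight + bias of both networks.
n_chain = sum(len(g["params"]) for g in optimizer.param_groups)
n_wrapped = sum(len(g["params"]) for g in wrapped_optimizer.param_groups)
```

A wrapper like this is the kind of single module one could hand to `deepspeed.initialize(model=...)`, but whether that resolves the `engine.backward` limitation described here is untested in this sketch.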