What is the equivalent for batch overfitting for such a training scheme?

hlcdyy / pan-motion-retargeting

codes for paper "Pose-aware Attention Network for Flexible Motion Retargeting by Body Part" (TVCG2023)

BSD 2-Clause "Simplified" License


My understanding is as follows:

The idea is to check whether the Generator, when trained exclusively on a small batch, can produce samples that the Discriminator accepts as real. If the network setup and loss functions are working correctly, the Discriminator should become uncertain whether the samples are real or fake (i.e., its loss should hover around the value corresponding to a 50% guess), and the Generator's loss should decrease, showing that it is generating better samples.
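To make "the value corresponding to a 50% guess" concrete, here is a minimal sketch of what that number would be under two common adversarial loss formulations. Which formulation this repository uses is not stated in this thread, so both the binary cross-entropy and the least-squares variants below are assumptions, as is summing the real and fake terms rather than averaging them.

```python
import math

def bce_chance_level() -> float:
    """Per-term binary cross-entropy when the discriminator outputs p = 0.5."""
    return -math.log(0.5)  # equals ln 2

def lsgan_chance_level() -> float:
    """Per-term least-squares loss when the discriminator outputs 0.5
    and the targets are 1 (real) and 0 (fake)."""
    return (0.5 - 1.0) ** 2  # the fake term (0.5 - 0.0) ** 2 gives the same value

bce_per_term = bce_chance_level()
bce_total = 2 * bce_per_term          # assumes real and fake terms are summed
lsgan_per_term = lsgan_chance_level()
lsgan_total = 2 * lsgan_per_term      # assumes real and fake terms are summed

print(f"BCE:   per term {bce_per_term:.4f}, summed {bce_total:.4f}")
print(f"LSGAN: per term {lsgan_per_term:.4f}, summed {lsgan_total:.4f}")
```

If the discriminator loss is averaged over the real and fake terms instead of summed, the per-term value is the one to compare against.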

How can I achieve this quickly, to verify that everything in training is set up properly? I am not seeing this behavior.

In our design, the adversarial loss is used together with other loss terms (reconstruction loss, cycle consistency loss) to optimize the model parameters during training. It is therefore difficult for the discriminator output to hover around the value indicating a 50% guess, as it does in GAN-based image generation methods that train with only a single adversarial loss. However, the adversarial loss still plays a role by forcing the retargeted motion to fall onto the target motion manifold.
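Given that, one possible way to adapt the batch-overfitting check to a multi-loss setup is to log each loss term separately while training on a single fixed batch, and to look at the supervised terms rather than at the discriminator. The sketch below is not the procedure used in this repository: the helper names, the 0.9 threshold, and the numbers in `example` are all hypothetical, and the expectation that the reconstruction and cycle terms should fall sharply on a fixed batch is a general heuristic, not something the authors state.

```python
def relative_drop(history):
    """Fraction by which a loss fell between its first and last recorded value."""
    first, last = history[0], history[-1]
    if first == 0:
        return 0.0
    return (first - last) / first

def overfit_report(histories, threshold=0.9):
    """Map each loss term to (relative drop, whether the drop reached the threshold)."""
    report = {}
    for name, values in histories.items():
        drop = relative_drop(values)
        report[name] = (drop, drop >= threshold)
    return report

# Made-up per-term loss histories from training on one fixed batch (illustration only).
example = {
    "reconstruction": [1.20, 0.40, 0.05, 0.01],
    "cycle": [0.90, 0.35, 0.08, 0.02],
    "adversarial_g": [0.70, 0.65, 0.72, 0.68],
}

report = overfit_report(example)
for name, (drop, passed) in report.items():
    print(f"{name}: dropped {drop:.1%}, reached threshold: {passed}")
```

Under this reading, a reconstruction term that refuses to drop on a fixed batch would point to a setup problem, while an adversarial term that stays roughly flat would not by itself be alarming.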

What is the equivalent for batch overfitting for such a training scheme? #4