Balance between the perturbation loss and adversarial loss. Or other training strategy?

Hello, I trained your model with the same optimizer and learning rate as yours. For the adversarial loss, I used cosine similarity (the same as yours, if I understand correctly). However, from the second or third epoch onward, the perturbation loss drops to 0 while the cosine similarity rises to 1. I have tried three different ratios of perturbation loss to adversarial loss (1:10, 1:100, and 1:2000), but none of them works. Have you encountered a similar problem, or did you use some other training strategy? Thanks.
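
To make the setup concrete, here is a minimal sketch of the kind of weighted objective described above. It assumes a hinge-style perturbation loss on the L2 norm of the perturbation and uses cosine similarity between the original and adversarial embeddings as the adversarial term. The function names, the `epsilon` budget, and the weights `w_pert` and `w_adv` are hypothetical and are not taken from the AdvFaces code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def perturbation_loss(delta, epsilon):
    """Hinge on the L2 norm of the perturbation (assumed form).

    The loss is zero whenever the perturbation stays within the budget epsilon.
    """
    l2 = math.sqrt(sum(d * d for d in delta))
    return max(0.0, l2 - epsilon)


def total_loss(delta, emb_orig, emb_adv, w_pert, w_adv, epsilon):
    """Weighted sum of the two terms; returns (total, pert, adv)."""
    pert = perturbation_loss(delta, epsilon)
    # Minimizing the similarity pushes the adversarial embedding away
    # from the original one (assumed obfuscation setting).
    adv = cosine_similarity(emb_orig, emb_adv)
    return w_pert * pert + w_adv * adv, pert, adv


# Symptom from the report: a zero perturbation leaves the embedding unchanged,
# so the perturbation loss is 0 and the cosine similarity is (numerically) 1.
emb = [1.0, 2.0, 3.0]
total, pert, adv = total_loss([0.0, 0.0], emb, emb, w_pert=1.0, w_adv=10.0, epsilon=3.0)
print(pert, adv, total)
```

In this sketch the collapsed state corresponds to the generator outputting no perturbation at all, which is why the two numbers move together: the perturbation term is fully satisfied and the adversarial term sits at its worst value.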

ronny3050 / AdvFaces
