edgarschnfld / CADA-VAE-PyTorch

Official implementation of the paper "Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders" (CVPR 2019)
MIT License

Is there anything wrong with training? #2

Closed garyliu0816 closed 5 years ago

garyliu0816 commented 5 years ago

when using the cmd to train the model:

python single_experiment.py --dataset CUB --num_shots 0 --generalized True

It works fine before epoch 10, but then the loss increases from 3358 to 14401.

epoch 10 | iter 0 | loss 3358
epoch 99 | iter 100 | loss 14401

Is this situation right?

edgarschnfld commented 5 years ago

Yes, this is expected. The reason is that the weights of the loss terms are continuously increased (but at different rates and over different intervals). For example, the weight of the cross-reconstruction loss more than doubles between epoch 20 and epoch 70.
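As an illustration, such a schedule can be written as a linear ramp of each loss weight between a start and an end epoch. The snippet below is a minimal sketch of that idea; the epoch boundaries and the maximum weight are illustrative placeholders, not the exact values or variable names used in this repository.

```python
def warmup_weight(epoch, start_epoch, end_epoch, max_weight):
    """Linearly ramp a loss weight from 0 to max_weight between start_epoch and end_epoch."""
    if epoch <= start_epoch:
        return 0.0
    if epoch >= end_epoch:
        return max_weight
    return max_weight * (epoch - start_epoch) / (end_epoch - start_epoch)

# Illustrative schedule: the cross-reconstruction weight grows between epoch 20 and 70.
# Because the reported loss is a weighted sum, e.g.
#   total = reconstruction + kl_weight * KL + cr_weight * cross_reconstruction,
# the total can rise over training even while the individual raw terms shrink.
for epoch in (0, 20, 45, 70, 99):
    print(epoch, warmup_weight(epoch, start_epoch=20, end_epoch=70, max_weight=2.0))
```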

garyliu0816 commented 5 years ago

Thank you for your answer. If the loss continuously increases, how can I determine when to stop training?

edgarschnfld commented 5 years ago

You can do that via trial and error on the validation set. In my experience, the VAE training is very robust to overfitting, and performance does not degrade with longer training. For example, it makes a difference whether you train for 40 or 100 epochs (performance will increase), but there is not much difference between 100 and 1000 epochs (overall performance is basically stable). In practice, trying a few different numbers of epochs should be enough to know when to stop training; a minimal sketch of that selection loop is given below.
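Here is a minimal sketch of that trial-and-error procedure. The helpers `train_for` and `eval_on_val` are hypothetical stand-ins for the repository's actual training and evaluation code (roughly what `single_experiment.py` does), not functions it provides.

```python
def train_for(num_epochs):
    """Stand-in: train CADA-VAE for num_epochs and return the trained model (hypothetical)."""
    return {"epochs": num_epochs}  # dummy object for illustration

def eval_on_val(model):
    """Stand-in: return a validation score, e.g. harmonic mean of seen/unseen accuracy (hypothetical)."""
    return 0.0  # dummy score for illustration

best_epochs, best_score = None, float("-inf")
for n in (40, 100, 200):                  # candidate training lengths
    score = eval_on_val(train_for(n))
    if score > best_score:
        best_epochs, best_score = n, score
print("chosen number of epochs:", best_epochs)
```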

garyliu0816 commented 5 years ago

Thanks again, this work is awesome.