Closed AlanChou closed 5 years ago
"I would like to know the reason for using BCE instead of CrossEntropy. Is this critical to Manifold Mixup?"
I don't think it's critical, but I'll check if we have any Manifold Mixup results on these datasets with CrossEntropy instead of BCE.
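For readers comparing the two losses, here is a minimal sketch of how each one treats a mixed (soft) target. It is plain Python and is not taken from the Manifold Mixup code: the helper names `mix_targets`, `bce_with_logits`, and `soft_cross_entropy` are hypothetical, and averaging BCE over classes is an assumption about the reduction.

```python
import math

def mix_targets(y_a, y_b, lam, num_classes):
    """Soft target lam * onehot(y_a) + (1 - lam) * onehot(y_b)."""
    target = [0.0] * num_classes
    target[y_a] += lam
    target[y_b] += 1.0 - lam
    return target

def bce_with_logits(logits, target):
    """Binary cross-entropy averaged over classes (assumed reduction).

    Each class is scored independently through a sigmoid.
    """
    total = 0.0
    for z, t in zip(logits, target):
        # Numerically stable form of -[t*log(s) + (1-t)*log(1-s)], s = sigmoid(z)
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def soft_cross_entropy(logits, target):
    """Cross-entropy against a soft target; classes compete through a softmax."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_norm) for z, t in zip(logits, target))

# Example: mix class 0 and class 2 with lam = 0.7 (illustrative values only)
logits = [2.0, -1.0, 0.5]
target = mix_targets(0, 2, 0.7, num_classes=3)
print(bce_with_logits(logits, target), soft_cross_entropy(logits, target))
```

Because cross-entropy is linear in the target, the soft-target version equals `lam * CE(y_a) + (1 - lam) * CE(y_b)`. Both losses accept the same mixed target, which is consistent with the choice between them not being essential to the method.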
"Is this also the reason you train for 2000 epochs, which is much longer than common training schedules?"
We saw improvements with far fewer epochs (definitely with 600 epochs), but the improvement gap grew when we trained for 2000 epochs.
Thank you very much for the super fast reply!
Hi,
I would like to know the reason for using BCE instead of CrossEntropy. Is this critical to Manifold Mixup? Is this also the reason you train for 2000 epochs, which is much longer than common training schedules?