vikasverma1077 / manifold_mixup

Code for reproducing Manifold Mixup results (ICML 2019)

Question about BCE #7

Closed AlanChou closed 5 years ago

AlanChou commented 5 years ago

Hi,

I would like to know the reason for using BCE instead of CrossEntropy. Is this critical to Manifold Mixup? Is this also the reason you train for 2000 epochs, which is much longer than common training schedules?

alexmlamb commented 5 years ago

"I would like to know the reason for using BCE instead of CrossEntropy. Is this critical to Manifold Mixup?"

I don't think it's critical, but I'll check if we have any Manifold Mixup results on these datasets with CrossEntropy instead of BCE.
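For readers comparing the two losses, here is a minimal sketch (not taken from the repository) of how BCE and cross-entropy behave on a mixup-style soft target. It assumes the target is a convex combination of two one-hot labels with mixing coefficient `lam`, and that both losses are applied to softmax probabilities. The function names, logits, and `lam` value are hypothetical and chosen only for illustration.

```python
import math

def mix_targets(y_a, y_b, lam):
    """Convex combination of two one-hot targets (mixup-style soft target)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(probs, target, eps=1e-12):
    """Cross-entropy against a soft target: -sum_k t_k * log(p_k)."""
    return -sum(t * math.log(p + eps) for p, t in zip(probs, target))

def binary_cross_entropy(probs, target, eps=1e-12):
    """Per-class BCE averaged over classes:
    -mean_k [t_k * log(p_k) + (1 - t_k) * log(1 - p_k)]."""
    terms = [
        t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
        for p, t in zip(probs, target)
    ]
    return -sum(terms) / len(terms)

# Hypothetical example values (assumptions, not from the paper or repo).
logits = [2.0, 0.5, -1.0]
y_a = [1.0, 0.0, 0.0]   # one-hot label of the first sample
y_b = [0.0, 1.0, 0.0]   # one-hot label of the second sample
lam = 0.7               # mixing coefficient

target = mix_targets(y_a, y_b, lam)
probs = softmax(logits)
ce = soft_cross_entropy(probs, target)
bce = binary_cross_entropy(probs, target)
print("mixed target:", target)
print("soft cross-entropy:", ce)
print("binary cross-entropy:", bce)
```

The main difference visible in this sketch is the `(1 - t_k) * log(1 - p_k)` term: BCE also penalizes probability mass placed on classes whose mixed target is low or zero, whereas cross-entropy only weights the log-probabilities of classes with nonzero target.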

"Is this also the reason you train for 2000 epochs, which is much longer than common training schedules?"

We got improvements with far fewer epochs (definitely with 600 epochs), but the improvement gap grew when we used 2000 epochs.

AlanChou commented 5 years ago

Thank you very much for the super fast reply!