Open learntolearn11 opened 5 years ago
Thanks for your reply. I loaded the pre-trained model and got 75% acc on A->W. Is there any step to improve the result to 80%?
I'm not sure why you got 75% acc. I ran the code yesterday; the step-validation data for A->W is uploaded here https://drive.google.com/open?id=185EwktvOJLHrucAj22nP9yteGQyoaoCR for your reference.
Sorry, I've been busy recently. I guess gradient explosion happens in the early training stage, so no matter how many epochs you train, the acc stays at about 1/31 (about 3%), which is chance level. I checked the default LR and it's OK (1e-2). Maybe you can try a smaller LR, or check whether the pre-trained model is loaded correctly. In my experience, gradient explosion usually happens within 100 epochs if something in the code is wrong. If you find anything, please let me know.
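
The two checks suggested above (is training stuck at chance level, and was the pre-trained model actually loaded) can be sketched roughly as below. This is only an illustration in plain Python, not code from this repo: the function names, the `tol` threshold, and the parameter names in the example are all hypothetical, and the 31 classes are taken from the 1/31 figure mentioned above.

```python
import math

NUM_CLASSES = 31  # assumption: taken from the 1/31 chance level quoted in the thread


def chance_accuracy(num_classes):
    """Accuracy of uniform random guessing over num_classes classes."""
    return 1.0 / num_classes


def looks_diverged(losses, accuracy, num_classes=NUM_CLASSES, tol=0.02):
    """Heuristic check (hypothetical helper): flag a run as likely diverged
    if any logged loss is NaN/inf, or if accuracy is within `tol` of chance."""
    non_finite = any(not math.isfinite(x) for x in losses)
    at_chance = abs(accuracy - chance_accuracy(num_classes)) <= tol
    return non_finite or at_chance


def compare_checkpoint_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) parameter names.

    missing:    names the model expects but the checkpoint does not contain
    unexpected: names in the checkpoint that the model does not have
    Both lists being empty suggests the names line up; it does not by itself
    prove the weights were copied into the model.
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


if __name__ == "__main__":
    # Made-up log values, only to show how the helpers are called.
    print(looks_diverged([3.43, float("nan")], accuracy=0.03))
    print(compare_checkpoint_keys(
        ["conv1.weight", "fc.weight"],          # hypothetical model parameter names
        ["module.conv1.weight", "fc.weight"],   # hypothetical checkpoint names
    ))
```

If the key comparison shows many missing names, the pre-trained weights were probably not loaded, which by itself could explain a low or chance-level acc. In PyTorch, `load_state_dict(..., strict=False)` returns similar `missing_keys` and `unexpected_keys` lists, so printing its return value is a quick way to run the same check on the real model.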