I see that training is set to run for 500 epochs. Do I really need to train for that many? It would take a long time.
I would also like to know whether your experimental results are close to those in the original paper.
Thanks.
No, 500 is a lot; you don't need that many epochs. My reproduced results are not exactly the same as those in the original paper (about a 1-point difference), since I didn't do much hyper-parameter tuning.