whuzs closed this issue 10 months ago
Hi @whuzs
I guess last.ckpt is a link to the last checkpoint, so it should be that one. The small difference in performance can probably be explained by training noise. There is always a little bit of variance between different training runs, especially with an aggressive learning rate like the proposed one, with just 4 epochs. Also recall that although training is done at 224x224, evaluation is at 322x322. You might also get a small performance gain by training and/or evaluating in full precision instead of mixed precision.
You can also use our trained checkpoint, the one we used to run our experiments.
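If you want to confirm what last.ckpt actually points to, something like the following might help. It is only a minimal sketch using the Python standard library: the helper name `describe_checkpoint` is made up for illustration, and the default path `last.ckpt` is an assumption about where your checkpoint is saved.

```python
from pathlib import Path


def describe_checkpoint(path):
    """Return (is_symlink, resolved_path) for a checkpoint file.

    Hypothetical helper for illustration, not part of the repository.
    """
    p = Path(path)
    # is_symlink() is True only when the path itself is a symbolic link.
    # resolve() follows any links and returns the absolute target path.
    return p.is_symlink(), p.resolve()


if __name__ == "__main__":
    import sys

    # Assumed default location; pass the real path as the first argument.
    ckpt = sys.argv[1] if len(sys.argv) > 1 else "last.ckpt"
    linked, target = describe_checkpoint(ckpt)
    print(f"{ckpt}: symlink={linked}, resolves to {target}")
```

If the first value is False, last.ckpt is a regular file (a copy) rather than a link, and the resolved path is just its own absolute path.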
Hi serizba, thanks for the great work.
I ran the code you provided three times, and the results are as follows. May I ask how to choose a trained model? I am currently using last.ckpt, and the results are less than 92.2.