Open aamir-mustafa-yoti opened 10 months ago
Yes, it makes no difference. The change was introduced by the BatchNormalization fix for QAT, fixing #115.
Thanks for the clarification.
Another question: for how many epochs were the pre-trained 'latest_models' for EfficientNetV2S trained?
It's 67 epochs in total: 50 epochs trained from scratch, then continued for another 17 epochs. The configuration is EfficientNetV2S with swish activation, drop_connect 0.2, dropout 0.2, trained using SGD + L2 regularizer + cosine LR decay + RandAug on the MS1MV3 dataset.
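For readers unfamiliar with the cosine LR decay mentioned in that recipe, here is a minimal sketch of the schedule as a plain function. The function name, base LR, and floor LR are illustrative assumptions, not values taken from the repository's training scripts:

```python
import math

def cosine_lr(step, total_steps, base_lr=0.1, min_lr=1e-6):
    """Cosine learning-rate decay: starts at base_lr, ends at min_lr.

    Follows the standard half-cosine shape over total_steps.
    """
    cos_factor = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return min_lr + (base_lr - min_lr) * cos_factor

# The rate falls slowly at first, fastest mid-training, then flattens
# out near min_lr at the end of the schedule.
print(cosine_lr(0, 100))    # start of training, equals base_lr
print(cosine_lr(100, 100))  # end of training, equals min_lr
```

In Keras this shape is also available out of the box as `tf.keras.optimizers.schedules.CosineDecay`.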
Hi, thanks for the great work. I have noticed that the provided EfficientNetV2S checkpoints do not have exactly the same last few layers as the code produces with output_layer == "F".
The last few layers of the provided checkpoint are:
Whereas output_layer = "F" gives the following:
I understand that this should not make any difference to the model, but is there a particular reason for doing this?
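For anyone wanting to reproduce this comparison, a quick way to inspect the tail of a model is to list its last few layers. The tiny stand-in model below is purely illustrative; in practice you would load the published checkpoint with `keras.models.load_model` (the path shown in the comment is hypothetical):

```python
from tensorflow import keras

def tail_summary(model, n=5):
    """Return (layer class name, layer name) for the last n layers."""
    return [(layer.__class__.__name__, layer.name) for layer in model.layers[-n:]]

# In practice, load the provided checkpoint instead, e.g.:
# model = keras.models.load_model("checkpoints/efficientnetv2s.h5")  # hypothetical path
model = keras.Sequential([
    keras.layers.Input((112, 112, 3)),
    keras.layers.Conv2D(8, 3),
    keras.layers.BatchNormalization(),
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(512, name="embedding"),
])

for cls_name, layer_name in tail_summary(model, n=3):
    print(cls_name, layer_name)
```

Running the same inspection on both the loaded checkpoint and a freshly built model with output_layer == "F" makes the structural difference in the last few layers easy to see.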
Thanks