lukemelas / EfficientNet-PyTorch

A PyTorch implementation of EfficientNet
Apache License 2.0
7.85k stars 1.52k forks source link

Are lukemelas EfficientNet-Pytorch parameters correct? #112

Open gloriamacia opened 4 years ago

gloriamacia commented 4 years ago

I was comparing this implementation with the original paper and it seems to me that either I am missing something (apologies in advance) or the parameters he is using are not the same. This is his code which you will find in utils.py:

def efficientnet_params(model_name): """ Map EfficientNet model name to parameter coefficients. """ params_dict = {

Coefficients: width,depth,res,dropout

'efficientnet-b0': (1.0, 1.0, 224, 0.2),
'efficientnet-b1': (1.0, 1.1, 240, 0.2),
'efficientnet-b2': (1.1, 1.2, 260, 0.3),
'efficientnet-b3': (1.2, 1.4, 300, 0.3),
'efficientnet-b4': (1.4, 1.8, 380, 0.4),
'efficientnet-b5': (1.6, 2.2, 456, 0.4),
'efficientnet-b6': (1.8, 2.6, 528, 0.5),
'efficientnet-b7': (2.0, 3.1, 600, 0.5),

} return params_dict[model_name]

Now, when I go to the original paper, the authors say the following.

image

So basically if for instance alpha = 1.2 and depth = alpha ** phi, how do I get to Luke Melas table? I did not find a way to understand the numbers of his scaling. Did somebody review it?

There is one sentence of the paper that confuses me though:

we first fix phi=1 (…). We find best values for efficientNetB0 are alpha = 1.2, beta = 1.1, gamma = 1.15."

If I interpret this sentence correctly, w=1.0 d=1.0 and r=224 would correspond to EfficientNetB0 and these new parameters are for phi = 1, hence EfficientNetB1… again, different in Lukas table. Maybe not really an issue but certainly worth a clarification so I am raising this up.

EvgeniaAR commented 4 years ago

I think the code was copied directly from the original EfficientNet tf library: https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/efficientnet_builder.py#L35. I have the same question and I think that it should be addressed to the original Efficient Net developers.

EduardKononov commented 3 years ago

You can find the answer in this issue