Closed. lessw2020 closed this issue 4 years ago.
@lessw2020 the model names without a tf_ prefix are for natively PyTorch-trained weights. There aren't any weights beyond B2 (https://github.com/rwightman/gen-efficientnet-pytorch/blob/master/geffnet/gen_efficientnet.py#L69).
It takes a REALLY long time to train these models from scratch with the GPU setup of a mere mortal (2-3 gamer GPUs per machine). I actually trained B3 recently and it took between 3 and 4 weeks on dual Titan RTX. Those latest results and training hparams are at (https://github.com/rwightman/pytorch-image-models) ... I will update the B2/B3 model here soon, was hoping to have more results first....
If you want the B5 AdvProp weights, those are ported from TF; you want to use tf_efficientnet_b5_ap for that.
And yeah, I should have a 'random initialization, weights not loaded' warning, I thought I had it, but looks like that was in the other one :)
Also beware, the AdvProp models use a different normalization (Inception style) than the other models (ResNet style).
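To make the normalization difference concrete, here is a minimal sketch in plain Python. The per-channel values are the commonly used ImageNet ("ResNet style") and Inception statistics for inputs scaled to [0, 1]. The helper names normalization_for and normalize_pixel are hypothetical and not part of geffnet, and detecting AdvProp models by an "_ap" suffix is an assumption based only on the tf_efficientnet_b5_ap name mentioned above.

```python
# Per-channel (mean, std) for RGB inputs already scaled to [0, 1].
RESNET_STYLE = ((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
INCEPTION_STYLE = ((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))


def normalization_for(model_name):
    """Hypothetical helper: pick (mean, std) from the model name alone."""
    # Assumption: AdvProp checkpoints are the ones whose name ends in "_ap".
    return INCEPTION_STYLE if model_name.endswith("_ap") else RESNET_STYLE


def normalize_pixel(rgb, mean, std):
    """Normalize one RGB pixel whose channels are already scaled to [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))


mean, std = normalization_for("tf_efficientnet_b5_ap")
print(normalize_pixel((1.0, 0.5, 0.0), mean, std))  # (1.0, 0.0, -1.0)
```

Inception-style normalization maps [0, 1] onto [-1, 1], so feeding an AdvProp model inputs normalized the ResNet way will likely hurt accuracy even though nothing errors out.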
@rwightman - thanks a bunch for the clarification, as well as the pointer re: Inception style normalization! Re: 2-3 weeks with 2-3 gamer GPUs - I believe it! All the more reason I greatly appreciate the work you are doing with this repo! My impression has always been that EfficientNets are a beast to train, and you've confirmed it here. The upside is I'm seeing the best results with EffNets of any arch I've tried on my current project, so I'll be diving deep with EffNets for some time now.
re: "And yeah, I should have a 'random initialization, weights not loaded' warning, I thought I had it, but looks like that was in the other one :)" That would be awesome if you have time to add it, just to keep people from panicking like I did when I saw my results after the first two epochs lol.
Anyway I'll close this since there are no weights to preload and again, I greatly appreciate all the work you are doing on this repo!
Hi @rwightman, thanks first for this awesome repo. I'm trying to use your impl to get the AP pre-trained B5, but it's quite clearly failing to load the pretrained weights, with neither an error nor a confirmation that the weights were loaded. Is this a known issue or am I doing something wrong? (edit - ok, I re-read the readme and think I misunderstood: I took "AP implemented" to mean pretrained weights were available... anyway, if so, then passing pretrained=True where no weights exist should ideally print a warning or error?)
1 - Installed via pip install geffnet
2 - import geffnet
3 - model = geffnet.create_model('efficientnet_b5', num_classes=data.c, pretrained=True, drop_rate=0.2, drop_connect_rate=0.2)  #, as_sequential=True)
Normally I'm used to seeing a "loading .pth" message and the progress bar here on a new instance, or a confirmation that weights were loaded. I did not see either, but there was no error either.
4 - When you go to train, it becomes abundantly clear that it's working with a newly initialized network (i.e. first epoch close to random, then verrry slow training progress). By contrast, a pre-trained model digs right in.
If possible, it would be great to get a confirmation message like in the Melas impl once weights are loaded: "Loaded pretrained weights for efficientnet-b5". Or, if not, a warning that it's a new network when pretrained=True was passed in?
Thanks much! Less
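For illustration, the kind of check requested above could look roughly like the sketch below. It does not reflect how geffnet is actually implemented: the PRETRAINED_URLS registry, the resolve_pretrained function, and the example.com URL are all hypothetical placeholders, and only the warning behavior is the point.

```python
import warnings

# Hypothetical registry: model name -> checkpoint URL, or None when no
# weights exist. The URL is a placeholder, not a real download location.
PRETRAINED_URLS = {
    "efficientnet_b2": "https://example.com/efficientnet_b2.pth",
    "efficientnet_b5": None,
}


def resolve_pretrained(model_name, pretrained):
    """Return the checkpoint URL to load, or None if the model stays randomly initialized."""
    if not pretrained:
        return None
    url = PRETRAINED_URLS.get(model_name)
    if url is None:
        # Make the silent failure described in this issue loud instead.
        warnings.warn(
            f"pretrained=True was requested for '{model_name}', but no weights "
            "are available; the model keeps its random initialization.",
            UserWarning,
            stacklevel=2,
        )
        return None
    print(f"Loading pretrained weights for {model_name} from {url}")
    return url
```

With a check like this, asking for efficientnet_b5 with pretrained=True would emit a UserWarning up front rather than leaving the user to infer the missing weights from near-random first-epoch results.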