PPPW / deep-learning-random-explore


PNAS-5 Large cannot be split #6

Closed hwasiti closed 5 years ago

hwasiti commented 5 years ago

A weird StopIteration error occurs only with this model.

Example:

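from fastai.vision import *  # fastai v1 star import; assumed to provide models, nn and cnn_learner
from fastai.vision.learner import model_meta

def identity(x): return x

def pnasnet5large(pretrained=False):
    pretrained = 'imagenet' if pretrained else None
    model = models.cadene_models.pretrainedmodels.pnasnet5large(pretrained=pretrained, num_classes=1000)
    model.logits = identity  # make logits a pass-through
    return nn.Sequential(model)  # the whole network becomes the wrapper's only child

# 'cut': None makes create_body search for a pooling layer (see the traceback below)
model_meta[pnasnet5large] =  { 'cut': None,
                               'split': lambda m: (list(m[0][0].children())[8], m[1]) }

# `data` is an existing DataBunch
learn = cnn_learner(data, base_arch=pnasnet5large, pretrained=False)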

Error:


---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-79-2126a555d117> in <module>
----> 1 learn = cnn_learner(data, base_arch=pnasnet5large, pretrained=False)

~/anaconda3/envs/fastai-v1/lib/python3.7/site-packages/fastai/vision/learner.py in cnn_learner(data, base_arch, cut, pretrained, lin_ftrs, ps, custom_head, split_on, bn_final, init, concat_pool, **kwargs)
     95     meta = cnn_config(base_arch)
     96     model = create_cnn_model(base_arch, data.c, cut, pretrained, lin_ftrs, ps=ps, custom_head=custom_head,
---> 97         split_on=split_on, bn_final=bn_final, concat_pool=concat_pool)
     98     learn = Learner(data, model, **kwargs)
     99     learn.split(split_on or meta['split'])

~/anaconda3/envs/fastai-v1/lib/python3.7/site-packages/fastai/vision/learner.py in create_cnn_model(base_arch, nc, cut, pretrained, lin_ftrs, ps, custom_head, split_on, bn_final, concat_pool)
     81         split_on:Optional[SplitFuncOrIdxList]=None, bn_final:bool=False, concat_pool:bool=True):
     82     "Create custom convnet architecture"
---> 83     body = create_body(base_arch, pretrained, cut)
     84     if custom_head is None:
     85         nf = num_features_model(nn.Sequential(*body.children())) * (2 if concat_pool else 1)

~/anaconda3/envs/fastai-v1/lib/python3.7/site-packages/fastai/vision/learner.py in create_body(arch, pretrained, cut)
     57     if cut is None:
     58         ll = list(enumerate(model.children()))
---> 59         cut = next(i for i,o in reversed(ll) if has_pool_type(o))
     60     if   isinstance(cut, int):      return nn.Sequential(*list(model.children())[:cut])
     61     elif isinstance(cut, Callable): return cut(model)

StopIteration:

PPPW commented 5 years ago

Hi @hwasiti, I haven't updated the repo for a while. I have actually already added these models to fastai in "fastai/fastai/vision/models/cadene_models.py", so you can use them directly.

Your error is caused by a change in fastai's create_body function. You can use "cadene_models.py" directly; the trick is to set cut to noop rather than None.

hwasiti commented 5 years ago

Hi @PPPW, yes, I saw that you added it to fastai/fastai/vision/models/cadene_models.py, and it was much nicer to use it from there. But when I tried it and checked whether there were proper split points with get_groups(nn.Sequential(*learn.model[0], *learn.model[1]), learn.layer_groups), it returned nothing.

So I supposed that differential learning rates might not work. That's why I tried the manual split method using the code above.

Is my understanding correct that no splitting into groups happens with your suggestion?

Also, a few other models, such as se_resnext101_32x4d, have not been added to fastai/fastai/vision/models/cadene_models.py yet, but that one worked fine with the code above without any issues.

I haven't quite understood your workaround: "You can use cadene_models.py directly, the trick is to set cut to noop rather than None."

PPPW commented 5 years ago

Hi @hwasiti, I see. Are you talking about "pnasnet5large"? Cadene's implementation of this model is different from the others, so we have to do something different. One way is to set cut to noop rather than None in model_meta[pnasnet5large] (the code in your original post). This is needed because of a change in create_body; my notebook was out of date.
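For readers following along, here is a minimal pure-Python sketch of the create_body branch shown in the traceback above. It is not fastai's code: `create_body_sketch`, the string "layers", and the `has_pool_type` stand-in are hypothetical and only mimic the control flow. It illustrates that `next()` over a search with no match raises StopIteration, while a callable cut such as noop skips the search entirely.

```python
def noop(x):
    """Local stand-in for fastai's noop: returns its argument unchanged."""
    return x


def has_pool_type(layer):
    # Hypothetical predicate: here a "layer" is just a string name.
    # The real fastai helper inspects module types instead.
    return "pool" in layer.lower()


def create_body_sketch(children, cut=None):
    """Mimic the cut handling seen in the traceback (assumed control flow)."""
    if cut is None:
        ll = list(enumerate(children))
        # next() raises StopIteration when the generator yields nothing,
        # i.e. when no child satisfies the predicate.
        cut = next(i for i, o in reversed(ll) if has_pool_type(o))
    if isinstance(cut, int):
        return children[:cut]   # keep everything before the cut index
    elif callable(cut):
        return cut(children)    # a callable decides what the body is
    raise TypeError("cut must be None, an int, or a callable")


layers = ["conv_0", "cell_0", "cell_1"]  # made-up names, no pooling layer

try:
    create_body_sketch(layers, cut=None)
except StopIteration:
    print("cut=None: nothing matched, next() raised StopIteration")

# With a callable cut the search is never run, so no exception is raised.
print(create_body_sketch(layers, cut=noop))
```

Applied to the code in the original post, this corresponds to changing 'cut': None to 'cut': noop in model_meta[pnasnet5large].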

Alternatively, you can import and use the models directly, just like ResNet. This notebook is the one I used to test "cadene_models.py" before submitting the pull request, so you can find some examples there.

Hope it helps!

hwasiti commented 5 years ago

Great!

Both of your methods worked.

It did not return any groups because, for pnasnet5large, I had to check the groups with get_groups(nn.Sequential(*learn.model[0][0].children(), *learn.model[1]), learn.layer_groups), which is a bit different from the other models because of the extra level of nesting.
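The extra level of nesting can be illustrated without PyTorch. In this sketch, `Module` is a hypothetical stand-in for nn.Module / nn.Sequential and the layer names are made up; only the star-unpacking pattern from the two get_groups calls in this thread is taken from the source.

```python
class Module:
    """Hypothetical stand-in for an nn.Module that holds child modules."""

    def __init__(self, name, *kids):
        self.name = name
        self.kids = list(kids)

    def children(self):
        return iter(self.kids)

    def __getitem__(self, i):   # mimics nn.Sequential indexing
        return self.kids[i]

    def __iter__(self):         # mimics iterating over an nn.Sequential
        return iter(self.kids)


def names(mods):
    return [m.name for m in mods]


# pnasnet5large() wraps the whole network in nn.Sequential(model),
# so the body has exactly one child: the network itself.
pnas = Module("PNASNet5Large", Module("conv_0"), Module("cell_0"), Module("cell_1"))
body = Module("Sequential", pnas)
head = Module("Sequential", Module("pool"), Module("linear"))
model = Module("Sequential", body, head)   # plays the role of learn.model

# Pattern used for the other models: unpacks only the wrapper's single child.
flat_generic = names([*model[0], *model[1]])

# Pattern needed here: go one level deeper before unpacking.
flat_pnas = names([*model[0][0].children(), *model[1]])

print(flat_generic)
print(flat_pnas)
```

The first pattern sees the wrapped network as one opaque item, while the second exposes its individual children, which is presumably what the layer groups are compared against.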

Many thanks!

PPPW commented 5 years ago

Sounds good, glad it helps!