osmr / imgclsmob

Sandbox for training deep learning networks
MIT License

No pre-trained weights for the DPN models? #56

Closed: djpecot closed this issue 4 years ago

djpecot commented 4 years ago

Hello,

I am using fastai for a computer vision project. I recently stumbled across this repo and found it a great addition to my arsenal of models. However, I am having trouble creating a pretrained DPN98. Someone in the fastai repo reported a similar problem with AlexNet (though they didn't specify where they got the model from). I also tried DPN68, with the same result.

The traceback I get looks like this:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-21-096da45f530c> in <module>
      1 # Setup the model and
----> 2 learn = cnn_learner(data, dpn98, metrics=[precision, accuracy], callback_fns=ShowGraph)
      3 learn.model_dir='/kaggle/working/'
      4 learn.freeze()

/opt/conda/lib/python3.6/site-packages/fastai/vision/learner.py in cnn_learner(data, base_arch, cut, pretrained, lin_ftrs, ps, custom_head, split_on, bn_final, init, concat_pool, **kwargs)
     96     meta = cnn_config(base_arch)
     97     model = create_cnn_model(base_arch, data.c, cut, pretrained, lin_ftrs, ps=ps, custom_head=custom_head,
---> 98         bn_final=bn_final, concat_pool=concat_pool)
     99     learn = Learner(data, model, **kwargs)
    100     learn.split(split_on or meta['split'])

/opt/conda/lib/python3.6/site-packages/fastai/vision/learner.py in create_cnn_model(base_arch, nc, cut, pretrained, lin_ftrs, ps, custom_head, bn_final, concat_pool)
     84     body = create_body(base_arch, pretrained, cut)
     85     if custom_head is None:
---> 86         nf = num_features_model(nn.Sequential(*body.children())) * (2 if concat_pool else 1)
     87         head = create_head(nf, nc, lin_ftrs, ps=ps, concat_pool=concat_pool, bn_final=bn_final)
     88     else: head = custom_head

/opt/conda/lib/python3.6/site-packages/fastai/callbacks/hooks.py in num_features_model(m)
    118     sz = 64
    119     while True:
--> 120         try: return model_sizes(m, size=(sz,sz))[-1][1]
    121         except Exception as e:
    122             sz *= 2

/opt/conda/lib/python3.6/site-packages/fastai/callbacks/hooks.py in model_sizes(m, size)
    111     "Pass a dummy input through the model `m` to get the various sizes of activations."
    112     with hook_outputs(m) as hooks:
--> 113         x = dummy_eval(m, size)
    114         return [o.stored.shape for o in hooks]
    115 

/opt/conda/lib/python3.6/site-packages/fastai/callbacks/hooks.py in dummy_eval(m, size)
    106 def dummy_eval(m:nn.Module, size:tuple=(64,64)):
    107     "Pass a `dummy_batch` in evaluation mode in `m` with `size`."
--> 108     return m.eval()(dummy_batch(m, size))
    109 
    110 def model_sizes(m:nn.Module, size:tuple=(64,64))->Tuple[Sizes,Tensor,Hooks]:

/opt/conda/lib/python3.6/site-packages/fastai/callbacks/hooks.py in dummy_batch(m, size)
    101 def dummy_batch(m: nn.Module, size:tuple=(64,64))->Tensor:
    102     "Create a dummy batch to go through `m` with `size`."
--> 103     ch_in = in_channels(m)
    104     return one_param(m).new(1, ch_in, *size).requires_grad_(False).uniform_(-1.,1.)
    105 

/opt/conda/lib/python3.6/site-packages/fastai/torch_core.py in in_channels(m)
    261     for l in flatten_model(m):
    262         if hasattr(l, 'weight'): return l.weight.shape[1]
--> 263     raise Exception('No weight layer')
    264 
    265 class ModelOnCPU():

Exception: No weight layer
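
The last frame of the traceback walks the model's layers and raises if none of them has a weight attribute. The sketch below reproduces that check without any framework, to show when the exception fires. Layer and Weight are hypothetical stand-ins for torch modules and tensors, and flatten_model is assumed to be a simple depth-first walk over leaf layers, not fastai's actual implementation; only the three lines inside in_channels are taken from the traceback.

```python
class Weight:
    """Hypothetical stand-in for a parameter tensor; it only carries a shape."""
    def __init__(self, shape):
        self.shape = shape


class Layer:
    """Hypothetical stand-in for a module: optional weight, optional children."""
    def __init__(self, *children, weight_shape=None):
        self._children = list(children)
        if weight_shape is not None:
            self.weight = Weight(weight_shape)

    def children(self):
        return self._children


def flatten_model(m):
    # Assumed behaviour: return the leaf layers in depth-first order.
    kids = m.children()
    if not kids:
        return [m]
    leaves = []
    for child in kids:
        leaves.extend(flatten_model(child))
    return leaves


def in_channels(m):
    # Same logic as the last traceback frame (fastai/torch_core.py).
    for l in flatten_model(m):
        if hasattr(l, "weight"):
            return l.weight.shape[1]
    raise Exception("No weight layer")


conv = Layer(weight_shape=(64, 3, 7, 7))  # conv weight layout: (out, in, kH, kW)
pool = Layer()                            # pooling layers carry no weight

with_conv = Layer(Layer(conv, pool))
pool_only = Layer(Layer(pool))

print(in_channels(with_conv))  # 3

try:
    in_channels(pool_only)
except Exception as e:
    print(e)  # No weight layer
```

In other words, the exception means that the module handed to this check exposed no layer with a weight; the sketch does not say why that happened for the DPN models.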

I'll include the code that precedes this call to give you an idea of the environment I am running in. Note that this is on Kaggle.

from fastai.vision import *
# For more models
!pip install pytorchcv
from pytorchcv.model_provider import get_model as ptcv_get_model

# Data augs
tfms = get_transforms()

# Get databunch to feed to the network
data = ImageDataBunch.from_folder(path, valid_pct=0.2, size = 1028, bs = 2, ds_tfms = tfms, padding_mode='zeros').normalize(imagenet_stats)

# Custom arch
def dpn98(pretrained=False):
    return ptcv_get_model("dpn98", pretrained=False).features

# Get custom precision metric
precision = Precision(pos_label = 0)

# Setup the model and 
learn = cnn_learner(data, dpn98, metrics=[precision, accuracy], callback_fns=ShowGraph)

osmr commented 4 years ago

Hi. The error message is rather ambiguous. So you are using return ptcv_get_model("dpn98", pretrained=False).features?

djpecot commented 4 years ago

@osmr Ahhh, good catch, but even when I set pretrained = True, I get the same error:

# Custom arch
def dpn98(pretrained=True):
    return ptcv_get_model("dpn98", pretrained=True).features

# Get custom precision metric
precision = Precision(pos_label = 0)

# Setup the model and 
learn = cnn_learner(data, dpn98, metrics=[precision, accuracy], callback_fns=ShowGraph)
learn.freeze()

Here's the traceback. Interestingly, it successfully downloads the .pth file, but perhaps does not load it?

Downloading /root/.torch/models/dpn98-0553-52c55969.pth.zip from https://github.com/osmr/imgclsmob/releases/download/v0.0.17/dpn98-0553-52c55969.pth.zip...
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-13-887045df32a1> in <module>()
----> 1 learn = cnn_learner(data, dpn98, metrics=[precision, accuracy], callback_fns=ShowGraph)
      2 learn.freeze()

6 frames
/usr/local/lib/python3.6/dist-packages/fastai/vision/learner.py in cnn_learner(data, base_arch, cut, pretrained, lin_ftrs, ps, custom_head, split_on, bn_final, init, concat_pool, **kwargs)
     96     meta = cnn_config(base_arch)
     97     model = create_cnn_model(base_arch, data.c, cut, pretrained, lin_ftrs, ps=ps, custom_head=custom_head,
---> 98         bn_final=bn_final, concat_pool=concat_pool)
     99     learn = Learner(data, model, **kwargs)
    100     learn.split(split_on or meta['split'])

/usr/local/lib/python3.6/dist-packages/fastai/vision/learner.py in create_cnn_model(base_arch, nc, cut, pretrained, lin_ftrs, ps, custom_head, bn_final, concat_pool)
     84     body = create_body(base_arch, pretrained, cut)
     85     if custom_head is None:
---> 86         nf = num_features_model(nn.Sequential(*body.children())) * (2 if concat_pool else 1)
     87         head = create_head(nf, nc, lin_ftrs, ps=ps, concat_pool=concat_pool, bn_final=bn_final)
     88     else: head = custom_head

/usr/local/lib/python3.6/dist-packages/fastai/callbacks/hooks.py in num_features_model(m)
    118     sz = 64
    119     while True:
--> 120         try: return model_sizes(m, size=(sz,sz))[-1][1]
    121         except Exception as e:
    122             sz *= 2

/usr/local/lib/python3.6/dist-packages/fastai/callbacks/hooks.py in model_sizes(m, size)
    111     "Pass a dummy input through the model `m` to get the various sizes of activations."
    112     with hook_outputs(m) as hooks:
--> 113         x = dummy_eval(m, size)
    114         return [o.stored.shape for o in hooks]
    115 

/usr/local/lib/python3.6/dist-packages/fastai/callbacks/hooks.py in dummy_eval(m, size)
    106 def dummy_eval(m:nn.Module, size:tuple=(64,64)):
    107     "Pass a `dummy_batch` in evaluation mode in `m` with `size`."
--> 108     return m.eval()(dummy_batch(m, size))
    109 
    110 def model_sizes(m:nn.Module, size:tuple=(64,64))->Tuple[Sizes,Tensor,Hooks]:

/usr/local/lib/python3.6/dist-packages/fastai/callbacks/hooks.py in dummy_batch(m, size)
    101 def dummy_batch(m: nn.Module, size:tuple=(64,64))->Tensor:
    102     "Create a dummy batch to go through `m` with `size`."
--> 103     ch_in = in_channels(m)
    104     return one_param(m).new(1, ch_in, *size).requires_grad_(False).uniform_(-1.,1.)
    105 

/usr/local/lib/python3.6/dist-packages/fastai/torch_core.py in in_channels(m)
    261     for l in flatten_model(m):
    262         if hasattr(l, 'weight'): return l.weight.shape[1]
--> 263     raise Exception('No weight layer')
    264 
    265 class ModelOnCPU():

Exception: No weight layer

osmr commented 4 years ago

OK. My next guess: check why you cut the feature extractor out of the model but do not add a custom classifier.

djpecot commented 4 years ago

I apologize if my answer is not totally straightforward: I am super new to deep learning :P

Just to clarify terminology, the feature extractor is the body, while the classifier is the head, correct?

As for the feature extractor, I don't believe that the function I am calling cuts it out. At least, I can't tell from reading the traceback. I think it only replaces the head (see below) but keeps the body/feature extractor.

I looked into fastai's documentation and found that cnn_learner automatically cuts the model "at the last convolutional layer by default". It then adds a head that contains:

an AdaptiveConcatPool2d layer, a Flatten layer, blocks of [nn.BatchNorm1d, nn.Dropout, nn.Linear, nn.ReLU] layers.

I tested this to see for myself on a resnet34 architecture:

# Custom arch
def resnet34(pretrained=True):
    return ptcv_get_model("resnet34", pretrained=True).features

# Get custom precision metric
precision = Precision(pos_label = 0)

# Setup the model and 
model_name = 'resnet34_sd'
learn = cnn_learner(data, resnet34, metrics=[precision, accuracy], callback_fns=ShowGraph)

After calling the learn object, I get the following output for the head:

Sequential(
  (0): AdaptiveAvgPool2d(output_size=1)
  (1): AdaptiveMaxPool2d(output_size=1)
  (2): Flatten()
  (3): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (4): Dropout(p=0.25, inplace=False)
  (5): Linear(in_features=1024, out_features=512, bias=True)
  (6): ReLU(inplace=True)
  (7): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (8): Dropout(p=0.5, inplace=False)
  (9): Linear(in_features=512, out_features=2, bias=True)
)

From what I understand about fastai, cnn_learner builds a Learner whose output layer is sized automatically from data.c, the number of classes you are trying to classify.
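
The 1024 input features in the printed head follow from the line shown in the traceback, nf = num_features_model(...) * (2 if concat_pool else 1): the head concatenates average-pooled and max-pooled features, so its input is twice the body's channel count. A minimal arithmetic sketch, assuming the resnet34 body ends with 512 channels (inferred here from 1024 / 2, not stated in the thread); head_input_width is a hypothetical name:

```python
def head_input_width(body_channels, concat_pool=True):
    # With concat pooling, average- and max-pooled features are concatenated,
    # so the first head layer sees twice the body's channel count.
    return body_channels * (2 if concat_pool else 1)


print(head_input_width(512))         # 1024, matches BatchNorm1d(1024) above
print(head_input_width(512, False))  # 512
```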

osmr commented 4 years ago

Could you prepare a reproducible script with a little fake data to run? The corresponding archive can be shared, for example, via some file hosting service.

djpecot commented 4 years ago

@osmr Sorry for the delay. I just shared the sample Google Colab notebook that reproduces the error, called "DPN96 Sample Run.ipynb", with your email address and gave you edit privileges.

If anyone else wants to tinker with it, take a look at this link (view only) https://colab.research.google.com/drive/1yuBDP9y0hmRr-DGgArv8HpEOl_3v3t2i

osmr commented 4 years ago

OK. The main idea is that fastai is not smart enough to handle this model as it is. There is no need to feed it a feature extractor, but you do have to transform the model into a form that it understands. Examples:

import torch.nn as nn
from pytorchcv.model_provider import get_model as ptcv_get_model


def resnet10(pretrained=True):
    net = ptcv_get_model("resnet10", pretrained=pretrained)
    # Rebuild the model as a flat Sequential: features -> final_pool -> fc.
    net2 = nn.Sequential()
    # Drop the last submodule of net.features.
    net2.add_module("features", net.features[:-1])
    net2.add_module("final_pool", nn.AdaptiveAvgPool2d(output_size=1))
    net2.add_module("fc", nn.Linear(in_features=512, out_features=1000))
    return net2


def dpn68(pretrained=True):
    net = ptcv_get_model("dpn68", pretrained=pretrained)
    net2 = nn.Sequential()
    # Keep net.features whole for the DPN models.
    net2.add_module("features", net.features)
    net2.add_module("final_pool", nn.AdaptiveAvgPool2d(output_size=1))
    net2.add_module("fc", nn.Linear(in_features=832, out_features=1000))
    return net2


def dpn98(pretrained=True):
    net = ptcv_get_model("dpn98", pretrained=pretrained)
    net2 = nn.Sequential()
    net2.add_module("features", net.features)
    net2.add_module("final_pool", nn.AdaptiveAvgPool2d(output_size=1))
    net2.add_module("fc", nn.Linear(in_features=2688, out_features=1000))
    return net2
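
The three wrappers above differ only in the model name and the hard-coded in_features value (512, 832, 2688). As a sketch of the same wrapping pattern, the helper below reads the channel count from a dummy forward pass instead of hard-coding it. wrap_features is a hypothetical name, the 3-channel 224x224 dummy input is an assumption, and a tiny stand-in network replaces the pytorchcv model so that nothing has to be downloaded. The final_pool and fc modules are treated as placeholders that fastai is presumably expected to cut off and replace with its own head, so the wrapped model is not run end to end here.

```python
import torch
import torch.nn as nn


def wrap_features(features, num_classes=1000, input_size=224):
    """Hypothetical helper: wrap a feature extractor as features/final_pool/fc."""
    was_training = features.training
    features.eval()
    with torch.no_grad():
        # Assumes the extractor takes 3-channel images of this size.
        out = features(torch.zeros(1, 3, input_size, input_size))
    features.train(was_training)  # restore the original train/eval mode

    channels = out.shape[1]  # channel dimension of the final feature map
    net2 = nn.Sequential()
    net2.add_module("features", features)
    net2.add_module("final_pool", nn.AdaptiveAvgPool2d(output_size=1))
    net2.add_module("fc", nn.Linear(in_features=channels, out_features=num_classes))
    return net2


# Tiny stand-in for net.features, used only to exercise the helper.
toy_features = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)
net2 = wrap_features(toy_features, input_size=32)
print([name for name, _ in net2.named_children()])  # ['features', 'final_pool', 'fc']
print(net2.fc.in_features)  # 8
```

With a real model, something like wrap_features(ptcv_get_model("dpn98", pretrained=True).features) would be the analogous call, though that combination is untested here.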