sacmehta / EdgeNets

This repository contains the source code of our work on designing efficient CNNs for computer vision

MIT License

convert pytorch model into onnx version [Detection part] #24

Closed zechendev closed 4 years ago

zechendev commented 4 years ago

Thank you for sharing this great code. Right now, I want to deploy your model on the TVM platform, which may require converting the PyTorch model to ONNX. The code I used is below.

weights = 'model/detection/model_zoo/espnetv2/espnetv2_s_2.0_pascal_300x300.pth'
model = ssd(args, cfg)
pretrained_dict = torch.load(weights, map_location=torch.device('cpu'))
model.load_state_dict(pretrained_dict)
PATH_ONNX = 'deploy.onnx'
dummy_input = torch.randn(1, 3, 300, 300, device='cpu')
torch.onnx.export(model, dummy_input, PATH_ONNX, input_names=['image'], output_names=['output'], verbose=True, opset_version=11)

However, an error occurs during the conversion; the output is below:

~/software/EdgeNets/nn_layers/eesp.py:139: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if w2 == w1:
~/software/EdgeNets/nn_layers/eesp.py:89: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if expanded.size() == input.size():
~/software/EdgeNets/nn_layers/efficient_pyramid_pool.py:44: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  h_s = int(math.ceil(height * self.scales[i]))
~/software/EdgeNets/nn_layers/efficient_pyramid_pool.py:45: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  w_s = int(math.ceil(width * self.scales[i]))

    raise RuntimeError("Failed to export an ONNX attribute, "
RuntimeError: Failed to export an ONNX attribute, since it's not constant, please try to make things (e.g., kernel size) static if possible

Could you give me some tips so that I can figure out the problem? Thank you for your help!

lawo123 commented 4 years ago

I encountered the same problem. In EdgeNets/nn_layers/efficient_pyramid_pool.py, change
h = F.adaptive_avg_pool2d(h, output_size=(height, width))
to
h = F.adaptive_avg_pool2d(h, output_size=(int(height), int(width)))
and in your export call, change
output_names=['output'], verbose=True, opset_version=11)
to
output_names=['output'], verbose=True, opset_version=11, operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)
With these two changes the export works. However, when converting the ONNX model to other formats, F.adaptive_avg_pool2d can cause problems; you can try replacing it with avg_pool2d.

sacmehta commented 4 years ago

The problem you are encountering arises because PyTorch supports fractional pooling, i.e., you can pool tensors to sizes such as 0.1 of the actual size. ONNX, however, does not support such cases.

We modified the code a bit while we converted our models to ONNX.

import torch
from torch import nn
import math
from torch.nn import functional as F
from nn_layers.cnn_utils import CBR, BR, Shuffle

class Identity(nn.Module):
    def forward(self, x):
        return x

class Interpolate(nn.Module):
    def __init__(self, h, w):
        super(Interpolate, self).__init__()
        self.h = h
        self.w = w

    def forward(self, x):
        # F.upsample is deprecated; F.interpolate takes the same arguments
        return F.interpolate(x, size=(self.h, self.w), mode='bilinear', align_corners=False)
        #return F.interpolate(x, size=(self.h, self.w), mode='nearest')#mode='bilinear', align_corners=False)

class EfficientPyrPool(nn.Module):
    """Efficient Pyramid Pooling Module"""

    def __init__(self, in_planes, proj_planes, out_planes, inp_size=None, out_size=None,
                 scales=[2.0, 1.5, 1.0, 0.5, 0.1],  last_layer_br=True):
        super(EfficientPyrPool, self).__init__()
        self.stages = nn.ModuleList()
        # sort into a new list so the default argument (or the caller's list) is not mutated
        scales = sorted(scales, reverse=True)

        self.projection_layer = CBR(in_planes, proj_planes, 1, 1)
        for _ in enumerate(scales):
            self.stages.append(nn.Conv2d(proj_planes, proj_planes, kernel_size=3, stride=1, padding=1, bias=False, groups=proj_planes))
        self.merge_layer = nn.Sequential(
            # perform one big batch normalization instead of p small ones
            BR(proj_planes * len(scales)),
            Shuffle(groups=len(scales)),
            CBR(proj_planes * len(scales), proj_planes, 3, 1, groups=proj_planes),
            nn.Conv2d(proj_planes, out_planes, kernel_size=1, stride=1, bias=not last_layer_br),
        )

        if last_layer_br:
            self.br = BR(out_planes)

        layer_scale_a = []
        layer_scale_b = []
        for i, sc in enumerate(scales):
            if sc < 1.0:
                layer_scale_a.append(AdaptivePool(input_size=inp_size, output_size=out_size[i])) #(nn.AdaptiveAvgPool2d(output_size=out_size[i]))
                layer_scale_b.append(Interpolate(inp_size[0], inp_size[1]))
            elif sc > 1.0:
                h, w = out_size[i]
                layer_scale_a.append(Interpolate(h, w))
                layer_scale_b.append(AdaptivePool(output_size=inp_size, input_size=out_size[i]))#(nn.AdaptiveAvgPool2d(output_size=inp_size))
            else:
                layer_scale_a.append(Identity())
                layer_scale_b.append(Identity())

        self.layer_scale_a = layer_scale_a
        self.layer_scale_b = layer_scale_b

        self.last_layer_br = last_layer_br
        self.scales = scales

    def forward(self, x):
        hs = []
        x = self.projection_layer(x)

        for i, stage in enumerate(self.stages):
            h = self.layer_scale_a[i](x)
            h = stage(h)
            h = self.layer_scale_b[i](h)
            hs.append(h)

        out = torch.cat(hs, dim=1)
        out = self.merge_layer(out)
        if self.last_layer_br:
            return self.br(out)
        return out

class AdaptivePool(nn.Module):
    __constants__ = ['stride_h', 'stride_w', ]
    def __init__(self, input_size, output_size):
        super(AdaptivePool, self).__init__()
        stride_h = int(math.ceil(input_size[0] / output_size[0]))
        stride_w = int(math.ceil(input_size[1] / output_size[1]))
        k_size_h = input_size[0] - (output_size[0] - 1) * stride_h
        k_size_w = input_size[1] - (output_size[1] - 1) * stride_w
        #print(stride_h, stride_w, k_size_h, k_size_w)
        self.k_size_h = k_size_h
        self.k_size_w = k_size_w
        self.stride_h = stride_h
        self.stride_w = stride_w
        #self.layer = nn.AvgPool2d(kernel_size=(k_size_h, k_size_w), stride=(stride_h, stride_w))

    def forward(self, x):
        return nn.AvgPool2d(kernel_size=(self.k_size_h, self.k_size_w), stride=(self.stride_h, self.stride_w))(x)
        #return self.layer(x)
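
The `AdaptivePool` module above replaces adaptive pooling with a fixed `AvgPool2d` whose kernel and stride are computed once from known sizes. That arithmetic can be checked without PyTorch. The sketch below works along a single dimension; `pool_params` and `pooled_length` are hypothetical helpers, not part of EdgeNets, and the `floor` rounding option is an assumption added only for comparison, since the code above uses `ceil`.

```python
import math

def pool_params(in_size, out_size, rounding=math.ceil):
    """Return (kernel, stride) for a fixed average pool along one dimension.

    Mirrors the arithmetic in AdaptivePool.__init__; `rounding` is a
    hypothetical knob added here for comparison (the code above uses ceil).
    """
    stride = int(rounding(in_size / out_size))
    kernel = in_size - (out_size - 1) * stride
    return kernel, stride

def pooled_length(in_size, kernel, stride):
    """Output length of an unpadded pooling layer (standard formula)."""
    return (in_size - kernel) // stride + 1

# Integer ratio: 16 -> 4 gives kernel 4, stride 4 and exactly 4 outputs.
k, s = pool_params(16, 4)
print(k, s, pooled_length(16, k, s))

# Non-integer ratio: 29 -> 19 with ceil yields a non-positive kernel,
# which a pooling layer cannot use; floor keeps the kernel positive.
print(pool_params(29, 19, math.ceil))
print(pool_params(29, 19, math.floor))
```

When the input size is an integer multiple of the output size, both roundings give the same kernel and stride. For other ratios it may be worth printing the computed kernel sizes before exporting.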

sacmehta commented 4 years ago

and this is the corresponding segmentation code:

# ============================================
__author__ = "Sachin Mehta"
__maintainer__ = "Sachin Mehta"
# ============================================

import torch
from torch.nn import init
from nn_layers.espnet_utils import *
from nn_layers.efficient_pyramid_pool import EfficientPyrPool
from nn_layers.efficient_pt import EfficientPWConv
from model.classification.espnetv2 import EESPNet
from utilities.print_utils import *
from torch.nn import functional as F
import math

def get_sizes(height, width):
    scales = [4.0, 2.0, 1.0, 0.5, 0.25] #[2.0, 1.5, 1.0, 0.5, 0.25]
    out_sizes = []
    for scale in scales:
        h_s = int(math.ceil(height * scale))
        w_s = int(math.ceil(width * scale))
#        h_s = h_s if h_s > 5 else 5
#        w_s = w_s if w_s > 5 else 5

        out_sizes.append((h_s, w_s))

    return out_sizes

class ESPNetv2Segmentation(nn.Module):
    '''
    This class defines the ESPNetv2 architecture for Semantic Segmentation
    '''

    def __init__(self, args, classes=21, dataset='pascal'):
        super().__init__()

        # =============================================================
        #                       BASE NETWORK
        # =============================================================
        self.base_net = EESPNet(args) #imagenet model
        del self.base_net.classifier
        del self.base_net.level5
        del self.base_net.level5_0
        config = self.base_net.config

        #=============================================================
        #                   SEGMENTATION NETWORK
        #=============================================================
        dec_feat_dict={
            'pascal': 16,
            'city': 16,
            'coco': 32
        }
        base_dec_planes = dec_feat_dict[dataset]
        dec_planes = [4*base_dec_planes, 3*base_dec_planes, 2*base_dec_planes, classes]
        pyr_plane_proj = min(classes //2, base_dec_planes)

        im_height, im_width = args.im_size

        scale_factor = 16
        height = int(im_height // scale_factor)
        width = int(im_width // scale_factor)
        out_sizes = get_sizes(height, width)
        in_sizes = (height, width)

        self.bu_dec_l1 = EfficientPyrPool(in_planes=config[3], proj_planes=pyr_plane_proj, out_planes=dec_planes[0], out_size=out_sizes, inp_size=in_sizes)

        scale_factor = 8
        height = int(im_height // scale_factor)
        width = int(im_width // scale_factor)
        out_sizes = get_sizes(height, width)
        in_sizes = (height, width)
        self.bu_dec_l2 = EfficientPyrPool(in_planes=dec_planes[0], proj_planes=pyr_plane_proj, out_planes=dec_planes[1], out_size=out_sizes, inp_size=in_sizes)
        self.merge_enc_dec_l2 = EfficientPWConv(config[2], dec_planes[0], groups=math.gcd(config[2], dec_planes[0]), inp_size=in_sizes)

        scale_factor = 4
        height = int(im_height // scale_factor)
        width = int(im_width // scale_factor)
        out_sizes = get_sizes(height, width)
        in_sizes = (height, width)
        self.bu_dec_l3 = EfficientPyrPool(in_planes=dec_planes[1], proj_planes=pyr_plane_proj,
                                          out_planes=dec_planes[2],  out_size=out_sizes, inp_size=in_sizes)
        self.merge_enc_dec_l3 = EfficientPWConv(config[1], dec_planes[1], groups=math.gcd(config[1], dec_planes[1]), inp_size=in_sizes)

        scale_factor = 2
        height = int(im_height // scale_factor)
        width = int(im_width // scale_factor)
        out_sizes = get_sizes(height, width)
        in_sizes = (height, width)
        self.bu_dec_l4 = EfficientPyrPool(in_planes=dec_planes[2], proj_planes=pyr_plane_proj,
                                          out_planes=dec_planes[3], out_size=out_sizes, inp_size=in_sizes, last_layer_br=False)
        self.merge_enc_dec_l4 = EfficientPWConv(config[0], dec_planes[2], groups=math.gcd(config[0], dec_planes[2]), inp_size=in_sizes)

        self.bu_br_l2 = nn.Sequential(nn.BatchNorm2d(dec_planes[0]),
                                      nn.PReLU(dec_planes[0])
                                      )
        self.bu_br_l3 = nn.Sequential(nn.BatchNorm2d(dec_planes[1]),
                                      nn.PReLU(dec_planes[1])
                                      )
        self.bu_br_l4 = nn.Sequential(nn.BatchNorm2d(dec_planes[2]),
                                      nn.PReLU(dec_planes[2])
                                      )
        self.init_params()

    def init_params(self):
        '''
        Function to initialize the parameters
        '''
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                init.constant_(m.weight, 1)
                init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                init.normal_(m.weight, std=0.001)
                if m.bias is not None:
                    init.constant_(m.bias, 0)

    def get_basenet_params(self):
        modules_base = [self.base_net]
        for i in range(len(modules_base)):
            for m in modules_base[i].named_modules():
                if isinstance(m[1], nn.Conv2d) or isinstance(m[1], nn.BatchNorm2d) or isinstance(m[1], nn.PReLU):
                    for p in m[1].parameters():
                        if p.requires_grad:
                            yield p

    def get_segment_params(self):
        modules_seg = [self.bu_dec_l1, self.bu_dec_l2, self.bu_dec_l3, self.bu_dec_l4,
                       self.merge_enc_dec_l4, self.merge_enc_dec_l3, self.merge_enc_dec_l2,
                       self.bu_br_l4, self.bu_br_l3, self.bu_br_l2]
        for i in range(len(modules_seg)):
            for m in modules_seg[i].named_modules():
                if isinstance(m[1], nn.Conv2d) or isinstance(m[1], nn.BatchNorm2d) or isinstance(m[1], nn.PReLU):
                    for p in m[1].parameters():
                        if p.requires_grad:
                            yield p

    def forward(self, x):
        '''
        :param x: Receives the input RGB image
        :return: a C-dimensional vector, C=# of classes
        '''
        enc_out_l1 = self.base_net.level1(x)  # 112

        enc_out_l2 = self.base_net.level2_0(enc_out_l1, x, down_times=2)  # 56

        enc_out_l3_0 = self.base_net.level3_0(enc_out_l2, x,  down_times=3)  # down-sample
        for i, layer in enumerate(self.base_net.level3):
            if i == 0:
                enc_out_l3 = layer(enc_out_l3_0)
            else:
                enc_out_l3 = layer(enc_out_l3)

        enc_out_l4_0 = self.base_net.level4_0(enc_out_l3, x,  down_times=4)  # down-sample
        for i, layer in enumerate(self.base_net.level4):
            if i == 0:
                enc_out_l4 = layer(enc_out_l4_0)
            else:
                enc_out_l4 = layer(enc_out_l4)

        # bottom-up decoding
        bu_out = self.bu_dec_l1(enc_out_l4)

        # Decoding block
        bu_out = F.interpolate(bu_out, scale_factor=2, mode='nearest')#mode='bilinear', align_corners=False)
        enc_out_l3_proj = self.merge_enc_dec_l2(enc_out_l3)
        bu_out = enc_out_l3_proj + bu_out
        bu_out = self.bu_br_l2(bu_out)
        bu_out = self.bu_dec_l2(bu_out)

        #decoding block
        bu_out = F.interpolate(bu_out, scale_factor=2, mode='nearest')#mode='bilinear'#, align_corners=False)
        enc_out_l2_proj = self.merge_enc_dec_l3(enc_out_l2)
        bu_out = enc_out_l2_proj + bu_out
        bu_out = self.bu_br_l3(bu_out)
        bu_out = self.bu_dec_l3(bu_out)

        # decoding block
        bu_out = F.interpolate(bu_out, scale_factor=2, mode='nearest')#mode='bilinear', align_corners=False)
        enc_out_l1_proj = self.merge_enc_dec_l4(enc_out_l1)
        bu_out = enc_out_l1_proj + bu_out
        bu_out = self.bu_br_l4(bu_out)
        bu_out  = self.bu_dec_l4(bu_out)

        return F.interpolate(bu_out, scale_factor=2, mode='nearest')#mode='bilinear', align_corners=False)

def espnetv2_seg(args):
    classes = args.classes
    weights = args.weights
    dataset=args.dataset
    model = ESPNetv2Segmentation(args, classes=classes, dataset=dataset)
    if weights:
        import os
        if os.path.isfile(weights):
            num_gpus = torch.cuda.device_count()
            device = 'cuda' if num_gpus >= 1 else 'cpu'
            pretrained_dict = torch.load(weights, map_location=torch.device(device))
        else:
            print_error_message('Weight file does not exist at {}. Please check. Exiting!!'.format(weights))
            exit()
        print_info_message('Loading pretrained basenet model weights')
        basenet_dict = model.base_net.state_dict()
        model_dict = model.state_dict()
        overlap_dict = {k: v for k, v in pretrained_dict.items() if k in basenet_dict}
        if len(overlap_dict) == 0:
            print_error_message('No overlapping weights between model file and pretrained weight file. Please check')
            exit()
        print_info_message('{:.2f} % of weights copied from basenet to segnet'.format(len(overlap_dict) * 1.0/len(model_dict) * 100))
        basenet_dict.update(overlap_dict)
        model.base_net.load_state_dict(basenet_dict)
        print_info_message('Pretrained basenet model loaded!!')
    else:
        print_warning_message('Training from scratch!!')
    return model

if __name__ == "__main__":
    import torch
    import argparse

    parser = argparse.ArgumentParser(description='Testing')
    args = parser.parse_args()

    args.classes = 21
    args.s = 2.0
    args.weights='' #'../classification/model_zoo/espnet/espnetv2_s_2.0_imagenet_224x224.pth'
    args.dataset='pascal'
    args.im_size = (256, 256)

    input = torch.Tensor(1, 3, 256, 256)
    model = espnetv2_seg(args)
    weight_dict = torch.load('./model_zoo/espnetv2/espnetv2_s_2.0_pascal_256x256.pth', map_location=torch.device('cpu'))
    model.load_state_dict(weight_dict)

    out = model(input)
    print_info_message(out.size())
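
To see what `get_sizes` feeds into each `EfficientPyrPool`, the helper can be run on its own. The sketch below copies the function in condensed form and evaluates it for the 256x256 input used in the `__main__` block above; the divisibility check at the end is an added illustration and not part of the original code.

```python
import math

def get_sizes(height, width):
    # Same scales as the segmentation code above.
    scales = [4.0, 2.0, 1.0, 0.5, 0.25]
    return [(int(math.ceil(height * s)), int(math.ceil(width * s))) for s in scales]

im_height, im_width = 256, 256
for scale_factor in (16, 8, 4, 2):
    h, w = im_height // scale_factor, im_width // scale_factor
    sizes = get_sizes(h, w)
    # Each target size divides, or is divided by, the input size, so a
    # fixed-kernel average pool can reproduce the requested size exactly.
    ok = all(max(h, hs) % min(h, hs) == 0 for hs, _ in sizes)
    print(scale_factor, (h, w), sizes, ok)
```

Because every size is fixed at construction time, no tensor shape has to be read inside `forward`, which is presumably what makes the traced graph exportable.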

sacmehta commented 4 years ago

And this is the edited eesp.py file:

from torch.nn import init
import torch.nn.functional as F
from nn_layers.espnet_utils import *
import math
import torch
from model.classification import espnetv2_config as config

#============================================
__author__ = "Sachin Mehta"
__maintainer__ = "Sachin Mehta"
#============================================

config_inp_reinf = config.config_inp_reinf

class EESP(nn.Module):
    '''
    This class defines the EESP block, which is based on the following principle
        REDUCE ---> SPLIT ---> TRANSFORM --> MERGE
    '''

    def __init__(self, nIn, nOut, stride=1, k=4, r_lim=7, down_method='esp'): #down_method --> ['avg' or 'esp']
        '''
        :param nIn: number of input channels
        :param nOut: number of output channels
        :param stride: factor by which we should skip (useful for down-sampling). If 2, then down-samples the feature map by 2
        :param k: # of parallel branches
        :param r_lim: A maximum value of receptive field allowed for EESP block
        :param down_method: Downsample or not (equivalent to say stride is 2 or not)
        '''
        super().__init__()
        self.stride = stride
        n = int(nOut / k)
        n1 = nOut - (k - 1) * n
        assert down_method in ['avg', 'esp'], 'One of these is supported (avg or esp)'
        assert n == n1, "n(={}) and n1(={}) should be equal for Depth-wise Convolution ".format(n, n1)
        self.proj_1x1 = CBR(nIn, n, 1, stride=1, groups=k)

        # (For convenience) Mapping between dilation rate and receptive field for a 3x3 kernel
        map_receptive_ksize = {3: 1, 5: 2, 7: 3, 9: 4, 11: 5, 13: 6, 15: 7, 17: 8}
        self.k_sizes = list()
        for i in range(k):
            ksize = int(3 + 2 * i)
            # After reaching the receptive field limit, fall back to the base kernel size of 3 with a dilation rate of 1
            ksize = ksize if ksize <= r_lim else 3
            self.k_sizes.append(ksize)
        # sort (in ascending order) these kernel sizes based on their receptive field
        # This enables us to ignore the kernels (3x3 in our case) with the same effective receptive field in hierarchical
        # feature fusion because kernels with 3x3 receptive fields do not have gridding artifacts.
        self.k_sizes.sort()
        self.spp_dw = nn.ModuleList()
        for i in range(k):
            d_rate = map_receptive_ksize[self.k_sizes[i]]
            self.spp_dw.append(CDilated(n, n, kSize=3, stride=stride, groups=n, d=d_rate))
        # Performing a group convolution with K groups is the same as performing K point-wise convolutions
        self.conv_1x1_exp = CB(nOut, nOut, 1, 1, groups=k)
        self.br_after_cat = BR(nOut)
        self.module_act = nn.PReLU(nOut)
        self.downAvg = True if down_method == 'avg' else False

    def forward(self, input):
        '''
        :param input: input feature map
        :return: transformed feature map
        '''

        # Reduce --> project high-dimensional feature maps to low-dimensional space
        proj = self.proj_1x1(input)

        # i.e. Split --> Transform --> HFF
        branch_0 = self.spp_dw[0](proj)
        branch_1 = self.spp_dw[1](proj)
        branch_2 = self.spp_dw[2](proj)
        branch_3 = self.spp_dw[3](proj)

        # HFF
        branch_1 = branch_0 + branch_1
        branch_2 = branch_1 + branch_2
        branch_3 = branch_2 + branch_3

        expanded = self.br_after_cat(torch.cat([branch_0, branch_1, branch_2, branch_3], 1))
        expanded  = self.conv_1x1_exp(expanded)

        if self.downAvg:
            return expanded

        expanded = expanded + input
        return self.module_act(expanded)

class DownSampler(nn.Module):
    '''
    Down-sampling function that has three parallel branches: (1) avg pooling,
    (2) EESP block with stride of 2 and (3) efficient long-range connection with the input.
    The output feature maps of branches from (1) and (2) are concatenated and then additively fused with (3) to produce
    the final output.
    '''

    def __init__(self, nin, nout, k=4, r_lim=9, reinf=True):
        '''
            :param nin: number of input channels
            :param nout: number of output channels
            :param k: # of parallel branches
            :param r_lim: A maximum value of receptive field allowed for EESP block
            :param reinf: Use long range shortcut connection with the input or not.
        '''
        super().__init__()
        nout_new = nout - nin
        self.eesp = EESP(nin, nout_new, stride=2, k=k, r_lim=r_lim, down_method='avg')
        self.avg = nn.AvgPool2d(kernel_size=3, padding=1, stride=2)
        if reinf:
            self.inp_reinf = nn.Sequential(
                CBR(config_inp_reinf, config_inp_reinf, 3, 1),
                CB(config_inp_reinf, nout, 1, 1)
            )
        self.act =  nn.PReLU(nout)

    def forward(self, input, input2=None, down_times=1):
        '''
        :param input: input feature map
        :return: feature map down-sampled by a factor of 2
        '''
        avg_out = self.avg(input)
        eesp_out = self.eesp(input)
        output = torch.cat([avg_out, eesp_out], 1)

        if input2 is not None:
            for i in range(down_times):
                input2 = F.avg_pool2d(input2, kernel_size=3, padding=1, stride=2)
            output = output + self.inp_reinf(input2)

        return self.act(output)
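
The edited `DownSampler.forward` pools `input2` a fixed number of times (`down_times`), which appears to replace the run-time shape comparison (`if w2 == w1`) that triggered the `TracerWarning` in the original report. The sketch below applies the standard pooling output-size formula to show that the number of steps is known ahead of time for a given input size. The helper names are hypothetical, and the 224-pixel input is an assumption taken from the `# 112` and `# 56` comments in the segmentation code.

```python
def avg_pool_out(size, kernel=3, stride=2, padding=1):
    """Output length of avg_pool2d along one dimension (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

def sizes_after(size, down_times):
    """Sizes seen after each of `down_times` pooling steps."""
    out = []
    for _ in range(down_times):
        size = avg_pool_out(size)
        out.append(size)
    return out

print(sizes_after(224, 4))  # [112, 56, 28, 14]
```

Passing `down_times=2`, `3` and `4` for levels 2, 3 and 4, as the segmentation `forward` does, matches the second, third and fourth entries of this list.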

lawo123 commented 4 years ago

I modified only efficient_pyramid_pool.py and did not change any other file.

#============================================
__author__ = "Sachin Mehta"
__maintainer__ = "Sachin Mehta"
#============================================

import torch
from torch import nn
import math
from torch.nn import functional as F
from nn_layers.cnn_utils import CBR, BR, Shuffle
import numpy as np

class EfficientPyrPool(nn.Module):
    """Efficient Pyramid Pooling Module"""

    def __init__(self, in_planes, proj_planes, out_planes, scales=[2.0, 1.5, 1.0, 0.5, 0.1], last_layer_br=True):
        super(EfficientPyrPool, self).__init__()
        self.stages = nn.ModuleList()
        scales.sort(reverse=True)

        self.projection_layer = CBR(in_planes, proj_planes, 1, 1)
        for _ in enumerate(scales):
            self.stages.append(nn.Conv2d(proj_planes, proj_planes, kernel_size=3, stride=1, padding=1, bias=False, groups=proj_planes))

        self.merge_layer = nn.Sequential(
            # perform one big batch normalization instead of p small ones
            BR(proj_planes * len(scales)),
            Shuffle(groups=len(scales)),
            CBR(proj_planes * len(scales), proj_planes, 3, 1, groups=proj_planes),
            nn.Conv2d(proj_planes, out_planes, kernel_size=1, stride=1, bias=not last_layer_br),
        )
        if last_layer_br:
            self.br = BR(out_planes)
        self.last_layer_br = last_layer_br
        self.scales = scales

    def forward(self, x):
        hs = []
        x = self.projection_layer(x)
        height, width = x.size()[2:]
        height = int(height)
        width = int(width)
        for i, stage in enumerate(self.stages):
            h_s = int(math.ceil(height * self.scales[i]))
            w_s = int(math.ceil(width * self.scales[i]))
            h_s = h_s if h_s > 5 else 5
            w_s = w_s if w_s > 5 else 5
            if self.scales[i] < 1.0:
                input_height, input_width = x.size()[2:]
                # h = F.adaptive_avg_pool2d(x, output_size=(h_s, w_s))
                h = self.Adaptive_avg_pool2d(x, np.array([input_height, input_width]),
                                             np.array([h_s, w_s]))

                h = stage(h)
                h = F.interpolate(h, (height, width), mode='bilinear', align_corners=True)
            elif self.scales[i] > 1.0:
                h = F.interpolate(x, (h_s, w_s), mode='bilinear', align_corners=True)
                h = stage(h)
                # h = F.adaptive_avg_pool2d(h, output_size=(height, width))
                input_height, input_width = h.size()[2:]
                output_height, output_width = x.size()[2:]
                h = self.Adaptive_avg_pool2d(h, np.array([input_height, input_width]), np.array([output_height, output_width]))
                # print(h.shape)
            else:
                h = stage(x)
            hs.append(h)

        out = torch.cat(hs, dim=1)
        out = self.merge_layer(out)
        if self.last_layer_br:
            return self.br(out)
        return out

    # Implement adaptive_avg_pool with a standard avg_pool so the model is easier to port
    def Adaptive_avg_pool2d(self, x, input_size, output_size):
        strides = np.floor(input_size / output_size).astype(np.int32)
        kernels = input_size - (output_size - 1) * strides
        avg = F.avg_pool2d(x, kernel_size=list(kernels), stride=list(strides))
        return avg
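
The `Adaptive_avg_pool2d` helper above approximates adaptive pooling with a fixed kernel and stride. A pure-Python, one-dimensional sketch can show when the two agree. The window rule used for the adaptive version (start = floor(i*n/m), end = ceil((i+1)*n/m)) is the commonly described behaviour of adaptive average pooling and is stated here as an assumption; both function names are hypothetical.

```python
import math

def adaptive_avg_pool1d(x, out_size):
    """Adaptive average pooling over a list (assumed window rule, see text)."""
    n = len(x)
    result = []
    for i in range(out_size):
        start = (i * n) // out_size
        end = math.ceil((i + 1) * n / out_size)
        window = x[start:end]
        result.append(sum(window) / len(window))
    return result

def fixed_avg_pool1d(x, out_size):
    """Fixed-kernel pooling with the floor-based stride used above."""
    n = len(x)
    stride = n // out_size
    kernel = n - (out_size - 1) * stride
    return [sum(x[i * stride:i * stride + kernel]) / kernel for i in range(out_size)]

x8 = [float(v) for v in range(8)]
x10 = [float(v) for v in range(10)]
print(adaptive_avg_pool1d(x8, 4) == fixed_avg_pool1d(x8, 4))    # sizes divide evenly
print(adaptive_avg_pool1d(x10, 4) == fixed_avg_pool1d(x10, 4))  # sizes do not divide
```

Under that assumption the two versions match for 8 -> 4 but differ for 10 -> 4, so a model exported with the fixed-kernel replacement may produce slightly different outputs from the original whenever the sizes do not divide evenly.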

pribadihcr commented 4 years ago

Hi @sacmehta,

I tried to convert the segmentation model to ONNX after modifying espnetv2 and eesp.py, and got the following error:
self.bu_dec_l1 = EfficientPyrPool(in_planes=config[3], proj_planes=pyr_plane_proj, out_planes=dec_planes[0], out_size=out_sizes, inp_size=in_sizes)
TypeError: __init__() got an unexpected keyword argument 'out_size'

sacmehta commented 4 years ago

Please use the efficient pyramid pooling module below:


#============================================
__author__ = "Sachin Mehta"
__maintainer__ = "Sachin Mehta"
#============================================

import torch
from torch import nn
import math
from torch.nn import functional as F
from nn_layers.cnn_utils import CBR, BR, Shuffle

class Identity(nn.Module):
    def forward(self, x):
        return x

class Interpolate(nn.Module):
    def __init__(self, h, w):
        super(Interpolate, self).__init__()
        self.h = h
        self.w = w

    def forward(self, x):
        # F.upsample is deprecated; F.interpolate takes the same arguments
        return F.interpolate(x, size=(self.h, self.w), mode='bilinear', align_corners=False)
        #return F.interpolate(x, size=(self.h, self.w), mode='nearest')#mode='bilinear', align_corners=False)

class EfficientPyrPool(nn.Module):
    """Efficient Pyramid Pooling Module"""

    def __init__(self, in_planes, proj_planes, out_planes, inp_size=None, out_size=None,
                 scales=[2.0, 1.5, 1.0, 0.5, 0.1],  last_layer_br=True):
        super(EfficientPyrPool, self).__init__()
        self.stages = nn.ModuleList()
        # sort into a new list so the default argument (or the caller's list) is not mutated
        scales = sorted(scales, reverse=True)

        self.projection_layer = CBR(in_planes, proj_planes, 1, 1)
        for _ in enumerate(scales):
            self.stages.append(nn.Conv2d(proj_planes, proj_planes, kernel_size=3, stride=1, padding=1, bias=False, groups=proj_planes))
        self.merge_layer = nn.Sequential(
            # perform one big batch normalization instead of p small ones
            BR(proj_planes * len(scales)),
            Shuffle(groups=len(scales)),
            CBR(proj_planes * len(scales), proj_planes, 3, 1, groups=proj_planes),
            nn.Conv2d(proj_planes, out_planes, kernel_size=1, stride=1, bias=not last_layer_br),
        )

        if last_layer_br:
            self.br = BR(out_planes)

        layer_scale_a = []
        layer_scale_b = []
        for i, sc in enumerate(scales):
            if sc < 1.0:
                layer_scale_a.append(AdaptivePool(input_size=inp_size, output_size=out_size[i])) #(nn.AdaptiveAvgPool2d(output_size=out_size[i]))
                layer_scale_b.append(Interpolate(inp_size[0], inp_size[1]))
            elif sc > 1.0:
                h, w = out_size[i]
                layer_scale_a.append(Interpolate(h, w))
                layer_scale_b.append(AdaptivePool(output_size=inp_size, input_size=out_size[i]))#(nn.AdaptiveAvgPool2d(output_size=inp_size))
            else:
                layer_scale_a.append(Identity())
                layer_scale_b.append(Identity())

        self.layer_scale_a = layer_scale_a
        self.layer_scale_b = layer_scale_b

        self.last_layer_br = last_layer_br
        self.scales = scales

    def forward(self, x):
        hs = []
        x = self.projection_layer(x)

        for i, stage in enumerate(self.stages):
            h = self.layer_scale_a[i](x)
            h = stage(h)
            h = self.layer_scale_b[i](h)
            hs.append(h)

        out = torch.cat(hs, dim=1)
        out = self.merge_layer(out)
        if self.last_layer_br:
            return self.br(out)
        return out

class AdaptivePool(nn.Module):
    __constants__ = ['stride_h', 'stride_w', ]
    def __init__(self, input_size, output_size):
        super(AdaptivePool, self).__init__()
        stride_h = int(math.ceil(input_size[0] / output_size[0]))
        stride_w = int(math.ceil(input_size[1] / output_size[1]))
        k_size_h = input_size[0] - (output_size[0] - 1) * stride_h
        k_size_w = input_size[1] - (output_size[1] - 1) * stride_w
        #print(stride_h, stride_w, k_size_h, k_size_w)
        self.k_size_h = k_size_h
        self.k_size_w = k_size_w
        self.stride_h = stride_h
        self.stride_w = stride_w
        #self.layer = nn.AvgPool2d(kernel_size=(k_size_h, k_size_w), stride=(stride_h, stride_w))

    def forward(self, x):
        return nn.AvgPool2d(kernel_size=(self.k_size_h, self.k_size_w), stride=(self.stride_h, self.stride_w))(x)
        #return self.layer(x)```

pribadihcr commented 4 years ago

I got another error:

self.merge_enc_dec_l2 = EfficientPWConv(config[2], dec_planes[0], groups=math.gcd(config[2], dec_planes[0]), inp_size=in_sizes)
TypeError: __init__() got an unexpected keyword argument 'groups'

It looks like efficient_pt also needs to be modified.

sacmehta commented 4 years ago


__author__ = "Sachin Mehta"
__maintainer__ = "Sachin Mehta"
#============================================

from torch import nn
import math
from nn_layers.cnn_utils import CBR
from nn_layers.efficient_pyramid_pool import AdaptivePool

class EfficientPWConv(nn.Module):
    def __init__(self, nin, nout, groups=64, inp_size=(16, 16)):
        super(EfficientPWConv, self).__init__()
        self.wt_layer = nn.Sequential(
                        AdaptivePool(input_size=inp_size, output_size=(1, 1)),
                        #nn.AdaptiveAvgPool2d(output_size=1),
                        nn.Conv2d(nin, nout, kernel_size=1, stride=1, padding=0, groups=1, bias=False),
                        nn.Sigmoid()
                    )

        #self.groups = math.gcd(nin, nout)
        #print(self.groups, nin, nout)
        self.expansion_layer = CBR(nin, nout, kSize=3, stride=1, groups=groups)

        self.out_size = nout
        self.in_size = nin

    def forward(self, x):
        wts = self.wt_layer(x)
        x = self.expansion_layer(x)
        x = x * wts
        return x

    def __repr__(self):
        s = '{name}(in_channels={in_size}, out_channels={out_size})'
        return s.format(name=self.__class__.__name__, **self.__dict__)

ALLUPRASAD commented 4 years ago

zechendev wrote:

Thank you for sharing this great code. Right now, I want to deploy your model to the TVM platform, which may need conversion from PyTorch to ONNX. The code I used is below.

weights = 'model/detection/model_zoo/espnetv2/espnetv2_s_2.0_pascal_300x300.pth'
model = ssd(args, cfg)
pretrained_dict = torch.load(weights, map_location=torch.device('cpu'))
model.load_state_dict(pretrained_dict)
PATH_ONNX='deploy.onnx'
dummy_input = torch.randn(1, 3, 300, 300, device='cpu')
torch.onnx.export(model, dummy_input, PATH_ONNX, input_names = ['image'], output_names= ['output'], verbose=True,opset_version=11)

But during the conversion an error occurs. The output is below:

~/software/EdgeNets/nn_layers/eesp.py:139: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if w2 == w1:
~/software/EdgeNets/nn_layers/eesp.py:89: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if expanded.size() == input.size():
~/software/EdgeNets/nn_layers/efficient_pyramid_pool.py:44: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  h_s = int(math.ceil(height * self.scales[i]))
~/software/EdgeNets/nn_layers/efficient_pyramid_pool.py:45: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  w_s = int(math.ceil(width * self.scales[i]))
raise RuntimeError("Failed to export an ONNX attribute, "
RuntimeError: Failed to export an ONNX attribute, since it's not constant, please try to make things (e.g., kernel size) static if possible

Please give me some tips so that I can figure out the problem. Thank you for your help!

I encountered the same problem. In EdgeNets/nn_layers/efficient_pyramid_pool.py, I changed

h = F.adaptive_avg_pool2d(h, output_size=(height, width))

to

h = F.adaptive_avg_pool2d(h, output_size=(int(height), int(width)))

and in your export call I changed

output_names= ['output'], verbose=True,opset_version=11)

to

output_names= ['output'], verbose=True,opset_version=11,operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK)

With these two changes the export works. However, when converting the ONNX file to other model formats, F.adaptive_avg_pool2d can still cause problems. You can try replacing it with avg_pool2d.

ALLUPRASAD commented 4 years ago

Hi, I am getting this warning while executing the above code:

TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  h = self.Adaptive_avg_pool2d(x, np.array([input_height, input_width]),

sacmehta commented 4 years ago

Use this version of adaptive pooling:

class AdaptivePool(nn.Module):
    __constants__ = ['stride_h', 'stride_w', ]
    def __init__(self, input_size, output_size):
        super(AdaptivePool, self).__init__()
        stride_h = int(math.ceil(input_size[0] / output_size[0]))
        stride_w = int(math.ceil(input_size[1] / output_size[1]))
        k_size_h = input_size[0] - (output_size[0] - 1) * stride_h
        k_size_w = input_size[1] - (output_size[1] - 1) * stride_w
        #print(stride_h, stride_w, k_size_h, k_size_w)
        self.k_size_h = k_size_h
        self.k_size_w = k_size_w
        self.stride_h = stride_h
        self.stride_w = stride_w
        #self.layer = nn.AvgPool2d(kernel_size=(k_size_h, k_size_w), stride=(stride_h, stride_w))

    def forward(self, x):
        return nn.AvgPool2d(kernel_size=(self.k_size_h, self.k_size_w), stride=(self.stride_h, self.stride_w))(x)
        #return self.layer(x)
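
The version above replaces adaptive pooling with a fixed average pool whose kernel and stride are computed once in the constructor, so the exporter only sees constants. A minimal pure-Python sketch of that arithmetic along one dimension is shown here. `static_pool_params`, `avg_pool_1d` and `adaptive_avg_pool_1d` are hypothetical helpers written for illustration, not part of EdgeNets or PyTorch, and the adaptive window rule (start = floor(i*n/out), end = ceil((i+1)*n/out)) is an assumption about how adaptive average pooling picks its windows.

```python
import math

def static_pool_params(in_size, out_size):
    """Kernel and stride along one dimension, same formula as AdaptivePool above."""
    stride = int(math.ceil(in_size / out_size))
    kernel = in_size - (out_size - 1) * stride
    return kernel, stride

def avg_pool_1d(values, kernel, stride):
    """Plain average pooling without padding (hypothetical helper)."""
    n_out = (len(values) - kernel) // stride + 1
    return [sum(values[i * stride:i * stride + kernel]) / kernel for i in range(n_out)]

def adaptive_avg_pool_1d(values, out_size):
    """Adaptive average pooling, assuming windows [floor(i*n/out), ceil((i+1)*n/out))."""
    n = len(values)
    out = []
    for i in range(out_size):
        start = (i * n) // out_size
        end = math.ceil((i + 1) * n / out_size)
        out.append(sum(values[start:end]) / (end - start))
    return out

values = [float(v) for v in range(16)]

# Divisible case (16 -> 4): kernel == stride == 4, and both poolings agree.
kernel, stride = static_pool_params(16, 4)
same = avg_pool_1d(values, kernel, stride) == adaptive_avg_pool_1d(values, 4)
print(kernel, stride, same)

# Non-divisible case (24 -> 16): the formula yields a negative kernel,
# so the static replacement cannot be used for that pair of sizes.
print(static_pool_params(24, 16))
```

When the input size is an integer multiple of the output size, kernel equals stride and the fixed pool matches the adaptive one. For a ratio such as 1.5 the formula can give a non-positive kernel, so the sizes passed to this class probably need to divide evenly.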

ALLUPRASAD commented 4 years ago

Hi sacmehta, while using this version of the code I am getting the following error:

EdgeNets/nn_layers/efficient_pyramid_pool.py", line 53, in __init__
    h, w = out_size[i]
TypeError: 'NoneType' object is not subscriptable
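
This is the error Python raises when `out_size` is still `None` at the point where it is indexed, which would happen if `EfficientPyrPool` is constructed without an `out_size` argument (its default in the constructor signature is `None`). A minimal sketch that reproduces the failure without PyTorch, plus a guard that fails earlier with a clearer message, might look like this. `build_scale_layers` and `check_sizes` are hypothetical stand-ins for illustration, not EdgeNets functions.

```python
def build_scale_layers(scales, inp_size=None, out_size=None):
    """Stand-in for the size lookups done in the constructor (hypothetical helper)."""
    sizes = []
    for i, sc in enumerate(scales):
        if sc > 1.0:
            h, w = out_size[i]          # fails here when out_size is None
            sizes.append((h, w))
        elif sc < 1.0:
            sizes.append(tuple(out_size[i]))
        else:
            sizes.append(tuple(inp_size))
    return sizes

def check_sizes(scales, inp_size, out_size):
    """Optional guard: raise a descriptive error before any indexing happens."""
    if inp_size is None or out_size is None:
        raise ValueError("inp_size and out_size must be given for the static-pooling version")
    if len(out_size) != len(scales):
        raise ValueError("out_size needs one (h, w) pair per scale")

scales = [4.0, 2.0, 1.0, 0.5, 0.25]

error_message = None
try:
    build_scale_layers(scales)          # out_size omitted, as in the traceback
except TypeError as err:
    error_message = str(err)
print(error_message)                    # 'NoneType' object is not subscriptable

out_size = [(64, 64), (32, 32), (16, 16), (8, 8), (4, 4)]
check_sizes(scales, (16, 16), out_size)
print(build_scale_layers(scales, inp_size=(16, 16), out_size=out_size))
```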

sacmehta commented 4 years ago

You need to use the code that is copied in this thread and not in the repo.

ALLUPRASAD commented 4 years ago

I copied the code from this thread only:

import torch
from torch import nn
import math
from torch.nn import functional as F
from nn_layers.cnn_utils import CBR, BR, Shuffle

class Identity(nn.Module):
    def forward(self, x):
        return x

class Interpolate(nn.Module):
    def __init__(self, h, w):
        super(Interpolate, self).__init__()
        self.h = h
        self.w = w

    def forward(self, x):
        return F.upsample(x, size=(self.h, self.w), mode='bilinear', align_corners=False)
        #return F.interpolate(x, size=(self.h, self.w), mode='nearest')#mode='bilinear', align_corners=False)

class EfficientPyrPool(nn.Module):
    """Efficient Pyramid Pooling Module"""

    def __init__(self, in_planes, proj_planes, out_planes, inp_size=None, out_size=None,
                 scales=[2.0, 1.5, 1.0, 0.5, 0.1],  last_layer_br=True):
        super(EfficientPyrPool, self).__init__()
        self.stages = nn.ModuleList()
        scales.sort(reverse=True)

        self.projection_layer = CBR(in_planes, proj_planes, 1, 1)
        for _ in enumerate(scales):
            self.stages.append(nn.Conv2d(proj_planes, proj_planes, kernel_size=3, stride=1, padding=1, bias=False, groups=proj_planes))
        self.merge_layer = nn.Sequential(
            # perform one big batch normalization instead of p small ones
            BR(proj_planes * len(scales)),
            Shuffle(groups=len(scales)),
            CBR(proj_planes * len(scales), proj_planes, 3, 1, groups=proj_planes),
            nn.Conv2d(proj_planes, out_planes, kernel_size=1, stride=1, bias=not last_layer_br),
        )

        if last_layer_br:
            self.br = BR(out_planes)

        layer_scale_a = []
        layer_scale_b = []
        for i, sc in enumerate(scales):
            if sc < 1.0:
                layer_scale_a.append(AdaptivePool(input_size=inp_size, output_size=out_size[i])) #(nn.AdaptiveAvgPool2d(output_size=out_size[i]))
                layer_scale_b.append(Interpolate(inp_size[0], inp_size[1]))
            elif sc > 1.0:
                h, w = out_size[i]
                layer_scale_a.append(Interpolate(h, w))
                layer_scale_b.append(AdaptivePool(output_size=inp_size, input_size=out_size[i]))#(nn.AdaptiveAvgPool2d(output_size=inp_size))
            else:
                layer_scale_a.append(Identity())
                layer_scale_b.append(Identity())

        self.layer_scale_a = layer_scale_a
        self.layer_scale_b = layer_scale_b

        self.last_layer_br = last_layer_br
        self.scales = scales

    def forward(self, x):
        hs = []
        x = self.projection_layer(x)

        for i, stage in enumerate(self.stages):
            h = self.layer_scale_a[i](x)
            h = stage(h)
            h = self.layer_scale_b[i](h)
            hs.append(h)

        out = torch.cat(hs, dim=1)
        out = self.merge_layer(out)
        if self.last_layer_br:
            return self.br(out)
        return out

class AdaptivePool(nn.Module):
    __constants__ = ['stride_h', 'stride_w', ]
    def __init__(self, input_size, output_size):
        super(AdaptivePool, self).__init__()
        stride_h = int(math.ceil(input_size[0] / output_size[0]))
        stride_w = int(math.ceil(input_size[1] / output_size[1]))
        k_size_h = input_size[0] - (output_size[0] - 1) * stride_h
        k_size_w = input_size[1] - (output_size[1] - 1) * stride_w
        #print(stride_h, stride_w, k_size_h, k_size_w)
        self.k_size_h = k_size_h
        self.k_size_w = k_size_w
        self.stride_h = stride_h
        self.stride_w = stride_w
        #self.layer = nn.AvgPool2d(kernel_size=(k_size_h, k_size_w), stride=(stride_h, stride_w))

    def forward(self, x):
        return nn.AvgPool2d(kernel_size=(self.k_size_h, self.k_size_w), stride=(self.stride_h, self.stride_w))(x)
        #return self.layer(x)

sacmehta commented 4 years ago

Are you passing out_size to EfficientPyrPool?

You need to change the segmentation model so that you can pass these arguments, something like the code below:

# ============================================
__author__ = "Sachin Mehta"
__maintainer__ = "Sachin Mehta"
# ============================================

import torch
from torch.nn import init
from nn_layers.espnet_utils import *
from nn_layers.efficient_pyramid_pool import EfficientPyrPool
from nn_layers.efficient_pt import EfficientPWConv
from model.classification.espnetv2 import EESPNet
from utilities.print_utils import *
from torch.nn import functional as F
import math

def get_sizes(height, width):
    scales = [4.0, 2.0, 1.0, 0.5, 0.25] #[2.0, 1.5, 1.0, 0.5, 0.25]
    out_sizes = []
    for scale in scales:
        h_s = int(math.ceil(height * scale))
        w_s = int(math.ceil(width * scale))
#        h_s = h_s if h_s > 5 else 5
#        w_s = w_s if w_s > 5 else 5

        out_sizes.append((h_s, w_s))

    return out_sizes

class ESPNetv2Segmentation(nn.Module):
    '''
    This class defines the ESPNetv2 architecture for semantic segmentation
    '''

    def __init__(self, args, classes=21, dataset='pascal'):
        super().__init__()

        # =============================================================
        #                       BASE NETWORK
        # =============================================================
        self.base_net = EESPNet(args) #imagenet model
        del self.base_net.classifier
        del self.base_net.level5
        del self.base_net.level5_0
        config = self.base_net.config

        #=============================================================
        #                   SEGMENTATION NETWORK
        #=============================================================
        dec_feat_dict={
            'pascal': 16,
            'city': 16,
            'coco': 32
        }
        base_dec_planes = dec_feat_dict[dataset]
        dec_planes = [4*base_dec_planes, 3*base_dec_planes, 2*base_dec_planes, classes]
        pyr_plane_proj = min(classes //2, base_dec_planes)

        im_height, im_width = args.im_size

        scale_factor = 16
        height = int(im_height // scale_factor)
        width = int(im_width // scale_factor)
        out_sizes = get_sizes(height, width)
        in_sizes = (height, width)

        self.bu_dec_l1 = EfficientPyrPool(in_planes=config[3], proj_planes=pyr_plane_proj, out_planes=dec_planes[0], out_size=out_sizes, inp_size=in_sizes)

        scale_factor = 8
        height = int(im_height // scale_factor)
        width = int(im_width // scale_factor)
        out_sizes = get_sizes(height, width)
        in_sizes = (height, width)
        self.bu_dec_l2 = EfficientPyrPool(in_planes=dec_planes[0], proj_planes=pyr_plane_proj, out_planes=dec_planes[1], out_size=out_sizes, inp_size=in_sizes)
        self.merge_enc_dec_l2 = EfficientPWConv(config[2], dec_planes[0], groups=math.gcd(config[2], dec_planes[0]), inp_size=in_sizes)

        scale_factor = 4
        height = int(im_height // scale_factor)
        width = int(im_width // scale_factor)
        out_sizes = get_sizes(height, width)
        in_sizes = (height, width)
        self.bu_dec_l3 = EfficientPyrPool(in_planes=dec_planes[1], proj_planes=pyr_plane_proj,
                                          out_planes=dec_planes[2],  out_size=out_sizes, inp_size=in_sizes)
        self.merge_enc_dec_l3 = EfficientPWConv(config[1], dec_planes[1], groups=math.gcd(config[1], dec_planes[1]), inp_size=in_sizes)

        scale_factor = 2
        height = int(im_height // scale_factor)
        width = int(im_width // scale_factor)
        out_sizes = get_sizes(height, width)
        in_sizes = (height, width)
        self.bu_dec_l4 = EfficientPyrPool(in_planes=dec_planes[2], proj_planes=pyr_plane_proj,
                                          out_planes=dec_planes[3], out_size=out_sizes, inp_size=in_sizes, last_layer_br=False)
        self.merge_enc_dec_l4 = EfficientPWConv(config[0], dec_planes[2], groups=math.gcd(config[0], dec_planes[2]), inp_size=in_sizes)

        self.bu_br_l2 = nn.Sequential(nn.BatchNorm2d(dec_planes[0]),
                                      nn.PReLU(dec_planes[0])
                                      )
        self.bu_br_l3 = nn.Sequential(nn.BatchNorm2d(dec_planes[1]),
                                      nn.PReLU(dec_planes[1])
                                      )
        self.bu_br_l4 = nn.Sequential(nn.BatchNorm2d(dec_planes[2]),
                                      nn.PReLU(dec_planes[2])
                                      )
        self.init_params()

    def init_params(self):
        '''
        Function to initialize the parameters
        '''
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                init.constant_(m.weight, 1)
                init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                init.normal_(m.weight, std=0.001)
                if m.bias is not None:
                    init.constant_(m.bias, 0)

    def get_basenet_params(self):
        modules_base = [self.base_net]
        for i in range(len(modules_base)):
            for m in modules_base[i].named_modules():
                if isinstance(m[1], nn.Conv2d) or isinstance(m[1], nn.BatchNorm2d) or isinstance(m[1], nn.PReLU):
                    for p in m[1].parameters():
                        if p.requires_grad:
                            yield p

    def get_segment_params(self):
        modules_seg = [self.bu_dec_l1, self.bu_dec_l2, self.bu_dec_l3, self.bu_dec_l4,
                       self.merge_enc_dec_l4, self.merge_enc_dec_l3, self.merge_enc_dec_l2,
                       self.bu_br_l4, self.bu_br_l3, self.bu_br_l2]
        for i in range(len(modules_seg)):
            for m in modules_seg[i].named_modules():
                if isinstance(m[1], nn.Conv2d) or isinstance(m[1], nn.BatchNorm2d) or isinstance(m[1], nn.PReLU):
                    for p in m[1].parameters():
                        if p.requires_grad:
                            yield p

    def forward(self, x):
        '''
        :param x: Receives the input RGB image
        :return: a C-dimensional vector, C=# of classes
        '''
        enc_out_l1 = self.base_net.level1(x)  # 112

        enc_out_l2 = self.base_net.level2_0(enc_out_l1, x, down_times=2)  # 56

        enc_out_l3_0 = self.base_net.level3_0(enc_out_l2, x,  down_times=3)  # down-sample
        for i, layer in enumerate(self.base_net.level3):
            if i == 0:
                enc_out_l3 = layer(enc_out_l3_0)
            else:
                enc_out_l3 = layer(enc_out_l3)

        enc_out_l4_0 = self.base_net.level4_0(enc_out_l3, x,  down_times=4)  # down-sample
        for i, layer in enumerate(self.base_net.level4):
            if i == 0:
                enc_out_l4 = layer(enc_out_l4_0)
            else:
                enc_out_l4 = layer(enc_out_l4)

        # bottom-up decoding
        bu_out = self.bu_dec_l1(enc_out_l4)

        # Decoding block
        bu_out = F.interpolate(bu_out, scale_factor=2, mode='nearest')#mode='bilinear', align_corners=False)
        enc_out_l3_proj = self.merge_enc_dec_l2(enc_out_l3)
        bu_out = enc_out_l3_proj + bu_out
        bu_out = self.bu_br_l2(bu_out)
        bu_out = self.bu_dec_l2(bu_out)

        #decoding block
        bu_out = F.interpolate(bu_out, scale_factor=2, mode='nearest')#mode='bilinear'#, align_corners=False)
        enc_out_l2_proj = self.merge_enc_dec_l3(enc_out_l2)
        bu_out = enc_out_l2_proj + bu_out
        bu_out = self.bu_br_l3(bu_out)
        bu_out = self.bu_dec_l3(bu_out)

        # decoding block
        bu_out = F.interpolate(bu_out, scale_factor=2, mode='nearest')#mode='bilinear', align_corners=False)
        enc_out_l1_proj = self.merge_enc_dec_l4(enc_out_l1)
        bu_out = enc_out_l1_proj + bu_out
        bu_out = self.bu_br_l4(bu_out)
        bu_out  = self.bu_dec_l4(bu_out)

        return F.interpolate(bu_out, scale_factor=2, mode='nearest')#mode='bilinear', align_corners=False)

def espnetv2_seg(args):
    classes = args.classes
    weights = args.weights
    dataset=args.dataset
    model = ESPNetv2Segmentation(args, classes=classes, dataset=dataset)
    if weights:
        import os
        if os.path.isfile(weights):
            num_gpus = torch.cuda.device_count()
            device = 'cuda' if num_gpus >= 1 else 'cpu'
            pretrained_dict = torch.load(weights, map_location=torch.device(device))
        else:
            print_error_message('Weight file does not exist at {}. Please check. Exiting!!'.format(weights))
            exit()
        print_info_message('Loading pretrained basenet model weights')
        basenet_dict = model.base_net.state_dict()
        model_dict = model.state_dict()
        overlap_dict = {k: v for k, v in pretrained_dict.items() if k in basenet_dict}
        if len(overlap_dict) == 0:
            print_error_message('No overlapping weights between model file and pretrained weight file. Please check')
            exit()
        print_info_message('{:.2f} % of weights copied from basenet to segnet'.format(len(overlap_dict) * 1.0/len(model_dict) * 100))
        basenet_dict.update(overlap_dict)
        model.base_net.load_state_dict(basenet_dict)
        print_info_message('Pretrained basenet model loaded!!')
    else:
        print_warning_message('Training from scratch!!')
    return model

if __name__ == "__main__":
    import torch
    import argparse

    parser = argparse.ArgumentParser(description='Testing')
    args = parser.parse_args()

    args.classes = 21
    args.s = 2.0
    args.weights='' #'../classification/model_zoo/espnet/espnetv2_s_2.0_imagenet_224x224.pth'
    args.dataset='pascal'
    args.im_size = (256, 256)

    input = torch.Tensor(1, 3, 256, 256)
    model = espnetv2_seg(args)
    weight_dict = torch.load('./model_zoo/espnetv2/espnetv2_s_2.0_pascal_256x256.pth', map_location=torch.device('cpu'))
    model.load_state_dict(weight_dict)

    out = model(input)
    print_info_message(out.size())

sacmehta commented 4 years ago

See the arguments passed to the EfficientPyrPool module:

self.bu_dec_l4 = EfficientPyrPool(in_planes=dec_planes[2], proj_planes=pyr_plane_proj,
                                          out_planes=dec_planes[3], out_size=out_sizes, inp_size=in_sizes, last_layer_br=False)
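
To make those arguments concrete, here is a small sketch that reuses `get_sizes` from the segmentation code above and prints `inp_size` and `out_size` at each decoder level for a 256x256 input (the `im_size` used in the `__main__` block). The helper name `decoder_sizes` is hypothetical and added only for illustration.

```python
import math

def get_sizes(height, width):
    # Same scales as in the segmentation model posted above.
    scales = [4.0, 2.0, 1.0, 0.5, 0.25]
    return [(int(math.ceil(height * s)), int(math.ceil(width * s))) for s in scales]

def decoder_sizes(im_size, scale_factors=(16, 8, 4, 2)):
    """Return {scale_factor: (inp_size, out_size)} (hypothetical helper)."""
    im_height, im_width = im_size
    table = {}
    for sf in scale_factors:
        height, width = im_height // sf, im_width // sf
        table[sf] = ((height, width), get_sizes(height, width))
    return table

# One line per decoder level: scale factor, inp_size, list of out_size pairs.
for sf, (inp_size, out_size) in decoder_sizes((256, 256)).items():
    print(sf, inp_size, out_size)
```

Because the kernel sizes are fixed when the model is constructed, the model presumably has to be built with the same `im_size` that is later used for the dummy input during export.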