microsoft / MMdnn

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX and CoreML.
MIT License
5.78k stars 968 forks

mxnet to pytorch #862

Open johnjwatson opened 3 years ago

johnjwatson commented 3 years ago

Platform (like ubuntu 16.04/win10): debian buster

Python version: 3.6

Source framework with version (like Tensorflow 1.4.1 with GPU): mxnet

Destination framework with version (like CNTK 2.3 with GPU): pytorch

Pre-trained model path (webpath or webdisk path): https://www.dropbox.com/s/akxeqp99jvsd6z7/model-MobileFaceNet-arcface-ms1m-refine-v1.zip?dl=0

I am trying to convert a pretrained model from mxnet to pytorch, but it always seems to fail. First I download and unzip the model files, then run:

mmconvert -sf mxnet -in model-symbol.json -iw model-0000.params -df pytorch -om pytorch.pth --inputShape 3,112,112

and I get:

weight = self.weight_data.get(source_node.name + "_weight").asnumpy().transpose((1, 0))
AttributeError: 'NoneType' object has no attribute 'asnumpy'

which is the issue described here: https://github.com/microsoft/MMdnn/issues/231

So I changed line 408 in mxnet_parser.py to:

weight = self.weight_data.get("fc1_weight").asnumpy().transpose((1, 0))

Now, I run again:

mmconvert -sf mxnet -in model-symbol.json -iw model-0000.params -df pytorch -om pytorch.pth --inputShape 3,112,112

and I get:

  File "pytorch.py", line 30, in __init__
    self.conv_2_dw_conv2d = self.__conv(2, name='conv_2_dw_conv2d', in_channels=64, out_channels=4096, kernel_size=(3, 3), stride=(1, 1), groups=64, bias=False)
  File "pytorch.py", line 335, in __conv
    layer.state_dict()['weight'].copy_(torch.from_numpy(__weights_dict[name]['weights']))
RuntimeError: The size of tensor a (4096) must match the size of tensor b (64) at non-singleton dimension 0

I am not sure what it is trying to tell me other than that there seems to be a size mismatch. Has anyone encountered this and found a solution?

Also, I get a warning during the conversion:

 UserWarning: You created Module with Module(..., label_names=['softmax_label']) but input with name 'softmax_label' is not found in symbol.list_arguments(). Did you mean one of:
    data
  warnings.warn(msg)

Is this warning something I can safely ignore? Sorry, I am VERY new to MXNet.

XiaoXYe commented 3 years ago

Hi @johnjwatson, you can change the code in pytorch_parser.py. Replace this part of the emit_Conv function:

if IR_node.type == 'DepthwiseConv':
        group = in_channels
        filter *= group

to

if IR_node.type == 'DepthwiseConv':
        group = in_channels
        filter = group

And you can ignore that warning.
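To see why the original "filter *= group" line produces the size mismatch, it helps to work out the weight shape PyTorch allocates for a grouped convolution, which is (out_channels, in_channels // groups, kH, kW). The sketch below is plain Python with no framework dependency. conv_weight_shape is a hypothetical helper written only for illustration, and the stored weight shape (64, 1, 3, 3) is an assumption: the error message above confirms only the leading 64.

```python
def conv_weight_shape(in_channels, out_channels, kernel_size, groups):
    """Weight shape torch.nn.Conv2d allocates: (out, in // groups, kH, kW)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    return (out_channels, in_channels // groups) + tuple(kernel_size)


in_channels = 64
stored = (64, 1, 3, 3)  # ASSUMED shape of the converted MXNet depthwise weights

# Before the fix: filter *= group, so out_channels becomes 64 * 64 = 4096.
before = conv_weight_shape(in_channels, 64 * in_channels, (3, 3), groups=in_channels)

# After the fix: filter = group, so out_channels stays at 64.
after = conv_weight_shape(in_channels, in_channels, (3, 3), groups=in_channels)

print("before fix:", before, "matches stored weights:", before == stored)
print("after fix: ", after, "matches stored weights:", after == stored)
```

With the multiplied filter count the layer expects 4096 output channels while the saved weights provide 64, which is the mismatch at dimension 0 reported in the traceback.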

johnjwatson commented 3 years ago

OMG @XiaoXYe that solved it - well, I have the model pth and the .py file - Thanks a tonne!!!!

johnjwatson commented 3 years ago

@XiaoXYe I have a follow-up question. I have the pytorch.py file and the pytorch.pth file, and I am trying to test them, but I get:

  File "pytorch.py", line 130, in forward
    self.minusscalar0_second = torch.autograd.Variable(torch.from_numpy(__weights_dict['minusscalar0_second']['value']), requires_grad=False)
NameError: name '_KitModel__weights_dict' is not defined

Would you know how to solve this? I can't seem to find any answers online. :(

XiaoXYe commented 3 years ago

@johnjwatson this __weights_dict should be auto defined in KitModel.__init__() in pytorch.py like this:

class KitModel(nn.Module):

    def __init__(self, weight_file):
        super(KitModel, self).__init__()
        global __weights_dict
        __weights_dict = load_weights(weight_file)
        ...

johnjwatson commented 3 years ago

@XiaoXYe I thought the same as you and I see that it is defined this way:

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import math

__weights_dict = dict()

def load_weights(weight_file):
    if weight_file == None:
        return

    try:
        weights_dict = np.load(weight_file, allow_pickle=True).item()
    except:
        weights_dict = np.load(weight_file, allow_pickle=True, encoding='bytes').item()

    return weights_dict

class KitModel(nn.Module):

    def __init__(self, weight_file):
        super(KitModel, self).__init__()
        global __weights_dict
        __weights_dict = load_weights(weight_file)

        self.conv_1_conv2d = self.__conv(2, name='conv_1_conv2d', in_channels=3, out_channels=64, kernel_size=(3, 3), stride=(2, 2), groups=1, bias=False)
        self.conv_1_batchnorm = self.__batch_normalization(2, 'conv_1_batchnorm', num_features=64, eps=0.0010000000474974513, momentum=0.8999999761581421)
...

but, when I run this on the files (as per the doc):

import torch
import imp
import numpy as np
MainModel = imp.load_source('MainModel', "pytorch.py")

the_model = torch.load("pytorch.pth")
the_model.eval()
print(the_model)

x = np.random.random([112,112,3])
x = np.transpose(x, (2, 0, 1))
print(x.shape)
x = np.expand_dims(x, 0).copy()
print(x.shape)
data = torch.from_numpy(x)
data = torch.autograd.Variable(data, requires_grad = False).float()

predict = the_model(data)
print(predict)

I get:

  File "/home/foo/ve_name/env_name/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "pytorch.py", line 130, in forward
    self.minusscalar0_second = torch.autograd.Variable(torch.from_numpy(__weights_dict['minusscalar0_second']['value']), requires_grad=False)
NameError: name '_KitModel__weights_dict' is not defined

Just to confirm (in case you want to replicate it), I simply download the file from the link, unzip it and then run (as per the doc):

mmconvert -sf mxnet -in model-symbol.json -iw model-0000.params -df pytorch -om pytorch.pth --inputShape 3,112,112

... to generate the pytorch-related files. The resulting full pytorch.py file is here: https://zerobin.net/?a8436f2ae6791499#dhZsFWXc91YpvlHajIqLY74MdeP8pE98E3IELiAD3bw=

johnjwatson commented 3 years ago

@XiaoXYe I think there is a bug with the double underscores of __weights_dict. Due to Python name mangling (please see: https://stackoverflow.com/questions/62810436/global-variable-although-defined-errors-out-as-not-defined-in-python), it should be a single underscore. When I change it to a single underscore, the above error disappears, but now I get:

  File "pytorch.py", line 130, in forward
    self.minusscalar0_second = torch.autograd.Variable(torch.from_numpy(_weights_dict['minusscalar0_second']['value']), requires_grad=False)
KeyError: 'minusscalar0_second'

:(
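For anyone following along, the name-mangling behaviour mentioned above can be reproduced without MMdnn or PyTorch. This is a minimal sketch. The class and variable names are made up for illustration and are not taken from the generated pytorch.py.

```python
# Module-level names, one with two leading underscores and one with a single one.
__weights = {"k": 1}
_weights = {"k": 1}


class Mangled:
    def read(self):
        # Inside a class body, __weights is compiled as _Mangled__weights,
        # so the module-level __weights above is never looked up.
        return __weights["k"]


class Plain:
    def read(self):
        # A single leading underscore is not mangled.
        return _weights["k"]


try:
    Mangled().read()
    mangled_error = None
except NameError as exc:
    mangled_error = str(exc)

plain_value = Plain().read()

print("double underscore:", mangled_error)
print("single underscore:", plain_value)
```

The NameError names _Mangled__weights, which mirrors the _KitModel__weights_dict in the traceback above.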

XiaoXYe commented 3 years ago

@johnjwatson thank you, #863 will fix this bug. I think this may be related to the way the model is loaded in PyTorch: _weights_dict is empty because it is set in __init__() and used in forward(). When I save and load the model and weight_file manually, there is no error.

import torch
import imp
import numpy as np
from pytorch import KitModel

model = KitModel("b411e5ef479b4e45b556c879a61f6704.npy")
model.eval()
print(model)

torch.save(model, "pytorch.pth")
the_model = torch.load("pytorch.pth")
x = np.random.random([112,112,3])
x = np.transpose(x, (2, 0, 1))
print(x.shape)
x = np.expand_dims(x, 0).copy()
print(x.shape)
data = torch.from_numpy(x)
data = torch.autograd.Variable(data, requires_grad = False).float()

predict = the_model(data)
print(predict)

The weight_file is deleted by MMdnn in _script/convert.py, line 113: remove_temp_files(temp_filename). Delete that line so that you keep the weight_file and can load it manually as above. We will look into this later.

johnjwatson commented 3 years ago

@XiaoXYe yea, the guy on stack (maxfischer) was kind enough to make the PR. I tried your steps and yes, it works!!! many thanks for being so responsive. REALLY appreciate it. ps: great tool btw :)