microsoft / MMdnn

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX and CoreML.
MIT License
5.8k stars 965 forks

Converting ResNet50 MXNet -> Keras : Bad Accuracy #684

Open CarlosE97 opened 5 years ago

CarlosE97 commented 5 years ago

Hello @rainLiuplus @JiahaoYao, after converting the ResNet50 from the MXNet model zoo to the Keras .h5 format, I noticed that the accuracy after conversion is not good. I tried the MXNet inference and its accuracy is perfect, but the Keras inference is not (please check the "pred", the "prob" and the "logit"):

[Image: MXnet model zoo to Keras ResNet50-loaded_model]

Furthermore, after converting the ResNet50 from the GluonCV model zoo to Keras, the "probability" is not even between 0 and 1, as if there were no softmax layer. Indeed, the model's .json file has no Softmax at its end:

[Image: MXnet GluonCV zoo to Keras ResNet50-loaded_model]

How can this inaccuracy or lack of softmax be fixed?

Thank you.

JiahaoYao commented 5 years ago

Hi @CarlosE97,

First, if I understand correctly, the conversion from MXNet to Keras only contains the parts with parameters. Thus, the output is the logit, and if you want the probability, you need to add the softmax at the end.
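As a minimal sketch of what "adding the softmax at the end" means numerically, the snippet below turns a vector of logits into probabilities in plain Python. The function name `softmax` and the three toy logits are illustrative assumptions, not output from the converted ResNet50.

```python
import math


def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    # Subtract the maximum first so exp() cannot overflow for large logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a 3-class toy example (not real model output).
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# The predicted class is the index of the largest probability; softmax
# preserves ordering, so this is also the index of the largest logit.
pred = max(range(len(probs)), key=probs.__getitem__)
```

Because softmax does not change which class scores highest, a missing softmax explains values outside the 0 to 1 range but not a wrong predicted label. In Keras, the same step would typically be an `Activation("softmax")` layer applied to the converted model's output.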

Second, do you mean the accuracy is not good? How do you find that because the graph you put here does not have the same input?

CarlosE97 commented 5 years ago

Hello @JiahaoYao, Thanks for the reply.

1) > First, if I understand correctly, the conversion from MXNet to Keras only contains the parts with parameters. Thus, the output is the logit, and if you want the probability, you need to add the softmax at the end.

The ResNet50 converted from the MXNet model zoo to Keras contains a softmax. The one that doesn't have a softmax is the ResNet50 converted from GluonCV to Keras. If I may ask, as I am new to this field, how exactly can I add a softmax layer to my architecture?

2) > Second, do you mean the accuracy is not good? How do you find that because the graph you put here does not have the same input?

Although the inputs are different, I posted two examples so that we can check the predictions and their probabilities. The first one should give "Tank" but gave "Corkscrew"; the second one should have given "Jersey" but gave "Strainer" instead.
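One way to make the comparison like-for-like is to feed the same preprocessed image to both models and compare the two output vectors directly. The sketch below assumes both outputs are already available as plain lists of the same kind (both logits or both probabilities); the helper names `top1` and `compare_outputs` and the sample values are hypothetical.

```python
def top1(scores):
    """Return the index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def compare_outputs(reference, converted):
    """Compare two model outputs produced from the same input."""
    if len(reference) != len(converted):
        raise ValueError("outputs must have the same number of classes")
    # Largest element-wise gap between the two output vectors.
    max_abs_diff = max(abs(r - c) for r, c in zip(reference, converted))
    return {
        "same_top1": top1(reference) == top1(converted),
        "max_abs_diff": max_abs_diff,
    }


# Hypothetical outputs for one image (not real MXNet or Keras results).
mxnet_out = [0.1, 2.5, 0.3]
keras_out = [0.1, 2.4, 0.3]
report = compare_outputs(mxnet_out, keras_out)
```

A small `max_abs_diff` with matching top-1 classes would suggest the conversion itself is sound. A large gap on identical input points instead to a difference in the weights or in how the input is preprocessed before inference.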

I followed MMdnn's conversion steps and resolved the problems I encountered along the way, yet the successfully converted Keras ResNet50 is inaccurate. I should mention that I load the Keras model with Keras' "load_model" method. Am I doing anything wrong? Thanks again.