microsoft / MMdnn

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch, ONNX and CoreML.
MIT License

mxnet model to pytorch odd output shapes compared with original mxnet implementation #908

Open jamesrobertwilliams opened 3 years ago

jamesrobertwilliams commented 3 years ago

Platform (like ubuntu 16.04/win10): Debian
Python version: 3.7
Source framework with version (like Tensorflow 1.4.1 with GPU): MXNET
Destination framework with version (like CNTK 2.3 with GPU): Pytorch
Pre-trained model path (webpath or webdisk path): https://www.dropbox.com/s/53ftnlarhyrpkg2/retinaface-R50.zip?dl=0
Running scripts:

I have tried to convert an MXNet model from https://www.dropbox.com/s/53ftnlarhyrpkg2/retinaface-R50.zip?dl=0 to PyTorch with the following commands (the MXNet model is based on ResNet-50):

mmtoir -f mxnet -n R50-symbol.json -w R50-0000.params -d resnet50 --inputShape 3,112,112 &&\
mmtocode -f pytorch -n resnet50.pb --IRWeightPath resnet50.npy --dstModelPath pytorch_retinaface_resnet50.py -dw pytorch_retinaface_resnet50.npy

The conversion produces a model, but when I run an input through the PyTorch model, the output shape is radically different from the original, which makes no sense :(

In the original MXNet inference code (see the example here: http://insightface.ai/build/examples_face_detection/demo_retinaface.html), you run this:

bbox, landmark = model.detect(img, threshold=0.5, scale=1.0)

bbox has a shape of (6, 5) and landmarks has a shape of (6, 5, 2).

However, when I run a forward pass on the converted PyTorch model, I get a bbox shape of (256, 28, 28) and a landmarks shape of (256, 28, 28).

I am not able to pinpoint exactly where this could be going wrong :(
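One way to narrow this down is to compare the reported shapes against what a RetinaFace-style network would be expected to emit before post-processing. The sketch below is only an illustration under stated assumptions: it assumes three output strides (8, 16, 32), two anchors per feature-map cell, and heads that emit 2 scores, 4 box deltas and 10 landmark deltas per anchor. The helper names `raw_head_shapes` and `detect_output_shapes` are hypothetical and are not part of MMdnn, MXNet, or insightface. The (N, 5) and (N, 5, 2) shapes are the ones reported above for `model.detect`, which presumably come from decoding and filtering the raw maps rather than from the network itself.

```python
# Hypothetical helpers for reasoning about shapes; not part of any library.

def raw_head_shapes(input_hw, strides=(8, 16, 32), anchors_per_cell=2):
    """Return assumed raw (C, H, W) output shapes per stride.

    Assumption: each anchor yields 2 scores, 4 box deltas and
    10 landmark deltas (5 points x 2 coordinates).
    """
    height, width = input_hw
    shapes = {}
    for stride in strides:
        # Ceiling division: assumed feature-map size at this stride.
        feat_h = -(-height // stride)
        feat_w = -(-width // stride)
        shapes[stride] = {
            "score": (2 * anchors_per_cell, feat_h, feat_w),
            "bbox": (4 * anchors_per_cell, feat_h, feat_w),
            "landmark": (10 * anchors_per_cell, feat_h, feat_w),
        }
    return shapes


def detect_output_shapes(num_faces):
    """Shapes reported for model.detect: one row per detected face."""
    return (num_faces, 5), (num_faces, 5, 2)


if __name__ == "__main__":
    # 112x112 matches the --inputShape used in the conversion command.
    for stride, heads in raw_head_shapes((112, 112)).items():
        print(stride, heads)
    print(detect_output_shapes(6))
```

Under these assumptions, none of the raw head shapes for a 112x112 input matches (256, 28, 28), and none matches the (6, 5) output of `model.detect` either. That would suggest two separate things to check: whether the converted model returns the head outputs at all (rather than an intermediate feature map), and whether the post-processing that `model.detect` performs is being applied on the PyTorch side.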

cookieli commented 3 years ago

Sorry for the late reply, and thank you for reporting this issue. It may be a fault in our implementation, and we will fix it soon.