microsoft / MMdnn

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks, e.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX and CoreML.
MIT License

Help convert a VGG 16 model from keras to pytorch #902

Open higherdefender opened 4 years ago

higherdefender commented 4 years ago

Hello! I’m very new to machine learning and I’m trying to convert a trained VGG-16-based model (with a modified fc layer) from Keras to PyTorch. Can MMdnn help with this? If so, how should I go about it? I tried for some time to put together the commands myself, but none worked.

If not, is there any other way I can copy weights from a Keras VGG to a PyTorch VGG? Can someone guide me, as I really feel out of my depth here?

Thank you really for your help!

cookieli commented 3 years ago

Yes, that's exactly what MMdnn does. You might start with the introduction and examples in our repo.
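For a Keras model, the one-shot `mmconvert` entry point is usually the simplest route. A sketch of the commands, assuming the flag spellings from MMdnn's README (file names are placeholders; check `mmconvert -h` for your installed version):

```shell
# One-shot conversion: Keras -> PyTorch.
# my_vgg16.h5 is a placeholder for your saved model (architecture + weights).
mmconvert -sf keras -iw my_vgg16.h5 -df pytorch -om my_vgg16.pth

# If the architecture and weights were saved separately
# (model.to_json() + model.save_weights()), pass both:
mmconvert -sf keras -in my_vgg16.json -iw my_vgg16_weights.h5 \
          -df pytorch -om my_vgg16.pth
```

Under the hood this runs MMdnn's intermediate-representation pipeline (`mmtoir` → `mmtocode` → `mmtomodel`), which you can also invoke step by step if the one-shot command fails.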

JoeBlair commented 3 years ago

Just to hijack this thread! I have also done this (@higherdefender, if you didn't manage, I can advise).

However, I get different outputs from the Keras network and the Torch network. Is this normal?

higherdefender commented 3 years ago

Thanks! Yeah, I think the output will be slightly different. But is the overall prediction the same?
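Small numerical drift after conversion is normal; matching argmax on the same preprocessed inputs is the practical check. A minimal sketch with NumPy (the function name and return values are my own, not part of MMdnn):

```python
import numpy as np

def compare_predictions(keras_probs, torch_probs):
    """Compare class probabilities from the two converted models.

    Both arguments: float arrays of shape (batch, num_classes),
    produced from the *same* preprocessed inputs.
    Returns (fraction of matching argmax, max absolute prob difference).
    """
    keras_probs = np.asarray(keras_probs)
    torch_probs = np.asarray(torch_probs)
    same_top1 = np.mean(np.argmax(keras_probs, axis=1) ==
                        np.argmax(torch_probs, axis=1))
    max_diff = np.max(np.abs(keras_probs - torch_probs))
    return same_top1, max_diff
```

A faithful conversion typically gives `same_top1 == 1.0` and a `max_diff` on the order of float32 rounding error; a differing argmax points to a real conversion or preprocessing mismatch.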


JoeBlair commented 3 years ago

No, the predictions differ between the two models on the same data (with the same normalisation). Not only are the probabilities different, but the argmax is too. I can see that the convolutional layers have the same weights, but the fully connected layers are different, which I find strange; maybe I'm missing something.
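One common cause of exactly this symptom (conv weights match, fc weights don't) is flatten ordering: Keras flattens the last conv feature map in H, W, C order, while PyTorch flattens in C, H, W order, so the first Dense/Linear kernel needs its input dimension permuted, not just transposed. A sketch of the reordering, assuming the first Dense layer sits directly after `Flatten` (the function name is my own; for VGG-16 the last conv output is 7×7×512):

```python
import numpy as np

def convert_fc_after_flatten(w_keras, h, w, c):
    """Reorder a Keras Dense kernel that sits right after Flatten so it
    matches PyTorch's channel-first flatten.

    w_keras: (h*w*c, out_features), input rows flattened in H, W, C order.
    Returns: (out_features, c*h*w), columns in C, H, W order, as PyTorch's
    nn.Linear.weight expects (out, in).
    """
    out_features = w_keras.shape[1]
    w4 = w_keras.reshape(h, w, c, out_features)   # undo the H,W,C flatten
    w4 = w4.transpose(2, 0, 1, 3)                 # -> (c, h, w, out)
    return w4.reshape(c * h * w, out_features).T  # re-flatten as C,H,W
```

For VGG-16 that would be `convert_fc_after_flatten(fc1_kernel, 7, 7, 512)`; later Dense layers only need a plain transpose. MMdnn's converter is supposed to handle this permutation itself, so a mismatch in the fc weights may also mean those layers simply weren't mapped during conversion.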