Open · ssbusc1 opened this issue 5 years ago

I would like to stack together different models, similar to what is described here: https://stackoverflow.com/questions/50092589/how-to-vertically-stack-trained-models-in-keras

This does not seem to work with the MXNet backend. Specifically, even the simpler case of wrapping one model inside another fails. I've included some sample code below that works with the Theano backend but does not work with the MXNet backend. I'm on keras-mxnet 2.2.4.1, installed via pip.

With MXNet, predictions from the original model look fine, but predictions from the wrapped model are essentially random. With Theano, the wrapped model's predictions are identical to those of the original model.
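A minimal sketch of the failing pattern (an illustrative stand-in, since the exact snippet isn't reproduced here; the layer sizes, names, and toy data are made up):

```python
import numpy as np
from keras.models import Model
from keras.layers import Dense, Input

# Toy stand-in for a real, trained model.
inputs = Input(shape=(20,))
hidden = Dense(16, activation='relu')(inputs)
outputs = Dense(2, activation='softmax')(hidden)
model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')

x = np.random.rand(100, 20)

# Wrap the model inside a new Model built from its inputs/outputs.
wrapping_model = Model(inputs=model.inputs, outputs=model.outputs)

print(model.predict(x[0:10]))           # fine on both backends
print(wrapping_model.predict(x[0:10]))  # reportedly diverges on MXNet; matches on Theano
```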
Hi @ssbusc1, thanks for submitting this issue. In the MXNet backend, we have to override the Keras Model and use an MXNet Module under the hood, so the code above does not transfer the weights from model to wrapping_model. You have to copy the weights over yourself.
You can do that by either:

1) saving and loading the weights, if the two models have the same structure:

```python
model.save_weights('weights.h5')
wrapping_model = Model(inputs=model.inputs, outputs=model.outputs)
wrapping_model.load_weights('weights.h5')
```
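(Saving weights to HDF5 this way requires the h5py package to be installed.)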
2) using layer.get_weights() and layer.set_weights() on the specific layers whose weights you want copied:
```python
wrapping_model = Model(inputs=model.inputs, outputs=model.outputs)
for layer, wrapped_layer in zip(model.layers, wrapping_model.layers):
    print(layer.name)
    print(wrapped_layer.name)
    weights = layer.get_weights()
    wrapped_layer.set_weights(weights)

print(wrapping_model.predict(x[0:10]))
wrapping_model.summary()
```
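Since wrapping_model is built directly from model's inputs and outputs, the two layer lists should line up one to one, which is why a plain zip is enough here.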
This will produce the same predictions as the original model:

```
[[9.9909782e-01 9.0223132e-04]
 [1.6969813e-03 9.9830294e-01]
 [9.9909782e-01 9.0223132e-04]
 [1.6969813e-03 9.9830294e-01]
 [9.9909782e-01 9.0223132e-04]
 [1.6969813e-03 9.9830294e-01]
 [9.9909782e-01 9.0223132e-04]
 [1.6969813e-03 9.9830294e-01]
 [9.9909782e-01 9.0223132e-04]
 [1.6969813e-03 9.9830294e-01]]
```
Thanks. The wrapping_model will eventually have a different structure, so I'll try approach #2 above. First-class support for this would definitely help as the composition gets more involved, since the other backends already handle it.
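For the mismatched case, a name-based variant of approach #2 might look like this (a sketch assuming the shared layers keep the same names; layers without a counterpart are left untouched):

```python
# Copy weights only for layers that exist, by name, in both models.
source_layers = {layer.name: layer for layer in model.layers}
for wrapped_layer in wrapping_model.layers:
    source = source_layers.get(wrapped_layer.name)
    if source is not None and source.get_weights():
        wrapped_layer.set_weights(source.get_weights())
```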