I had a similar problem and managed to find several solutions: https://github.com/apache/incubator-mxnet/issues/7530 With the Gluon API it's easy and straightforward; with the Module API it's something else :( I put my tests here: https://github.com/edmBernard/mxnet_example_shared_weight The Readme describes what works and what doesn't.
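(For context, a minimal sketch of what weight sharing looks like in Gluon. This is illustrative only, not the code from the linked repo; the layer sizes and variable names are made up.)

```python
import mxnet as mx
from mxnet.gluon import nn

# The second Dense reuses the first one's ParameterDict, so both branches
# read and update exactly the same weights.
shared = nn.Dense(128, activation='relu')
twin = nn.Dense(128, activation='relu', params=shared.params)

shared.initialize(mx.init.Xavier())  # `twin` points at the same Parameters

x1 = mx.nd.random.uniform(shape=(1, 512))
x2 = mx.nd.random.uniform(shape=(1, 512))
pair_embedding = mx.nd.concat(shared(x1), twin(x2), dim=1)  # siamese-style pair
```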
Wow, I didn't know that API existed. I had a lot of trouble trying to make it work with the module API but the Gluon API looks super promising, thanks for sharing :)
I'll definitely try Gluon out, but do you know how I would do this with the Module API?
Can I extract each layer's functionality somehow and set the weights to the same variable as its identical layer in the other network? If it's too big a hassle, I guess I'll use Gluon, though all the other code I have uses the Module API.
If you have exactly the same network twice, it might be possible to use shared_module
in the bind function; it's used in RNNs to duplicate a network. I was not able to use it, as my two networks were not exactly the same. here
In my opinion, it will be easier to switch to Gluon, and you can be sure it will work.
Moreover, in Gluon you can reuse a network defined with the Symbol API. here (I haven't tested it)
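(A sketch of that last point, in case it helps: wrapping a Symbol-API network in Gluon via SymbolBlock. The checkpoint prefix 'resnet-50' and the layer name 'flatten0_output' are illustrative assumptions, and I haven't run this.)

```python
import mxnet as mx
from mxnet import gluon

# Load a Symbol-API checkpoint and cut it at an internal feature layer.
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-50', 0)
feat = sym.get_internals()['flatten0_output']

# Wrap the symbol so it can be used like any other Gluon block.
net = gluon.SymbolBlock(outputs=feat, inputs=mx.sym.var('data'))
net.collect_params().initialize(mx.init.Xavier(), ctx=mx.cpu())
# (the pretrained arg_params/aux_params would then be copied into
#  net.collect_params() before fine-tuning)

out = net(mx.nd.random.uniform(shape=(1, 3, 224, 224)))
```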
Hey again,
I tried something like this, but I still have a lot of questions:
```python
sym1, arg_params, aux_params = get_model()
sym2, arg_params, aux_params = get_model()

mod1 = mx.mod.Module(symbol=sym1, context=mx.cpu(), label_names=None)
mod2 = mx.mod.Module(symbol=sym2, context=mx.cpu(), label_names=None)

mod1.bind(for_training=True, shared_module=mod2,  # True to train
          data_shapes=[('data', (1, 3, 224, 224))],
          label_shapes=mod1._label_shapes)
mod2.bind(for_training=True, shared_module=mod1,  # True to train
          data_shapes=[('data', (1, 3, 224, 224))],
          label_shapes=mod2._label_shapes)

mod1.set_params(arg_params, aux_params, allow_missing=True)
mod2.set_params(arg_params, aux_params, allow_missing=True)

out1 = sym1.get_internals()['flatten0_output']
out2 = sym2.get_internals()['flatten0_output']
siamese_out = mx.sym.Concat(out1, out2, dim=0)

# Example stacked network after it
fc1 = mx.symbol.FullyConnected(data=siamese_out, name='fc1', num_hidden=128)
act1 = mx.symbol.Activation(data=fc1, name='relu1', act_type="relu")
fc2 = mx.symbol.FullyConnected(data=act1, name='fc2', num_hidden=64)
act2 = mx.symbol.Activation(data=fc2, name='relu2', act_type="relu")
fc3 = mx.symbol.FullyConnected(data=act2, name='fc3', num_hidden=num_classes)
mlp = mx.symbol.SoftmaxOutput(data=fc3, name='softmax')

# new_args = dict()
mod3 = mx.mod.Module(symbol=mlp, context=mx.cpu(), label_names=None)
mod3.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod3.set_params(arg_params, aux_params, allow_missing=True)
```
I only want the first part of this network (layers attached to mod2 & mod1) to be shared. Would something like this work & still backpropagate errors appropriately when fitted?
Having to run mod.fit on each part of the network could be inconvenient. Is there a way around this?
I haven't tested shared_module
in anything similar to your application. (Are you sure you don't want to use Gluon?) :)
I didn't test your code, but here are some corrections:
```python
# you don't need `shared_module=mod2` here
mod1.bind(for_training=True, data_shapes=[('data', (1, 3, 224, 224))], label_shapes=mod1._label_shapes)
```
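(For what it's worth, my understanding of shared_module, untested here: bind the first module on its own, then bind the second one against it so both reuse the same parameters and memory.)

```python
# Sketch only: mod1 is bound normally, mod2 is bound against mod1.
mod1.bind(for_training=True,
          data_shapes=[('data', (1, 3, 224, 224))],
          label_shapes=mod1._label_shapes)
mod2.bind(for_training=True,
          data_shapes=[('data', (1, 3, 224, 224))],
          label_shapes=mod2._label_shapes,
          shared_module=mod1)
```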
If you want to train everything as one network, you need to define a new data iterator that can feed two different images into your network (something like the sketch below).
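(Rough sketch of such an iterator, for illustration only; the input names 'data1'/'data2', the shapes, and the label name are assumptions, not something from your code.)

```python
import mxnet as mx

class PairIter(mx.io.DataIter):
    """Feeds two images per sample plus one label."""
    def __init__(self, img1, img2, labels, batch_size):
        super(PairIter, self).__init__(batch_size)
        self.img1, self.img2, self.labels = img1, img2, labels
        self.cur = 0

    @property
    def provide_data(self):
        return [mx.io.DataDesc('data1', (self.batch_size, 3, 224, 224)),
                mx.io.DataDesc('data2', (self.batch_size, 3, 224, 224))]

    @property
    def provide_label(self):
        return [mx.io.DataDesc('softmax_label', (self.batch_size,))]

    def reset(self):
        self.cur = 0

    def next(self):
        if self.cur + self.batch_size > len(self.labels):
            raise StopIteration
        i, j = self.cur, self.cur + self.batch_size
        self.cur = j
        return mx.io.DataBatch(
            data=[mx.nd.array(self.img1[i:j]), mx.nd.array(self.img2[i:j])],
            label=[mx.nd.array(self.labels[i:j])])
```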
Maybe it's easier to try this example of a triplet-loss network (I haven't tested whether it works).
Here is an example using Gluon.
Wow. Thank you so much. Alright, this gives me a lot to think about. I'm really grateful for your help, thanks a ton.
If you want to share weights across the network, why not just use one copy of the network and run it twice with the inputs?
```python
final_net(nd.concat(shared_net(x1), shared_net(x2)))
```
Also, I definitely recommend using Gluon instead of pure MXNet.
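(To make that suggestion concrete, here is a rough sketch; the class name, layer sizes, and output width are made up for the example. Because the same shared_net instance is called on both inputs, gradients from both branches flow into one set of weights.)

```python
import mxnet as mx
from mxnet.gluon import nn

class Siamese(nn.HybridBlock):
    def __init__(self, **kwargs):
        super(Siamese, self).__init__(**kwargs)
        with self.name_scope():
            self.shared_net = nn.HybridSequential()
            self.shared_net.add(nn.Dense(128, activation='relu'))
            self.final_net = nn.Dense(2)

    def hybrid_forward(self, F, x1, x2):
        # one shared trunk, two inputs, one head on the concatenated features
        return self.final_net(F.concat(self.shared_net(x1),
                                       self.shared_net(x2), dim=1))

net = Siamese()
net.initialize(mx.init.Xavier())
out = net(mx.nd.random.uniform(shape=(4, 512)),
          mx.nd.random.uniform(shape=(4, 512)))
```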
@tz-hmc, Hope your question has been answered. For general "how-to" questions, our user forum (and Chinese version) is a good place to get help.
Description
How do I ensure the weights are kept the same? Can I unpack the internal layers somehow and set the weights of each to the same variable? My apologies, I'm new to MXNet. Would really appreciate the help, thanks!
Relevant answers, but not specific enough to my particular problem:
https://github.com/apache/incubator-mxnet/issues/772 (siamese networks)
https://github.com/apache/incubator-mxnet/issues/6791 (extract layers as variables)
https://github.com/apache/incubator-mxnet/issues/557 (set weights to be same)
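(For reference, a minimal sketch of the usual Symbol-API answer to "set the weights of each to the same variable": create the weight/bias Variables once and pass them to both branches. The layer names, kernel size, and filter count below are made up.)

```python
import mxnet as mx

data1 = mx.sym.Variable('data1')
data2 = mx.sym.Variable('data2')

# One set of weight/bias Variables reused by both branches, so the two
# convolutions always hold identical parameters.
w = mx.sym.Variable('conv1_weight')
b = mx.sym.Variable('conv1_bias')

branch1 = mx.sym.Convolution(data=data1, weight=w, bias=b,
                             kernel=(3, 3), num_filter=32, name='conv1_a')
branch2 = mx.sym.Convolution(data=data2, weight=w, bias=b,
                             kernel=(3, 3), num_filter=32, name='conv1_b')

siamese = mx.sym.Concat(mx.sym.flatten(branch1), mx.sym.flatten(branch2), dim=1)
```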