Open sxjscience opened 4 years ago
Fixed by https://github.com/apache/incubator-mxnet/commit/8c44af4eba798b379c374c15582f9aea7dd7d8fd ?
Need to use deduplicate=True
We should make this default in MXNet 2
Confirmed that using `save_parameters(..., deduplicate=True)` will solve this problem.
Let's track changing the default for MXNet 2 in this issue
@leezu Should we submit a PR to change the default behavior? I think we should fix it as early as possible because we rely on `save_parameters()` to generate the model zoos.
I find that the load/save logic in Gluon does not respect the `prefix` in the network. Consider the following example: I created two networks, `Foo` and `Foo2`, which both have one dense layer with `prefix='layer_'` but with different attribute names. One is called `self.l1` and the other is called `self.l2`. At first glance, because these two layers share the same prefix, we should be able to share the parameters, i.e., directly load the parameters from `foo` to `foo2`. However, the following code will trigger an error:
Error message:
Thus, Gluon is using the attribute name for sharing the parameters.
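To make the mismatch concrete, here is a minimal pure-Python sketch of the two naming schemes. It is a toy model, not MXNet code: `structural_keys` and `prefixed_keys` are hypothetical helpers, and the dictionaries only stand in for the `Foo` and `Foo2` networks described above.

```python
# Toy model of the two naming schemes; this is NOT MXNet code.

def structural_keys(attr_to_params):
    """Key each parameter by its attribute path, e.g. 'l1.weight'."""
    return {
        f"{attr}.{pname}": value
        for attr, params in attr_to_params.items()
        for pname, value in params.items()
    }

def prefixed_keys(attr_to_params, prefix):
    """Key each parameter by the layer prefix, e.g. 'layer_weight'."""
    return {
        f"{prefix}{pname}": value
        for params in attr_to_params.values()
        for pname, value in params.items()
    }

# Stand-ins for Foo (attribute l1) and Foo2 (attribute l2).
foo = {"l1": {"weight": [1.0, 2.0], "bias": [0.0]}}
foo2 = {"l2": {"weight": [0.0, 0.0], "bias": [0.0]}}

saved = structural_keys(foo)
expected = structural_keys(foo2)
missing = sorted(set(expected) - set(saved))
print(missing)  # ['l2.bias', 'l2.weight']

# With prefix-based keys, both networks agree on the names.
same_names = set(prefixed_keys(foo, "layer_")) == set(prefixed_keys(foo2, "layer_"))
print(same_names)  # True
```

In this toy model, keys derived from attribute names never line up between the two networks, while keys derived from the shared prefix do, which matches the behavior reported here.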
To understand the problem, let's consider the following example, in which we create a network that has 4 shared dense layers. When we call `save_parameters`, the saved parameters should ideally contain only a single copy of the weights. However, it now contains 4 copies of the weights. This is not acceptable in the deployment setting, in which we will have a hard constraint on the size of the artifact.

Output as follows. We can see that the size of `foo.save_parameters()` will be 4 times the size of `foo2.save_parameters()`. However, these two should be the same.
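The size difference can be illustrated with a small pure-Python sketch of deduplicated saving. This is only an assumption about the general idea, not MXNet's actual implementation: `naive_size` and `dedup` are hypothetical helpers, and a `bytes` buffer stands in for a weight array shared by four layers.

```python
# Toy illustration of deduplicating shared buffers; NOT MXNet's implementation.

def naive_size(named_params):
    """Total bytes written if every name stores its own copy."""
    return sum(len(buf) for buf in named_params.values())

def dedup(named_params):
    """Store each distinct buffer once and map every name to its stored key."""
    storage = {}           # first name seen -> buffer
    alias = {}             # every name -> name under which the buffer is stored
    first_name_by_id = {}  # object identity -> first name seen
    for name, buf in named_params.items():
        key = first_name_by_id.setdefault(id(buf), name)
        alias[name] = key
        storage[key] = buf
    return storage, alias

shared = bytes(1024)  # one weight buffer shared by four layers
params = {f"l{i}.weight": shared for i in range(1, 5)}

storage, alias = dedup(params)
dedup_size = sum(len(buf) for buf in storage.values())

print(naive_size(params))  # 4096
print(dedup_size)          # 1024
print(alias["l4.weight"])  # l1.weight
```

Saving one copy plus an alias table keeps the artifact size independent of how many layers share the weights, which is the outcome the thread reports for `deduplicate=True`.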