apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Cannot reproduce a model using the model parameters. #6020

Closed FCInter closed 7 years ago

FCInter commented 7 years ago

Environment info

Operating System: CentOS 6.6

Compiler: gcc-4.8

Package used (Python/R/Scala/Julia): Python

MXNet version:

Or if installed from source: from source

MXNet commit hash (git rev-parse HEAD):

If you are using python package, please provide

Python version and distribution: Python 2.7.12 |Anaconda custom (64-bit)

Error Message:

Cannot reproduce a model using the model parameters.

Minimum reproducible example

I'm trying to reproduce the feed-forward step of an autoencoder myself in Python. I first train the model and save the params to files. Then I implement the feed-forward step, load the params, and feed data into it. If implemented correctly, my feed-forward step should yield the same output as the original model in mxnet. However, the result I get is different. I don't understand what each argument means in the example code below:

args = {'encoder_%d_weight' % istack: mx.nd.empty((num_hidden, num_input, global_kx, global_ky), self.xpu),
        'encoder_%d_bias' % istack: mx.nd.empty((num_hidden,), self.xpu),
        'decoder_%d_weight' % istack: mx.nd.empty((num_hidden, num_input, global_kx, global_ky), self.xpu),
        'decoder_%d_bias' % istack: mx.nd.empty((num_hidden,), self.xpu)}
args_grad = {'encoder_%d_weight' % istack: mx.nd.empty((num_hidden, num_input, global_kx, global_ky), self.xpu),
             'encoder_%d_bias' % istack: mx.nd.empty((num_hidden,), self.xpu),
             'decoder_%d_weight' % istack: mx.nd.empty((num_hidden, num_input, global_kx, global_ky), self.xpu),
             'decoder_%d_bias' % istack: mx.nd.empty((num_hidden,), self.xpu)}
args_mult = {'encoder_%d_weight' % istack: 1.0,
             'encoder_%d_bias' % istack: 2.0,
             'decoder_%d_weight' % istack: 1.0,
             'decoder_%d_bias' % istack: 2.0}
auxs = {}
if encoder_act == 'sigmoid' and sparseness_penalty:
    auxs['sparse_encoder_%d_moving_avg' % istack] = mx.nd.ones((num_hidden), self.xpu) * 0.5
if decoder_act == 'sigmoid' and sparseness_penalty:
    auxs['sparse_decoder_%d_moving_avg' % istack] = mx.nd.ones((num_hidden), self.xpu) * 0.5
init = mx.initializer.Uniform(0.07)
for k, v in args.items():
    init(k, v)

I don't understand what each term means, i.e. args, args_grad, args_mult, and auxs. I only know that args is used to multiply the data items. I only saved args to files and loaded them into my own implementation; I did not load the other params. I think this could be the reason why my implementation gives a different result from mxnet. But what do these params mean? Could someone tell me? Thank you all for helping me!!!
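For readers with the same question, here is a minimal, framework-free sketch of how dictionaries with these names are commonly used in a training loop. It is an assumption based on the names and values in the snippet above, not a confirmed description of the mxnet example: `args` holds the parameter values, `args_grad` holds gradient buffers of the same shapes, `args_mult` holds per-parameter learning-rate multipliers, and `auxs` holds auxiliary state (such as a moving average) that is not updated by gradient descent. The helpers `forward` and `sgd_step` are hypothetical, and the layers are dense rather than convolutional to keep the sketch short.

```python
import json
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(args, x):
    """Feed-forward of one dense encoder/decoder pair. Reads only `args`."""
    w_e, b_e = args["encoder_0_weight"], args["encoder_0_bias"]
    w_d, b_d = args["decoder_0_weight"], args["decoder_0_bias"]
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_e, b_e)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_d, b_d)]


def sgd_step(args, args_grad, args_mult, lr):
    """Plain SGD: each parameter moves by lr * multiplier * gradient."""
    for name, value in args.items():
        scale = lr * args_mult.get(name, 1.0)
        grad = args_grad[name]
        if isinstance(value[0], list):  # weight matrix (list of rows)
            for row, grad_row in zip(value, grad):
                for j in range(len(row)):
                    row[j] -= scale * grad_row[j]
        else:  # bias vector
            for j in range(len(value)):
                value[j] -= scale * grad[j]


# Illustrative values only: 2 inputs, 1 hidden unit.
args = {
    "encoder_0_weight": [[0.5, -0.5]],
    "encoder_0_bias": [0.0],
    "decoder_0_weight": [[1.0], [2.0]],
    "decoder_0_bias": [0.0, 0.0],
}
args_grad = {
    "encoder_0_weight": [[1.0, 1.0]],
    "encoder_0_bias": [1.0],
    "decoder_0_weight": [[1.0], [1.0]],
    "decoder_0_bias": [1.0, 1.0],
}
args_mult = {
    "encoder_0_weight": 1.0,
    "encoder_0_bias": 2.0,
    "decoder_0_weight": 1.0,
    "decoder_0_bias": 2.0,
}
# Auxiliary state: kept alongside the model but never read by forward() here.
auxs = {"sparse_encoder_0_moving_avg": [0.5]}

x = [1.0, 1.0]
before = forward(args, x)

# Saving and reloading only `args` reproduces the forward output in this sketch.
reloaded = json.loads(json.dumps(args))
reproduced = forward(reloaded, x)

# One training step: biases move twice as far as weights because of args_mult.
sgd_step(args, args_grad, args_mult, lr=0.1)
after = forward(args, x)
```

In this sketch, saving `args` alone is enough to reproduce the forward pass. If a hand-written feed-forward still disagrees with the framework, the more likely causes are things like a different weight layout, a missing activation, or different input preprocessing. The exception is a network whose layers read auxiliary state at inference time (batch normalization is a common case); for such networks the auxiliary values would need to be saved and loaded as well.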

szha commented 7 years ago

This issue is closed due to lack of activity in the last 90 days. Feel free to ping me to reopen if this is still an active issue. Thanks!