NervanaSystems / neon

Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
http://neon.nervanasys.com/docs/latest
Apache License 2.0
3.87k stars 812 forks

Assertion Error loading weights to hidden layers #417

Closed pantherso48 closed 6 years ago

pantherso48 commented 6 years ago

Hi guys,

An assertion error is thrown when I run this line of code: layer.load_weights(params)

I looked at the source code, and it looks like I might be missing an argument called 'self' in the function call. I am not sure what is meant by self, and I apologize if I missed the documentation for it. I know I do not need the load_states argument, since it defaults to True.

Full code here, similar to the VGG example:

param_layers = [l for l in model.layers.layers]
param_dict_list = trained_vgg['model']['config']['layers']
for layer, params in zip(param_layers, param_dict_list):
    if layer.name == 'class_layer':
        break
    print(params)
    layer.load_weights(params)

wei-v-wang commented 6 years ago

Hi @pantherso48, in neon v2.3 we made an internal layer layout change (a fused convolution + bias layer) for performance reasons. Weight loading code that worked in neon v2.2 has to be changed in order to work with neon v2.3.

Are you loading a weight file generated by neon v2.2?

If yes, our recommendation (sorry, we are still working on this part of the documentation) is not to load the old weight file (e.g. old VGG) into your customized model (e.g. fasteeeeest_rcnn), which uses a subset of the layers in the old VGG model and weight file. Instead, convert the old VGG model file to the new format, following the suggestions in https://github.com/NervanaSystems/neon/issues/415

Is the weight file from your customized model, or is it from us?

Thanks for reporting the issue.

pantherso48 commented 6 years ago

The weight file is from your s3 bucket:

url = 'https://s3-us-west-1.amazonaws.com/nervana-modelzoo/VGG/'
filename = 'VGG_D.p'

I will work through #415 and try to convert the VGG weights file into the new format. Just so I understand this more clearly, here is what I see in the neon source code:

    assert type(pdict) is dict
--> 736 for key in pdict['params']:
    737     if not hasattr(self, key):

If the layer has no attribute for a key, it throws an error. So is the error caused by the following param, which I pass to this function and which has the key config, for which there is no attribute?

{'config': {'transform': {'config': {'name': 'Rectlin_0'},
                          'type': 'neon.transforms.activation.Rectlin'},
            'name': 'Convolution_11_Rectlin'},
 'type': 'neon.layers.layer.Activation'}
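The checks quoted above can be tried out in isolation. The following is only a sketch: `SketchLayer` and `try_load` are hypothetical stand-ins written for this illustration, not neon classes, and they copy just the three quoted lines, so the real `load_weights` may behave differently. It shows which check each kind of entry trips in this simplified version.

```python
class SketchLayer:
    """Hypothetical stand-in for a layer; NOT the real neon class."""

    def __init__(self, name):
        self.name = name
        self.W = None  # pretend weight attribute

    def load_weights(self, pdict, load_states=True):
        # Mirrors only the checks quoted in this thread.
        assert type(pdict) is dict
        for key in pdict['params']:
            if not hasattr(self, key):
                raise ValueError('layer %s has no attribute %s' % (self.name, key))
            setattr(self, key, pdict['params'][key])


def try_load(layer, pdict):
    """Return 'ok' or the name of the exception the sketch raised."""
    try:
        layer.load_weights(pdict)
        return 'ok'
    except (AssertionError, KeyError, ValueError) as exc:
        return type(exc).__name__


# An entry shaped like the printed Activation entry: only 'config' and 'type'.
activation_entry = {
    'config': {'name': 'Convolution_11_Rectlin'},
    'type': 'neon.layers.layer.Activation',
}
# A made-up entry that does carry a 'params' dict.
weight_entry = {'params': {'W': [0.1, 0.2]}}

layer = SketchLayer('conv')
for entry in (weight_entry, activation_entry, ['not', 'a', 'dict']):
    print(try_load(layer, entry))
```

In this sketch, an entry without a 'params' key fails at the dictionary lookup, before `hasattr` is ever reached, and only a non-dict argument trips the `assert`.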

I will try the conversion and report back, thanks for the help!

wei-v-wang commented 6 years ago

For VGG_D.p, we have already converted the file, and the converted version is available here:

url = 'https://s3-us-west-1.amazonaws.com/nervana-modelzoo/VGG/'
filename = 'VGG_D_fused_conv_bias.p'

Can you try replacing VGG_D.p with VGG_D_fused_conv_bias.p?

pantherso48 commented 6 years ago

That worked; I just needed to set load_states to False. Thanks!
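For reference, the loop from the first comment with that change applied might look like the sketch below. The helper name `load_pretrained_layers` is hypothetical; the only neon call it relies on is `layer.load_weights(params, load_states=False)`, as discussed in this thread, and it assumes the layers and the saved layer entries line up one to one.

```python
def load_pretrained_layers(layers, layer_dicts, stop_name='class_layer'):
    """Copy saved weights into layers until the layer named stop_name.

    layers:      layer objects exposing .name and .load_weights(...)
    layer_dicts: saved per-layer dictionaries, in the same order
    Returns the names of the layers that received weights.
    """
    loaded = []
    for layer, params in zip(layers, layer_dicts):
        if layer.name == stop_name:
            break  # leave the new classification layer untouched
        # load_states=False skips restoring saved states for the layer
        layer.load_weights(params, load_states=False)
        loaded.append(layer.name)
    return loaded


# Usage with the names from this thread (requires neon and a loaded model):
# param_layers = [l for l in model.layers.layers]
# param_dict_list = trained_vgg['model']['config']['layers']
# load_pretrained_layers(param_layers, param_dict_list)
```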

I could also update the tutorial 2 documentation and send a pull request, if you want, while it is fresh in my head. Thanks again.

wei-v-wang commented 6 years ago

Yes, please do so and create a PR. Thanks!

pantherso48 commented 6 years ago

I'm running the following command in a Jupyter terminal that has neon installed, and it keeps rejecting the --out_dir flag, even though data.py looks for that argument in its source code. Any insight would be appreciated, thanks!

jupyter notebook data.py --out_dir data/cifar10

Source code:

if __name__ == '__main__':
    from configargparse import ArgumentParser
    parser = ArgumentParser()
    parser.add_argument('--out_dir', required=True, help='Directory to write ingested files')
    parser.add_argument('--padded_size', type=int, default=40, help='Size of image after padding')
    parser.add_argument('--overwrite', action='store_true', default=False, help='Overwrite files')
    args = parser.parse_args()

    ingest_cifar10(args.out_dir, args.padded_size, overwrite=args.overwrite)

wei-v-wang commented 6 years ago

Is "Bad config encountered during initialization" causing the "unrecognized flag" issue?

The cifar10 example should work with "python data.py --out_dir data/cifar10", right?
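A quick way to check that the flags themselves are fine is to hand them straight to the script's parser. The sketch below rebuilds the three arguments from the source code quoted above, but with the standard-library `argparse` instead of `configargparse` so it runs without extra packages; that swap, and the `build_parser` helper name, are assumptions made for this illustration.

```python
from argparse import ArgumentParser


def build_parser():
    """Rebuild the data.py arguments quoted in this thread (argparse stand-in)."""
    parser = ArgumentParser()
    parser.add_argument('--out_dir', required=True,
                        help='Directory to write ingested files')
    parser.add_argument('--padded_size', type=int, default=40,
                        help='Size of image after padding')
    parser.add_argument('--overwrite', action='store_true', default=False,
                        help='Overwrite files')
    return parser


# The same flags as on the command line, passed directly to the parser.
args = build_parser().parse_args(['--out_dir', 'data/cifar10'])
print(args.out_dir, args.padded_size, args.overwrite)
```

When the script is launched with `python data.py ...`, these flags reach this parser. When it is launched through `jupyter notebook`, the Jupyter command line reads the flags first, which would explain the unrecognized flag message.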

pantherso48 commented 6 years ago

Using the jupyter notebook CLI was the problem. Now I am getting an import error for a package that is installed in my env.

pantherso48 commented 6 years ago

This was a Python version error. I ran it with python3 and it worked great.
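For anyone hitting a similar import error, one way to confirm which interpreter is running and whether it can see a given package is sketched below. The helper name `interpreter_report` is made up for this example, and `json` is only a placeholder for the package that fails to import.

```python
import importlib.util
import sys


def interpreter_report(module_name):
    """Describe the running interpreter and whether it can find a top-level module."""
    spec = importlib.util.find_spec(module_name)
    return {
        'executable': sys.executable,                # path of the running Python
        'version': '%d.%d' % sys.version_info[:2],   # major.minor version
        'module': module_name,
        'found': spec is not None,                   # True if importable from here
    }


print(interpreter_report('json'))
```

Running the same snippet with `python` and with `python3` shows whether the two commands point at different interpreters with different installed packages.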

wei-v-wang commented 6 years ago

Great!

I am closing this issue, feel free to open new issues.