allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0
11.76k stars 2.25k forks

Not able to use QaNet #3056

Closed Swathygsb closed 5 years ago

Swathygsb commented 5 years ago

    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/home/ssathishbarani/anaconda3/envs/allennlp/lib/python3.6/site-packages/allennlp/models/model.py", line 276, in _load
    model.load_state_dict(model_state)
  File "/home/ssathishbarani/anaconda3/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for NumericallyAugmentedQaNet:
    Missing key(s) in state_dict: "_phrase_layer._encoder_blocks.0._conv_norm_layers.0.weight", "_phrase_layer._encoder_blocks.0._conv_norm_layers.0.bias", "_phrase_layer._encoder_blocks.0._conv_norm_layers.1.weight",

schmmd commented 5 years ago

@Swathygsb for us to help you, we need more information. We would like to know the system you are running on, which version of AllenNLP you are using, and instructions for how we can reproduce the error you received.

Swathygsb commented 5 years ago

I tried this on my local machine, which doesn't have a GPU: Ubuntu, 64 GB RAM, with the latest code pulled from the AllenNLP GitHub repository. I get this error when loading the model with Predictor.from_path (pointing at the QaNet S3 model file); it fails during extraction itself.

schmmd commented 5 years ago

@Swathygsb typically we expect something like the following:

Steps to reproduce (run on 5c64f9d01ef39e3398372ebe4f19f864691679c0):

from allennlp.predictors import Predictor
Predictor.from_path("https://s3-us-west-2.amazonaws.com/allennlp/models/qanet-glove-2019.05.09.tar.gz")

Full stacktrace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/michaels/hack/allenai/allennlp/allennlp/predictors/predictor.py", line 152, in from_path
    return Predictor.from_archive(load_archive(archive_path, cuda_device=cuda_device), predictor_name)
  File "/Users/michaels/hack/allenai/allennlp/allennlp/models/archival.py", line 230, in load_archive
    cuda_device=cuda_device)
  File "/Users/michaels/hack/allenai/allennlp/allennlp/models/model.py", line 327, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/Users/michaels/hack/allenai/allennlp/allennlp/models/model.py", line 276, in _load
    model.load_state_dict(model_state)
  File "/Users/michaels/miniconda3/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for QaNet:
    Missing key(s) in state_dict: "_phrase_layer._encoder_blocks.0._conv_norm_layers.0.weight", "_phrase_layer._encoder_blocks.0._conv_norm_layers.0.bias", "_phrase_layer._encoder_blocks.0._conv_norm_layers.1.weight", "_phrase_layer._encoder_blocks.0._conv_norm_layers.1.bias", "_phrase_layer._encoder_blocks.0._conv_norm_layers.2.weight", "_phrase_layer._encoder_blocks.0._conv_norm_layers.2.bias", "_phrase_layer._encoder_blocks.0._conv_norm_layers.3.weight", "_phrase_layer._encoder_blocks.0._conv_norm_layers.3.bias", "_phrase_layer._encoder_blocks.0._conv_layers.0.1.weight", "_phrase_layer._encoder_blocks.0._conv_layers.0.1.bias", "_phrase_layer._encoder_blocks.0._conv_layers.0.2.weight", "_phrase_layer._encoder_blocks.0._conv_layers.0.2.bias", "_phrase_layer._encoder_blocks.0._conv_layers.1.1.weight", "_phrase_layer._encoder_blocks.0._conv_layers.1.1.bias", "_phrase_layer._encoder_blocks.0._conv_layers.1.2.weight", "_phrase_layer._encoder_blocks.0._conv_layers.1.2.bias", "_phrase_layer._encoder_blocks.0._conv_layers.2.1.weight", "_phrase_layer._encoder_blocks.0._conv_layers.2.1.bias", "_phrase_layer._encoder_blocks.0._conv_layers.2.2.weight", "_phrase_layer._encoder_blocks.0._conv_layers.2.2.bias", "_phrase_layer._encoder_blocks.0._conv_layers.3.1.weight", "_phrase_layer._encoder_blocks.0._conv_layers.3.1.bias", "_phrase_layer._encoder_blocks.0._conv_layers.3.2.weight", "_phrase_layer._encoder_blocks.0._conv_layers.3.2.bias", "_phrase_layer._encoder_blocks.0.attention_norm_layer.weight", "_phrase_layer._encoder_blocks.0.attention_norm_layer.bias", "_phrase_layer._encoder_blocks.0.attention_layer._combined_projection.weight", "_phrase_layer._encoder_blocks.0.attention_layer._combined_projection.bias", "_phrase_layer._encoder_blocks.0.attention_layer._output_projection.weight", "_phrase_layer._encoder_blocks.0.attention_layer._output_projection.bias", "_phrase_layer._encoder_blocks.0.feedforward_norm_layer.weight", 
"_phrase_layer._encoder_blocks.0.feedforward_norm_layer.bias", "_phrase_layer._encoder_blocks.0.feedforward._linear_layers.0.weight", "_phrase_layer._encoder_blocks.0.feedforward._linear_layers.0.bias", "_phrase_layer._encoder_blocks.0.feedforward._linear_layers.1.weight", "_phrase_layer._encoder_blocks.0.feedforward._linear_layers.1.bias", "_modeling_layer._encoder_blocks.0._conv_norm_layers.0.weight", "_modeling_layer._encoder_blocks.0._conv_norm_layers.0.bias", "_modeling_layer._encoder_blocks.0._conv_norm_layers.1.weight", "_modeling_layer._encoder_blocks.0._conv_norm_layers.1.bias", "_modeling_layer._encoder_blocks.0._conv_layers.0.1.weight", "_modeling_layer._encoder_blocks.0._conv_layers.0.1.bias", "_modeling_layer._encoder_blocks.0._conv_layers.0.2.weight", "_modeling_layer._encoder_blocks.0._conv_layers.0.2.bias", "_modeling_layer._encoder_blocks.0._conv_layers.1.1.weight", "_modeling_layer._encoder_blocks.0._conv_layers.1.1.bias", "_modeling_layer._encoder_blocks.0._conv_layers.1.2.weight", "_modeling_layer._encoder_blocks.0._conv_layers.1.2.bias", "_modeling_layer._encoder_blocks.0.attention_norm_layer.weight", "_modeling_layer._encoder_blocks.0.attention_norm_layer.bias", "_modeling_layer._encoder_blocks.0.attention_layer._combined_projection.weight", "_modeling_layer._encoder_blocks.0.attention_layer._combined_projection.bias", "_modeling_layer._encoder_blocks.0.attention_layer._output_projection.weight", "_modeling_layer._encoder_blocks.0.attention_layer._output_projection.bias", "_modeling_layer._encoder_blocks.0.feedforward_norm_layer.weight", "_modeling_layer._encoder_blocks.0.feedforward_norm_layer.bias", "_modeling_layer._encoder_blocks.0.feedforward._linear_layers.0.weight", "_modeling_layer._encoder_blocks.0.feedforward._linear_layers.0.bias", "_modeling_layer._encoder_blocks.0.feedforward._linear_layers.1.weight", "_modeling_layer._encoder_blocks.0.feedforward._linear_layers.1.bias", 
"_modeling_layer._encoder_blocks.1._conv_norm_layers.0.weight", "_modeling_layer._encoder_blocks.1._conv_norm_layers.0.bias", "_modeling_layer._encoder_blocks.1._conv_norm_layers.1.weight", "_modeling_layer._encoder_blocks.1._conv_norm_layers.1.bias", "_modeling_layer._encoder_blocks.1._conv_layers.0.1.weight", "_modeling_layer._encoder_blocks.1._conv_layers.0.1.bias", "_modeling_layer._encoder_blocks.1._conv_layers.0.2.weight", "_modeling_layer._encoder_blocks.1._conv_layers.0.2.bias", "_modeling_layer._encoder_blocks.1._conv_layers.1.1.weight", "_modeling_layer._encoder_blocks.1._conv_layers.1.1.bias", "_modeling_layer._encoder_blocks.1._conv_layers.1.2.weight", "_modeling_layer._encoder_blocks.1._conv_layers.1.2.bias", "_modeling_layer._encoder_blocks.1.attention_norm_layer.weight", "_modeling_layer._encoder_blocks.1.attention_norm_layer.bias", "_modeling_layer._encoder_blocks.1.attention_layer._combined_projection.weight", "_modeling_layer._encoder_blocks.1.attention_layer._combined_projection.bias", "_modeling_layer._encoder_blocks.1.attention_layer._output_projection.weight", "_modeling_layer._encoder_blocks.1.attention_layer._output_projection.bias", "_modeling_layer._encoder_blocks.1.feedforward_norm_layer.weight", "_modeling_layer._encoder_blocks.1.feedforward_norm_layer.bias", "_modeling_layer._encoder_blocks.1.feedforward._linear_layers.0.weight", "_modeling_layer._encoder_blocks.1.feedforward._linear_layers.0.bias", "_modeling_layer._encoder_blocks.1.feedforward._linear_layers.1.weight", "_modeling_layer._encoder_blocks.1.feedforward._linear_layers.1.bias", "_modeling_layer._encoder_blocks.2._conv_norm_layers.0.weight", "_modeling_layer._encoder_blocks.2._conv_norm_layers.0.bias", "_modeling_layer._encoder_blocks.2._conv_norm_layers.1.weight", "_modeling_layer._encoder_blocks.2._conv_norm_layers.1.bias", "_modeling_layer._encoder_blocks.2._conv_layers.0.1.weight", "_modeling_layer._encoder_blocks.2._conv_layers.0.1.bias", 
"_modeling_layer._encoder_blocks.2._conv_layers.0.2.weight", "_modeling_layer._encoder_blocks.2._conv_layers.0.2.bias", "_modeling_layer._encoder_blocks.2._conv_layers.1.1.weight", "_modeling_layer._encoder_blocks.2._conv_layers.1.1.bias", "_modeling_layer._encoder_blocks.2._conv_layers.1.2.weight", "_modeling_layer._encoder_blocks.2._conv_layers.1.2.bias", "_modeling_layer._encoder_blocks.2.attention_norm_layer.weight", "_modeling_layer._encoder_blocks.2.attention_norm_layer.bias", "_modeling_layer._encoder_blocks.2.attention_layer._combined_projection.weight", "_modeling_layer._encoder_blocks.2.attention_layer._combined_projection.bias", "_modeling_layer._encoder_blocks.2.attention_layer._output_projection.weight", "_modeling_layer._encoder_blocks.2.attention_layer._output_projection.bias", "_modeling_layer._encoder_blocks.2.feedforward_norm_layer.weight", "_modeling_layer._encoder_blocks.2.feedforward_norm_layer.bias", "_modeling_layer._encoder_blocks.2.feedforward._linear_layers.0.weight", "_modeling_layer._encoder_blocks.2.feedforward._linear_layers.0.bias", "_modeling_layer._encoder_blocks.2.feedforward._linear_layers.1.weight", "_modeling_layer._encoder_blocks.2.feedforward._linear_layers.1.bias", "_modeling_layer._encoder_blocks.3._conv_norm_layers.0.weight", "_modeling_layer._encoder_blocks.3._conv_norm_layers.0.bias", "_modeling_layer._encoder_blocks.3._conv_norm_layers.1.weight", "_modeling_layer._encoder_blocks.3._conv_norm_layers.1.bias", "_modeling_layer._encoder_blocks.3._conv_layers.0.1.weight", "_modeling_layer._encoder_blocks.3._conv_layers.0.1.bias", "_modeling_layer._encoder_blocks.3._conv_layers.0.2.weight", "_modeling_layer._encoder_blocks.3._conv_layers.0.2.bias", "_modeling_layer._encoder_blocks.3._conv_layers.1.1.weight", "_modeling_layer._encoder_blocks.3._conv_layers.1.1.bias", "_modeling_layer._encoder_blocks.3._conv_layers.1.2.weight", "_modeling_layer._encoder_blocks.3._conv_layers.1.2.bias", 
"_modeling_layer._encoder_blocks.3.attention_norm_layer.weight", "_modeling_layer._encoder_blocks.3.attention_norm_layer.bias", "_modeling_layer._encoder_blocks.3.attention_layer._combined_projection.weight", "_modeling_layer._encoder_blocks.3.attention_layer._combined_projection.bias", "_modeling_layer._encoder_blocks.3.attention_layer._output_projection.weight", "_modeling_layer._encoder_blocks.3.attention_layer._output_projection.bias", "_modeling_layer._encoder_blocks.3.feedforward_norm_layer.weight", "_modeling_layer._encoder_blocks.3.feedforward_norm_layer.bias", "_modeling_layer._encoder_blocks.3.feedforward._linear_layers.0.weight", "_modeling_layer._encoder_blocks.3.feedforward._linear_layers.0.bias", "_modeling_layer._encoder_blocks.3.feedforward._linear_layers.1.weight", "_modeling_layer._encoder_blocks.3.feedforward._linear_layers.1.bias", "_modeling_layer._encoder_blocks.4._conv_norm_layers.0.weight", "_modeling_layer._encoder_blocks.4._conv_norm_layers.0.bias", "_modeling_layer._encoder_blocks.4._conv_norm_layers.1.weight", "_modeling_layer._encoder_blocks.4._conv_norm_layers.1.bias", "_modeling_layer._encoder_blocks.4._conv_layers.0.1.weight", "_modeling_layer._encoder_blocks.4._conv_layers.0.1.bias", "_modeling_layer._encoder_blocks.4._conv_layers.0.2.weight", "_modeling_layer._encoder_blocks.4._conv_layers.0.2.bias", "_modeling_layer._encoder_blocks.4._conv_layers.1.1.weight", "_modeling_layer._encoder_blocks.4._conv_layers.1.1.bias", "_modeling_layer._encoder_blocks.4._conv_layers.1.2.weight", "_modeling_layer._encoder_blocks.4._conv_layers.1.2.bias", "_modeling_layer._encoder_blocks.4.attention_norm_layer.weight", "_modeling_layer._encoder_blocks.4.attention_norm_layer.bias", "_modeling_layer._encoder_blocks.4.attention_layer._combined_projection.weight", "_modeling_layer._encoder_blocks.4.attention_layer._combined_projection.bias", "_modeling_layer._encoder_blocks.4.attention_layer._output_projection.weight", 
"_modeling_layer._encoder_blocks.4.attention_layer._output_projection.bias", "_modeling_layer._encoder_blocks.4.feedforward_norm_layer.weight", "_modeling_layer._encoder_blocks.4.feedforward_norm_layer.bias", "_modeling_layer._encoder_blocks.4.feedforward._linear_layers.0.weight", "_modeling_layer._encoder_blocks.4.feedforward._linear_layers.0.bias", "_modeling_layer._encoder_blocks.4.feedforward._linear_layers.1.weight", "_modeling_layer._encoder_blocks.4.feedforward._linear_layers.1.bias", "_modeling_layer._encoder_blocks.5._conv_norm_layers.0.weight", "_modeling_layer._encoder_blocks.5._conv_norm_layers.0.bias", "_modeling_layer._encoder_blocks.5._conv_norm_layers.1.weight", "_modeling_layer._encoder_blocks.5._conv_norm_layers.1.bias", "_modeling_layer._encoder_blocks.5._conv_layers.0.1.weight", "_modeling_layer._encoder_blocks.5._conv_layers.0.1.bias", "_modeling_layer._encoder_blocks.5._conv_layers.0.2.weight", "_modeling_layer._encoder_blocks.5._conv_layers.0.2.bias", "_modeling_layer._encoder_blocks.5._conv_layers.1.1.weight", "_modeling_layer._encoder_blocks.5._conv_layers.1.1.bias", "_modeling_layer._encoder_blocks.5._conv_layers.1.2.weight", "_modeling_layer._encoder_blocks.5._conv_layers.1.2.bias", "_modeling_layer._encoder_blocks.5.attention_norm_layer.weight", "_modeling_layer._encoder_blocks.5.attention_norm_layer.bias", "_modeling_layer._encoder_blocks.5.attention_layer._combined_projection.weight", "_modeling_layer._encoder_blocks.5.attention_layer._combined_projection.bias", "_modeling_layer._encoder_blocks.5.attention_layer._output_projection.weight", "_modeling_layer._encoder_blocks.5.attention_layer._output_projection.bias", "_modeling_layer._encoder_blocks.5.feedforward_norm_layer.weight", "_modeling_layer._encoder_blocks.5.feedforward_norm_layer.bias", "_modeling_layer._encoder_blocks.5.feedforward._linear_layers.0.weight", "_modeling_layer._encoder_blocks.5.feedforward._linear_layers.0.bias", 
"_modeling_layer._encoder_blocks.5.feedforward._linear_layers.1.weight", "_modeling_layer._encoder_blocks.5.feedforward._linear_layers.1.bias", "_modeling_layer._encoder_blocks.6._conv_norm_layers.0.weight", "_modeling_layer._encoder_blocks.6._conv_norm_layers.0.bias", "_modeling_layer._encoder_blocks.6._conv_norm_layers.1.weight", "_modeling_layer._encoder_blocks.6._conv_norm_layers.1.bias", "_modeling_layer._encoder_blocks.6._conv_layers.0.1.weight", "_modeling_layer._encoder_blocks.6._conv_layers.0.1.bias", "_modeling_layer._encoder_blocks.6._conv_layers.0.2.weight", "_modeling_layer._encoder_blocks.6._conv_layers.0.2.bias", "_modeling_layer._encoder_blocks.6._conv_layers.1.1.weight", "_modeling_layer._encoder_blocks.6._conv_layers.1.1.bias", "_modeling_layer._encoder_blocks.6._conv_layers.1.2.weight", "_modeling_layer._encoder_blocks.6._conv_layers.1.2.bias", "_modeling_layer._encoder_blocks.6.attention_norm_layer.weight", "_modeling_layer._encoder_blocks.6.attention_norm_layer.bias", "_modeling_layer._encoder_blocks.6.attention_layer._combined_projection.weight", "_modeling_layer._encoder_blocks.6.attention_layer._combined_projection.bias", "_modeling_layer._encoder_blocks.6.attention_layer._output_projection.weight", "_modeling_layer._encoder_blocks.6.attention_layer._output_projection.bias", "_modeling_layer._encoder_blocks.6.feedforward_norm_layer.weight", "_modeling_layer._encoder_blocks.6.feedforward_norm_layer.bias", "_modeling_layer._encoder_blocks.6.feedforward._linear_layers.0.weight", "_modeling_layer._encoder_blocks.6.feedforward._linear_layers.0.bias", "_modeling_layer._encoder_blocks.6.feedforward._linear_layers.1.weight", "_modeling_layer._encoder_blocks.6.feedforward._linear_layers.1.bias". 
    Unexpected key(s) in state_dict: "_phrase_layer.encoder_block_0._conv_norm_layers.0.weight", "_phrase_layer.encoder_block_0._conv_norm_layers.0.bias", "_phrase_layer.encoder_block_0._conv_norm_layers.1.weight", "_phrase_layer.encoder_block_0._conv_norm_layers.1.bias", "_phrase_layer.encoder_block_0._conv_norm_layers.2.weight", "_phrase_layer.encoder_block_0._conv_norm_layers.2.bias", "_phrase_layer.encoder_block_0._conv_norm_layers.3.weight", "_phrase_layer.encoder_block_0._conv_norm_layers.3.bias", "_phrase_layer.encoder_block_0._conv_layers.0.1.weight", "_phrase_layer.encoder_block_0._conv_layers.0.1.bias", "_phrase_layer.encoder_block_0._conv_layers.0.2.weight", "_phrase_layer.encoder_block_0._conv_layers.0.2.bias", "_phrase_layer.encoder_block_0._conv_layers.1.1.weight", "_phrase_layer.encoder_block_0._conv_layers.1.1.bias", "_phrase_layer.encoder_block_0._conv_layers.1.2.weight", "_phrase_layer.encoder_block_0._conv_layers.1.2.bias", "_phrase_layer.encoder_block_0._conv_layers.2.1.weight", "_phrase_layer.encoder_block_0._conv_layers.2.1.bias", "_phrase_layer.encoder_block_0._conv_layers.2.2.weight", "_phrase_layer.encoder_block_0._conv_layers.2.2.bias", "_phrase_layer.encoder_block_0._conv_layers.3.1.weight", "_phrase_layer.encoder_block_0._conv_layers.3.1.bias", "_phrase_layer.encoder_block_0._conv_layers.3.2.weight", "_phrase_layer.encoder_block_0._conv_layers.3.2.bias", "_phrase_layer.encoder_block_0.attention_norm_layer.weight", "_phrase_layer.encoder_block_0.attention_norm_layer.bias", "_phrase_layer.encoder_block_0.attention_layer._combined_projection.weight", "_phrase_layer.encoder_block_0.attention_layer._combined_projection.bias", "_phrase_layer.encoder_block_0.attention_layer._output_projection.weight", "_phrase_layer.encoder_block_0.attention_layer._output_projection.bias", "_phrase_layer.encoder_block_0.feedforward_norm_layer.weight", "_phrase_layer.encoder_block_0.feedforward_norm_layer.bias", 
"_phrase_layer.encoder_block_0.feedforward._linear_layers.0.weight", "_phrase_layer.encoder_block_0.feedforward._linear_layers.0.bias", "_phrase_layer.encoder_block_0.feedforward._linear_layers.1.weight", "_phrase_layer.encoder_block_0.feedforward._linear_layers.1.bias", "_modeling_layer.encoder_block_0._conv_norm_layers.0.weight", "_modeling_layer.encoder_block_0._conv_norm_layers.0.bias", "_modeling_layer.encoder_block_0._conv_norm_layers.1.weight", "_modeling_layer.encoder_block_0._conv_norm_layers.1.bias", "_modeling_layer.encoder_block_0._conv_layers.0.1.weight", "_modeling_layer.encoder_block_0._conv_layers.0.1.bias", "_modeling_layer.encoder_block_0._conv_layers.0.2.weight", "_modeling_layer.encoder_block_0._conv_layers.0.2.bias", "_modeling_layer.encoder_block_0._conv_layers.1.1.weight", "_modeling_layer.encoder_block_0._conv_layers.1.1.bias", "_modeling_layer.encoder_block_0._conv_layers.1.2.weight", "_modeling_layer.encoder_block_0._conv_layers.1.2.bias", "_modeling_layer.encoder_block_0.attention_norm_layer.weight", "_modeling_layer.encoder_block_0.attention_norm_layer.bias", "_modeling_layer.encoder_block_0.attention_layer._combined_projection.weight", "_modeling_layer.encoder_block_0.attention_layer._combined_projection.bias", "_modeling_layer.encoder_block_0.attention_layer._output_projection.weight", "_modeling_layer.encoder_block_0.attention_layer._output_projection.bias", "_modeling_layer.encoder_block_0.feedforward_norm_layer.weight", "_modeling_layer.encoder_block_0.feedforward_norm_layer.bias", "_modeling_layer.encoder_block_0.feedforward._linear_layers.0.weight", "_modeling_layer.encoder_block_0.feedforward._linear_layers.0.bias", "_modeling_layer.encoder_block_0.feedforward._linear_layers.1.weight", "_modeling_layer.encoder_block_0.feedforward._linear_layers.1.bias", "_modeling_layer.encoder_block_1._conv_norm_layers.0.weight", "_modeling_layer.encoder_block_1._conv_norm_layers.0.bias", 
"_modeling_layer.encoder_block_1._conv_norm_layers.1.weight", "_modeling_layer.encoder_block_1._conv_norm_layers.1.bias", "_modeling_layer.encoder_block_1._conv_layers.0.1.weight", "_modeling_layer.encoder_block_1._conv_layers.0.1.bias", "_modeling_layer.encoder_block_1._conv_layers.0.2.weight", "_modeling_layer.encoder_block_1._conv_layers.0.2.bias", "_modeling_layer.encoder_block_1._conv_layers.1.1.weight", "_modeling_layer.encoder_block_1._conv_layers.1.1.bias", "_modeling_layer.encoder_block_1._conv_layers.1.2.weight", "_modeling_layer.encoder_block_1._conv_layers.1.2.bias", "_modeling_layer.encoder_block_1.attention_norm_layer.weight", "_modeling_layer.encoder_block_1.attention_norm_layer.bias", "_modeling_layer.encoder_block_1.attention_layer._combined_projection.weight", "_modeling_layer.encoder_block_1.attention_layer._combined_projection.bias", "_modeling_layer.encoder_block_1.attention_layer._output_projection.weight", "_modeling_layer.encoder_block_1.attention_layer._output_projection.bias", "_modeling_layer.encoder_block_1.feedforward_norm_layer.weight", "_modeling_layer.encoder_block_1.feedforward_norm_layer.bias", "_modeling_layer.encoder_block_1.feedforward._linear_layers.0.weight", "_modeling_layer.encoder_block_1.feedforward._linear_layers.0.bias", "_modeling_layer.encoder_block_1.feedforward._linear_layers.1.weight", "_modeling_layer.encoder_block_1.feedforward._linear_layers.1.bias", "_modeling_layer.encoder_block_2._conv_norm_layers.0.weight", "_modeling_layer.encoder_block_2._conv_norm_layers.0.bias", "_modeling_layer.encoder_block_2._conv_norm_layers.1.weight", "_modeling_layer.encoder_block_2._conv_norm_layers.1.bias", "_modeling_layer.encoder_block_2._conv_layers.0.1.weight", "_modeling_layer.encoder_block_2._conv_layers.0.1.bias", "_modeling_layer.encoder_block_2._conv_layers.0.2.weight", "_modeling_layer.encoder_block_2._conv_layers.0.2.bias", "_modeling_layer.encoder_block_2._conv_layers.1.1.weight", 
"_modeling_layer.encoder_block_2._conv_layers.1.1.bias", "_modeling_layer.encoder_block_2._conv_layers.1.2.weight", "_modeling_layer.encoder_block_2._conv_layers.1.2.bias", "_modeling_layer.encoder_block_2.attention_norm_layer.weight", "_modeling_layer.encoder_block_2.attention_norm_layer.bias", "_modeling_layer.encoder_block_2.attention_layer._combined_projection.weight", "_modeling_layer.encoder_block_2.attention_layer._combined_projection.bias", "_modeling_layer.encoder_block_2.attention_layer._output_projection.weight", "_modeling_layer.encoder_block_2.attention_layer._output_projection.bias", "_modeling_layer.encoder_block_2.feedforward_norm_layer.weight", "_modeling_layer.encoder_block_2.feedforward_norm_layer.bias", "_modeling_layer.encoder_block_2.feedforward._linear_layers.0.weight", "_modeling_layer.encoder_block_2.feedforward._linear_layers.0.bias", "_modeling_layer.encoder_block_2.feedforward._linear_layers.1.weight", "_modeling_layer.encoder_block_2.feedforward._linear_layers.1.bias", "_modeling_layer.encoder_block_3._conv_norm_layers.0.weight", "_modeling_layer.encoder_block_3._conv_norm_layers.0.bias", "_modeling_layer.encoder_block_3._conv_norm_layers.1.weight", "_modeling_layer.encoder_block_3._conv_norm_layers.1.bias", "_modeling_layer.encoder_block_3._conv_layers.0.1.weight", "_modeling_layer.encoder_block_3._conv_layers.0.1.bias", "_modeling_layer.encoder_block_3._conv_layers.0.2.weight", "_modeling_layer.encoder_block_3._conv_layers.0.2.bias", "_modeling_layer.encoder_block_3._conv_layers.1.1.weight", "_modeling_layer.encoder_block_3._conv_layers.1.1.bias", "_modeling_layer.encoder_block_3._conv_layers.1.2.weight", "_modeling_layer.encoder_block_3._conv_layers.1.2.bias", "_modeling_layer.encoder_block_3.attention_norm_layer.weight", "_modeling_layer.encoder_block_3.attention_norm_layer.bias", "_modeling_layer.encoder_block_3.attention_layer._combined_projection.weight", 
"_modeling_layer.encoder_block_3.attention_layer._combined_projection.bias", "_modeling_layer.encoder_block_3.attention_layer._output_projection.weight", "_modeling_layer.encoder_block_3.attention_layer._output_projection.bias", "_modeling_layer.encoder_block_3.feedforward_norm_layer.weight", "_modeling_layer.encoder_block_3.feedforward_norm_layer.bias", "_modeling_layer.encoder_block_3.feedforward._linear_layers.0.weight", "_modeling_layer.encoder_block_3.feedforward._linear_layers.0.bias", "_modeling_layer.encoder_block_3.feedforward._linear_layers.1.weight", "_modeling_layer.encoder_block_3.feedforward._linear_layers.1.bias", "_modeling_layer.encoder_block_4._conv_norm_layers.0.weight", "_modeling_layer.encoder_block_4._conv_norm_layers.0.bias", "_modeling_layer.encoder_block_4._conv_norm_layers.1.weight", "_modeling_layer.encoder_block_4._conv_norm_layers.1.bias", "_modeling_layer.encoder_block_4._conv_layers.0.1.weight", "_modeling_layer.encoder_block_4._conv_layers.0.1.bias", "_modeling_layer.encoder_block_4._conv_layers.0.2.weight", "_modeling_layer.encoder_block_4._conv_layers.0.2.bias", "_modeling_layer.encoder_block_4._conv_layers.1.1.weight", "_modeling_layer.encoder_block_4._conv_layers.1.1.bias", "_modeling_layer.encoder_block_4._conv_layers.1.2.weight", "_modeling_layer.encoder_block_4._conv_layers.1.2.bias", "_modeling_layer.encoder_block_4.attention_norm_layer.weight", "_modeling_layer.encoder_block_4.attention_norm_layer.bias", "_modeling_layer.encoder_block_4.attention_layer._combined_projection.weight", "_modeling_layer.encoder_block_4.attention_layer._combined_projection.bias", "_modeling_layer.encoder_block_4.attention_layer._output_projection.weight", "_modeling_layer.encoder_block_4.attention_layer._output_projection.bias", "_modeling_layer.encoder_block_4.feedforward_norm_layer.weight", "_modeling_layer.encoder_block_4.feedforward_norm_layer.bias", "_modeling_layer.encoder_block_4.feedforward._linear_layers.0.weight", 
"_modeling_layer.encoder_block_4.feedforward._linear_layers.0.bias", "_modeling_layer.encoder_block_4.feedforward._linear_layers.1.weight", "_modeling_layer.encoder_block_4.feedforward._linear_layers.1.bias", "_modeling_layer.encoder_block_5._conv_norm_layers.0.weight", "_modeling_layer.encoder_block_5._conv_norm_layers.0.bias", "_modeling_layer.encoder_block_5._conv_norm_layers.1.weight", "_modeling_layer.encoder_block_5._conv_norm_layers.1.bias", "_modeling_layer.encoder_block_5._conv_layers.0.1.weight", "_modeling_layer.encoder_block_5._conv_layers.0.1.bias", "_modeling_layer.encoder_block_5._conv_layers.0.2.weight", "_modeling_layer.encoder_block_5._conv_layers.0.2.bias", "_modeling_layer.encoder_block_5._conv_layers.1.1.weight", "_modeling_layer.encoder_block_5._conv_layers.1.1.bias", "_modeling_layer.encoder_block_5._conv_layers.1.2.weight", "_modeling_layer.encoder_block_5._conv_layers.1.2.bias", "_modeling_layer.encoder_block_5.attention_norm_layer.weight", "_modeling_layer.encoder_block_5.attention_norm_layer.bias", "_modeling_layer.encoder_block_5.attention_layer._combined_projection.weight", "_modeling_layer.encoder_block_5.attention_layer._combined_projection.bias", "_modeling_layer.encoder_block_5.attention_layer._output_projection.weight", "_modeling_layer.encoder_block_5.attention_layer._output_projection.bias", "_modeling_layer.encoder_block_5.feedforward_norm_layer.weight", "_modeling_layer.encoder_block_5.feedforward_norm_layer.bias", "_modeling_layer.encoder_block_5.feedforward._linear_layers.0.weight", "_modeling_layer.encoder_block_5.feedforward._linear_layers.0.bias", "_modeling_layer.encoder_block_5.feedforward._linear_layers.1.weight", "_modeling_layer.encoder_block_5.feedforward._linear_layers.1.bias", "_modeling_layer.encoder_block_6._conv_norm_layers.0.weight", "_modeling_layer.encoder_block_6._conv_norm_layers.0.bias", "_modeling_layer.encoder_block_6._conv_norm_layers.1.weight", 
"_modeling_layer.encoder_block_6._conv_norm_layers.1.bias", "_modeling_layer.encoder_block_6._conv_layers.0.1.weight", "_modeling_layer.encoder_block_6._conv_layers.0.1.bias", "_modeling_layer.encoder_block_6._conv_layers.0.2.weight", "_modeling_layer.encoder_block_6._conv_layers.0.2.bias", "_modeling_layer.encoder_block_6._conv_layers.1.1.weight", "_modeling_layer.encoder_block_6._conv_layers.1.1.bias", "_modeling_layer.encoder_block_6._conv_layers.1.2.weight", "_modeling_layer.encoder_block_6._conv_layers.1.2.bias", "_modeling_layer.encoder_block_6.attention_norm_layer.weight", "_modeling_layer.encoder_block_6.attention_norm_layer.bias", "_modeling_layer.encoder_block_6.attention_layer._combined_projection.weight", "_modeling_layer.encoder_block_6.attention_layer._combined_projection.bias", "_modeling_layer.encoder_block_6.attention_layer._output_projection.weight", "_modeling_layer.encoder_block_6.attention_layer._output_projection.bias", "_modeling_layer.encoder_block_6.feedforward_norm_layer.weight", "_modeling_layer.encoder_block_6.feedforward_norm_layer.bias", "_modeling_layer.encoder_block_6.feedforward._linear_layers.0.weight", "_modeling_layer.encoder_block_6.feedforward._linear_layers.0.bias", "_modeling_layer.encoder_block_6.feedforward._linear_layers.1.weight", "_modeling_layer.encoder_block_6.feedforward._linear_layers.1.bias".
schmmd commented 5 years ago

@matt-gardner who should I talk to about the QANet model?

matt-gardner commented 5 years ago

This is a duplicate of #2772 and should have been fixed by #2773. You are probably using an old model file.
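
For anyone hitting this with an archive they cannot re-download: the mismatch above is purely a naming change. The old checkpoints store keys like "_phrase_layer.encoder_block_0...." (blocks registered individually), while the current model expects "_phrase_layer._encoder_blocks.0...." (blocks in a ModuleList). A minimal sketch of a key remap, assuming the weight tensors themselves are unchanged (this is a workaround idea, not something suggested in the thread; re-downloading a current archive is the supported fix):

```python
import re

def remap_qanet_keys(state_dict):
    """Rename old-style 'encoder_block_<i>.' keys to the current
    '_encoder_blocks.<i>.' ModuleList naming, leaving other keys alone."""
    remapped = {}
    for key, tensor in state_dict.items():
        # e.g. "_modeling_layer.encoder_block_3.feedforward_norm_layer.weight"
        #   -> "_modeling_layer._encoder_blocks.3.feedforward_norm_layer.weight"
        new_key = re.sub(r"encoder_block_(\d+)\.", r"_encoder_blocks.\1.", key)
        remapped[new_key] = tensor
    return remapped

# Hypothetical usage, loading raw weights from an old archive's weights file:
#   model_state = torch.load(weights_file, map_location="cpu")
#   model.load_state_dict(remap_qanet_keys(model_state))
```

Only keys matching the `encoder_block_<i>.` pattern are rewritten, so embedding and projection weights pass through untouched.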