Closed by schmmd 4 years ago
I know what this is, but to be sure that the whole model still works, I'm retraining it.
This is working on RC5, but the demo needs an update.
$ echo '{"hypothesis": "Two women are sitting on a blanket near some rocks talking about politics.", "premise": "Two women are wandering along the shore drinking iced tea."}' | allennlp predict --predictor textual-entailment https://storage.googleapis.com/allennlp-public-models/snli_roberta-2020.06.09.tar.gz -
2020-06-10 11:07:04,481 - INFO - transformers.file_utils - PyTorch version 1.5.0 available.
2020-06-10 11:07:05,170 - INFO - allennlp.models.archival - loading archive file https://storage.googleapis.com/allennlp-public-models/snli_roberta-2020.06.09.tar.gz from cache at /home/michaels/.allennlp/cache/ac4654b79e3e3499f4004db1032dee531cd5dfe5f762d7369e5b135e1d0943dd.6d05cf1412c6a4849291db3ecba77b2a5367356b56aa8b4036f0e88e243561e8
2020-06-10 11:07:05,171 - INFO - allennlp.models.archival - extracting archive file /home/michaels/.allennlp/cache/ac4654b79e3e3499f4004db1032dee531cd5dfe5f762d7369e5b135e1d0943dd.6d05cf1412c6a4849291db3ecba77b2a5367356b56aa8b4036f0e88e243561e8 to temp dir /tmp/tmpeqwhldni
2020-06-10 11:07:15,074 - INFO - allennlp.common.params - type = from_instances
2020-06-10 11:07:15,074 - INFO - allennlp.data.vocabulary - Loading token dictionary from /tmp/tmpeqwhldni/vocabulary.
2020-06-10 11:07:15,075 - INFO - allennlp.common.params - model.type = basic_classifier
2020-06-10 11:07:15,076 - INFO - allennlp.common.params - model.regularizer = None
2020-06-10 11:07:15,076 - INFO - allennlp.common.params - model.text_field_embedder.type = basic
2020-06-10 11:07:15,076 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.tokens.type = pretrained_transformer
2020-06-10 11:07:15,076 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.tokens.model_name = roberta-large
2020-06-10 11:07:15,076 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.tokens.max_length = 512
2020-06-10 11:07:15,420 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at /home/michaels/.cache/torch/transformers/c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
2020-06-10 11:07:15,421 - INFO - transformers.configuration_utils - Model config RobertaConfig {
"architectures": [
"RobertaForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"bos_token_id": 0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 4096,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 514,
"model_type": "roberta",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"pad_token_id": 1,
"type_vocab_size": 1,
"vocab_size": 50265
}
2020-06-10 11:07:15,683 - INFO - transformers.modeling_utils - loading weights file https://cdn.huggingface.co/roberta-large-pytorch_model.bin from cache at /home/michaels/.cache/torch/transformers/2339ac1858323405dffff5156947669fed6f63a0c34cfab35bda4f78791893d2.fc7abf72755ecc4a75d0d336a93c1c63358d2334f5998ed326f3b0da380bf536
2020-06-10 11:07:25,374 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at /home/michaels/.cache/torch/transformers/c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
2020-06-10 11:07:25,375 - INFO - transformers.configuration_utils - Model config RobertaConfig {
"architectures": [
"RobertaForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"bos_token_id": 0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 4096,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 514,
"model_type": "roberta",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"pad_token_id": 1,
"type_vocab_size": 1,
"vocab_size": 50265
}
2020-06-10 11:07:26,011 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json from cache at /home/michaels/.cache/torch/transformers/1ae1f5b6e2b22b25ccc04c000bb79ca847aa226d0761536b011cf7e5868f0655.ef00af9e673c7160b4d41cfda1f48c5f4cba57d5142754525572a846a1ab1b9b
2020-06-10 11:07:26,012 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt from cache at /home/michaels/.cache/torch/transformers/f8f83199a6270d582d6245dc100e99c4155de81c9745c6248077018fe01abcfb.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
2020-06-10 11:07:26,667 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at /home/michaels/.cache/torch/transformers/c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
2020-06-10 11:07:26,667 - INFO - transformers.configuration_utils - Model config RobertaConfig {
"architectures": [
"RobertaForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"bos_token_id": 0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 4096,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 514,
"model_type": "roberta",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"pad_token_id": 1,
"type_vocab_size": 1,
"vocab_size": 50265
}
2020-06-10 11:07:27,332 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json from cache at /home/michaels/.cache/torch/transformers/1ae1f5b6e2b22b25ccc04c000bb79ca847aa226d0761536b011cf7e5868f0655.ef00af9e673c7160b4d41cfda1f48c5f4cba57d5142754525572a846a1ab1b9b
2020-06-10 11:07:27,333 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt from cache at /home/michaels/.cache/torch/transformers/f8f83199a6270d582d6245dc100e99c4155de81c9745c6248077018fe01abcfb.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
2020-06-10 11:07:27,451 - INFO - allennlp.common.params - model.seq2vec_encoder.type = cls_pooler
2020-06-10 11:07:27,452 - INFO - allennlp.common.params - model.seq2vec_encoder.embedding_dim = 1024
2020-06-10 11:07:27,452 - INFO - allennlp.common.params - model.seq2vec_encoder.cls_is_last_token = False
2020-06-10 11:07:27,452 - INFO - allennlp.common.params - model.seq2seq_encoder = None
2020-06-10 11:07:27,452 - INFO - allennlp.common.params - model.feedforward.input_dim = 1024
2020-06-10 11:07:27,452 - INFO - allennlp.common.params - model.feedforward.num_layers = 1
2020-06-10 11:07:27,452 - INFO - allennlp.common.params - model.feedforward.hidden_dims = 1024
2020-06-10 11:07:27,452 - INFO - allennlp.common.params - model.feedforward.activations = tanh
2020-06-10 11:07:27,452 - INFO - allennlp.common.params - type = tanh
2020-06-10 11:07:27,452 - INFO - allennlp.common.params - model.feedforward.dropout = 0.0
2020-06-10 11:07:27,459 - INFO - allennlp.common.params - model.dropout = 0.1
2020-06-10 11:07:27,459 - INFO - allennlp.common.params - model.num_labels = None
2020-06-10 11:07:27,459 - INFO - allennlp.common.params - model.label_namespace = labels
2020-06-10 11:07:27,459 - INFO - allennlp.common.params - model.namespace = tags
2020-06-10 11:07:27,459 - INFO - allennlp.common.params - model.initializer = <allennlp.nn.initializers.InitializerApplicator object at 0x7f33a1454590>
2020-06-10 11:07:27,459 - INFO - allennlp.nn.initializers - Initializing parameters
2020-06-10 11:07:27,460 - INFO - allennlp.nn.initializers - Done initializing parameters; the following parameters are using their default initialization from their code
2020-06-10 11:07:27,460 - INFO - allennlp.nn.initializers - _classification_layer.bias
2020-06-10 11:07:27,460 - INFO - allennlp.nn.initializers - _classification_layer.weight
2020-06-10 11:07:27,460 - INFO - allennlp.nn.initializers - _feedforward._linear_layers.0.bias
2020-06-10 11:07:27,460 - INFO - allennlp.nn.initializers - _feedforward._linear_layers.0.weight
2020-06-10 11:07:27,460 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.embeddings.LayerNorm.bias
2020-06-10 11:07:27,460 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.embeddings.LayerNorm.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.embeddings.position_embeddings.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.embeddings.token_type_embeddings.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.embeddings.word_embeddings.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.output.LayerNorm.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.output.LayerNorm.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.output.dense.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.output.dense.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.self.key.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.self.key.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.self.query.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.self.query.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.self.value.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.attention.self.value.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.intermediate.dense.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.intermediate.dense.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.output.LayerNorm.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.output.LayerNorm.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.output.dense.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.0.output.dense.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.output.LayerNorm.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.output.LayerNorm.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.output.dense.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.output.dense.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.self.key.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.self.key.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.self.query.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.self.query.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.self.value.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.attention.self.value.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.intermediate.dense.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.intermediate.dense.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.output.LayerNorm.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.output.LayerNorm.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.output.dense.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.1.output.dense.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.output.LayerNorm.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.output.LayerNorm.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.output.dense.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.output.dense.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.self.key.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.self.key.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.self.query.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.self.query.weight
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.self.value.bias
2020-06-10 11:07:27,461 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.attention.self.value.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.intermediate.dense.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.intermediate.dense.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.output.LayerNorm.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.output.LayerNorm.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.output.dense.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.10.output.dense.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.output.LayerNorm.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.output.LayerNorm.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.output.dense.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.output.dense.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.self.key.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.self.key.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.self.query.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.self.query.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.self.value.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.attention.self.value.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.intermediate.dense.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.intermediate.dense.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.output.LayerNorm.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.output.LayerNorm.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.output.dense.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.11.output.dense.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.output.LayerNorm.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.output.LayerNorm.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.output.dense.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.output.dense.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.self.key.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.self.key.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.self.query.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.self.query.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.self.value.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.attention.self.value.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.intermediate.dense.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.intermediate.dense.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.output.LayerNorm.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.output.LayerNorm.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.output.dense.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.12.output.dense.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.output.LayerNorm.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.output.LayerNorm.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.output.dense.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.output.dense.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.self.key.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.self.key.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.self.query.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.self.query.weight
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.self.value.bias
2020-06-10 11:07:27,462 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.attention.self.value.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.intermediate.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.intermediate.dense.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.output.LayerNorm.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.output.LayerNorm.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.output.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.13.output.dense.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.output.LayerNorm.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.output.LayerNorm.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.output.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.output.dense.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.self.key.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.self.key.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.self.query.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.self.query.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.self.value.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.attention.self.value.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.intermediate.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.intermediate.dense.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.output.LayerNorm.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.output.LayerNorm.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.output.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.14.output.dense.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.output.LayerNorm.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.output.LayerNorm.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.output.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.output.dense.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.self.key.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.self.key.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.self.query.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.self.query.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.self.value.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.attention.self.value.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.intermediate.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.intermediate.dense.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.output.LayerNorm.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.output.LayerNorm.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.output.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.15.output.dense.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.output.LayerNorm.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.output.LayerNorm.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.output.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.output.dense.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.self.key.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.self.key.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.self.query.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.self.query.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.self.value.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.attention.self.value.weight
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.intermediate.dense.bias
2020-06-10 11:07:27,463 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.intermediate.dense.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.output.LayerNorm.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.output.LayerNorm.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.output.dense.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.16.output.dense.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.output.LayerNorm.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.output.LayerNorm.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.output.dense.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.output.dense.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.self.key.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.self.key.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.self.query.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.self.query.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.self.value.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.attention.self.value.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.intermediate.dense.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.intermediate.dense.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.output.LayerNorm.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.output.LayerNorm.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.output.dense.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.17.output.dense.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.output.LayerNorm.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.output.LayerNorm.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.output.dense.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.output.dense.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.self.key.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.self.key.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.self.query.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.self.query.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.self.value.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.attention.self.value.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.intermediate.dense.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.intermediate.dense.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.output.LayerNorm.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.output.LayerNorm.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.output.dense.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.18.output.dense.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.output.LayerNorm.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.output.LayerNorm.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.output.dense.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.output.dense.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.self.key.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.self.key.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.self.query.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.self.query.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.self.value.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.attention.self.value.weight
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.intermediate.dense.bias
2020-06-10 11:07:27,464 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.intermediate.dense.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.output.LayerNorm.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.output.LayerNorm.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.output.dense.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.19.output.dense.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.output.LayerNorm.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.output.LayerNorm.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.output.dense.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.output.dense.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.self.key.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.self.key.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.self.query.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.self.query.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.self.value.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.attention.self.value.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.intermediate.dense.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.intermediate.dense.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.output.LayerNorm.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.output.LayerNorm.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.output.dense.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.2.output.dense.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.output.LayerNorm.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.output.LayerNorm.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.output.dense.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.output.dense.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.self.key.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.self.key.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.self.query.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.self.query.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.self.value.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.attention.self.value.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.intermediate.dense.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.intermediate.dense.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.output.LayerNorm.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.output.LayerNorm.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.output.dense.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.20.output.dense.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.output.LayerNorm.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.output.LayerNorm.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.output.dense.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.output.dense.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.self.key.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.self.key.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.self.query.bias
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.self.query.weight
2020-06-10 11:07:27,465 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.self.value.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.attention.self.value.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.intermediate.dense.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.intermediate.dense.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.output.LayerNorm.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.output.LayerNorm.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.output.dense.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.21.output.dense.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.output.LayerNorm.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.output.LayerNorm.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.output.dense.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.output.dense.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.self.key.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.self.key.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.self.query.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.self.query.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.self.value.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.attention.self.value.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.intermediate.dense.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.intermediate.dense.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.output.LayerNorm.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.output.LayerNorm.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.output.dense.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.22.output.dense.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.output.LayerNorm.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.output.LayerNorm.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.output.dense.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.output.dense.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.self.key.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.self.key.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.self.query.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.self.query.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.self.value.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.attention.self.value.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.intermediate.dense.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.intermediate.dense.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.output.LayerNorm.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.output.LayerNorm.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.output.dense.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.23.output.dense.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.output.LayerNorm.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.output.LayerNorm.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.output.dense.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.output.dense.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.self.key.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.self.key.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.self.query.bias
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.self.query.weight
2020-06-10 11:07:27,466 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.self.value.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.attention.self.value.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.intermediate.dense.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.intermediate.dense.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.output.LayerNorm.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.output.LayerNorm.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.output.dense.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.3.output.dense.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.output.LayerNorm.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.output.LayerNorm.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.output.dense.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.output.dense.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.self.key.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.self.key.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.self.query.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.self.query.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.self.value.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.attention.self.value.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.intermediate.dense.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.intermediate.dense.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.output.LayerNorm.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.output.LayerNorm.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.output.dense.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.4.output.dense.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.output.LayerNorm.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.output.LayerNorm.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.output.dense.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.output.dense.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.self.key.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.self.key.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.self.query.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.self.query.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.self.value.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.attention.self.value.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.intermediate.dense.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.intermediate.dense.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.output.LayerNorm.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.output.LayerNorm.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.output.dense.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.5.output.dense.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.output.LayerNorm.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.output.LayerNorm.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.output.dense.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.output.dense.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.self.key.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.self.key.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.self.query.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.self.query.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.self.value.bias
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.attention.self.value.weight
2020-06-10 11:07:27,467 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.intermediate.dense.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.intermediate.dense.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.output.LayerNorm.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.output.LayerNorm.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.output.dense.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.6.output.dense.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.output.LayerNorm.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.output.LayerNorm.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.output.dense.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.output.dense.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.self.key.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.self.key.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.self.query.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.self.query.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.self.value.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.attention.self.value.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.intermediate.dense.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.intermediate.dense.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.output.LayerNorm.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.output.LayerNorm.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.output.dense.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.7.output.dense.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.output.LayerNorm.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.output.LayerNorm.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.output.dense.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.output.dense.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.self.key.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.self.key.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.self.query.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.self.query.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.self.value.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.attention.self.value.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.intermediate.dense.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.intermediate.dense.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.output.LayerNorm.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.output.LayerNorm.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.output.dense.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.8.output.dense.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.output.LayerNorm.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.output.LayerNorm.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.output.dense.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.output.dense.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.self.key.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.self.key.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.self.query.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.self.query.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.self.value.bias
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.attention.self.value.weight
2020-06-10 11:07:27,468 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.intermediate.dense.bias
2020-06-10 11:07:27,469 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.intermediate.dense.weight
2020-06-10 11:07:27,469 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.output.LayerNorm.bias
2020-06-10 11:07:27,469 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.output.LayerNorm.weight
2020-06-10 11:07:27,469 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.output.dense.bias
2020-06-10 11:07:27,469 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.encoder.layer.9.output.dense.weight
2020-06-10 11:07:27,469 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.pooler.dense.bias
2020-06-10 11:07:27,469 - INFO - allennlp.nn.initializers - _text_field_embedder.token_embedder_tokens.transformer_model.pooler.dense.weight
2020-06-10 11:07:28,436 - INFO - allennlp.common.params - dataset_reader.type = snli
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.lazy = False
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.cache_directory = None
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.max_instances = None
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.manual_distributed_sharding = False
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.tokenizer.type = pretrained_transformer
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.tokenizer.model_name = roberta-large
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.tokenizer.add_special_tokens = False
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.tokenizer.max_length = None
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.tokenizer.stride = 0
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.tokenizer.truncation_strategy = longest_first
2020-06-10 11:07:28,437 - INFO - allennlp.common.params - dataset_reader.tokenizer.tokenizer_kwargs = None
2020-06-10 11:07:28,756 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at /home/michaels/.cache/torch/transformers/c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
2020-06-10 11:07:28,756 - INFO - transformers.configuration_utils - Model config RobertaConfig {
"architectures": [
"RobertaForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"bos_token_id": 0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 4096,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 514,
"model_type": "roberta",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"pad_token_id": 1,
"type_vocab_size": 1,
"vocab_size": 50265
}
2020-06-10 11:07:29,405 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json from cache at /home/michaels/.cache/torch/transformers/1ae1f5b6e2b22b25ccc04c000bb79ca847aa226d0761536b011cf7e5868f0655.ef00af9e673c7160b4d41cfda1f48c5f4cba57d5142754525572a846a1ab1b9b
2020-06-10 11:07:29,405 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt from cache at /home/michaels/.cache/torch/transformers/f8f83199a6270d582d6245dc100e99c4155de81c9745c6248077018fe01abcfb.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
2020-06-10 11:07:29,781 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at /home/michaels/.cache/torch/transformers/c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
2020-06-10 11:07:29,782 - INFO - transformers.configuration_utils - Model config RobertaConfig {
"architectures": [
"RobertaForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"bos_token_id": 0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 4096,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 514,
"model_type": "roberta",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"pad_token_id": 1,
"type_vocab_size": 1,
"vocab_size": 50265
}
2020-06-10 11:07:30,515 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json from cache at /home/michaels/.cache/torch/transformers/1ae1f5b6e2b22b25ccc04c000bb79ca847aa226d0761536b011cf7e5868f0655.ef00af9e673c7160b4d41cfda1f48c5f4cba57d5142754525572a846a1ab1b9b
2020-06-10 11:07:30,516 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt from cache at /home/michaels/.cache/torch/transformers/f8f83199a6270d582d6245dc100e99c4155de81c9745c6248077018fe01abcfb.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
2020-06-10 11:07:30,652 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.type = pretrained_transformer
2020-06-10 11:07:30,653 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.token_min_padding_length = 0
2020-06-10 11:07:30,653 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.model_name = roberta-large
2020-06-10 11:07:30,653 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.namespace = tags
2020-06-10 11:07:30,653 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.max_length = 512
2020-06-10 11:07:30,970 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at /home/michaels/.cache/torch/transformers/c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
2020-06-10 11:07:30,971 - INFO - transformers.configuration_utils - Model config RobertaConfig {
"architectures": [
"RobertaForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"bos_token_id": 0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 4096,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 514,
"model_type": "roberta",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"pad_token_id": 1,
"type_vocab_size": 1,
"vocab_size": 50265
}
2020-06-10 11:07:31,609 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json from cache at /home/michaels/.cache/torch/transformers/1ae1f5b6e2b22b25ccc04c000bb79ca847aa226d0761536b011cf7e5868f0655.ef00af9e673c7160b4d41cfda1f48c5f4cba57d5142754525572a846a1ab1b9b
2020-06-10 11:07:31,609 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt from cache at /home/michaels/.cache/torch/transformers/f8f83199a6270d582d6245dc100e99c4155de81c9745c6248077018fe01abcfb.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
2020-06-10 11:07:31,993 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json from cache at /home/michaels/.cache/torch/transformers/c22e0b5bbb7c0cb93a87a2ae01263ae715b4c18d692b1740ce72cacaa99ad184.2d28da311092e99a05f9ee17520204614d60b0bfdb32f8a75644df7737b6a748
2020-06-10 11:07:31,994 - INFO - transformers.configuration_utils - Model config RobertaConfig {
"architectures": [
"RobertaForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"bos_token_id": 0,
"eos_token_id": 2,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 4096,
"layer_norm_eps": 1e-05,
"max_position_embeddings": 514,
"model_type": "roberta",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"pad_token_id": 1,
"type_vocab_size": 1,
"vocab_size": 50265
}
2020-06-10 11:07:32,673 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json from cache at /home/michaels/.cache/torch/transformers/1ae1f5b6e2b22b25ccc04c000bb79ca847aa226d0761536b011cf7e5868f0655.ef00af9e673c7160b4d41cfda1f48c5f4cba57d5142754525572a846a1ab1b9b
2020-06-10 11:07:32,674 - INFO - transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt from cache at /home/michaels/.cache/torch/transformers/f8f83199a6270d582d6245dc100e99c4155de81c9745c6248077018fe01abcfb.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda
2020-06-10 11:07:32,790 - INFO - allennlp.common.params - dataset_reader.combine_input_fields = None
input 0: {"hypothesis": "Two women are sitting on a blanket near some rocks talking about politics.", "premise": "Two women are wandering along the shore drinking iced tea."}
prediction: {"logits": [-3.233438014984131, 4.515608787536621, -1.2487057447433472], "probs": [0.00042962009320035577, 0.9964439272880554, 0.003126387717202306], "token_ids": [0, 1596, 390, 32, 26884, 552, 5, 8373, 4835, 1437, 12646, 6845, 4, 2, 2, 1596, 390, 32, 2828, 15, 10, 14165, 583, 103, 10889, 1686, 59, 2302, 4, 2], "label": "contradiction", "tokens": ["<s>", "\u0120Two", "\u0120women", "\u0120are", "\u0120wandering", "\u0120along", "\u0120the", "\u0120shore", "\u0120drinking", "\u0120", "iced", "\u0120tea", ".", "</s>", "</s>", "\u0120Two", "\u0120women", "\u0120are", "\u0120sitting", "\u0120on", "\u0120a", "\u0120blanket", "\u0120near", "\u0120some", "\u0120rocks", "\u0120talking", "\u0120about", "\u0120politics", ".", "</s>"]}
2020-06-10 11:07:33,242 - INFO - allennlp.models.archival - removing temporary unarchived model dir at /tmp/tmpeqwhldni
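As a quick sanity check on the prediction line above, the logged `probs` are consistent with a plain softmax over the logged `logits`. The sketch below recomputes them with the standard library only. The helper name `softmax` is ours, not taken from AllenNLP, and the only label mapping it relies on is the one the log shows (index 1 is `contradiction`); the order of the other labels is not visible here.

```python
import math

# Values copied from the "prediction:" line in the log above.
logits = [-3.233438014984131, 4.515608787536621, -1.2487057447433472]
logged_probs = [0.00042962009320035577, 0.9964439272880554, 0.003126387717202306]


def softmax(xs):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)
# Index of the highest-probability class.
best = max(range(len(probs)), key=probs.__getitem__)

print(probs)
print("argmax index:", best)
```

The recomputed values agree with the logged ones up to float32 rounding, and the argmax lands on index 1, which the log reports as `contradiction`.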
@dirkgr do we want all the \u escapes in the output? E.g. \u0120Two
Fixed in demo.
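For context on where the `\u0120` comes from: RoBERTa's byte-level BPE vocabulary marks a leading space with the character `Ġ` (U+0120), and Python's `json.dumps` escapes non-ASCII characters by default, so `ĠTwo` is printed as `\u0120Two`. The sketch below shows one way to get readable text back. It is only an illustration; the thread does not show what the demo fix actually does, and `detokenize` is a hypothetical helper, not an AllenNLP function.

```python
import json

# A shortened copy of the "tokens" list from the prediction output above.
raw = r'["<s>", "\u0120Two", "\u0120women", "\u0120are", "\u0120wandering", "</s>"]'
tokens = json.loads(raw)  # "\u0120Two" is decoded to "ĠTwo"

# json.dumps escapes non-ASCII by default, which reproduces the \u0120 sequences.
escaped = json.dumps(tokens)
# ensure_ascii=False keeps the literal "Ġ" marker instead.
unescaped = json.dumps(tokens, ensure_ascii=False)


def detokenize(tokens):
    """Hypothetical helper: drop <s>/</s> and turn the Ġ marker back into a space."""
    special = {"<s>", "</s>"}
    text = "".join(t for t in tokens if t not in special)
    return text.replace("\u0120", " ").strip()


print(escaped)
print(unescaped)
print(detokenize(tokens))
```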