allenai / scibert

A BERT model for scientific text.
https://arxiv.org/abs/1903.10676
Apache License 2.0

NER task reproduce: key error 'text' #31

Closed: haowenke closed this issue 5 years ago

haowenke commented 5 years ago

Hi, first of all, thanks for this great work. I am trying to reproduce the NER task described in your paper, but a key error is raised when I run the following command: ./train_allennlp_local.sh outputs

2019-04-06 23:48:42,209 - INFO - allennlp.common.params - random_seed = 13270
2019-04-06 23:48:42,209 - INFO - allennlp.common.params - numpy_seed = 132
2019-04-06 23:48:42,209 - INFO - allennlp.common.params - pytorch_seed = 1327
2019-04-06 23:48:42,219 - INFO - allennlp.common.checks - Pytorch version: 1.0.1.post2
2019-04-06 23:48:42,221 - INFO - allennlp.common.params - evaluate_on_test = True
2019-04-06 23:48:42,221 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'coding_scheme': 'BIOUL', 'tag_label': 'ner', 'token_indexers': {'bert': {'do_lowercase': 'true', 'pretrained_model': 'scibert_scivocab_uncased/vocab.txt', 'type': 'bert-pretrained', 'use_starting_offsets': True}, 'token_characters': {'min_padding_length': 3, 'type': 'characters'}}, 'type': 'conll2003'} and extras set()
2019-04-06 23:48:42,222 - INFO - allennlp.common.params - dataset_reader.type = conll2003
2019-04-06 23:48:42,222 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.conll2003.Conll2003DatasetReader'> from params {'coding_scheme': 'BIOUL', 'tag_label': 'ner', 'token_indexers': {'bert': {'do_lowercase': 'true', 'pretrained_model': 'scibert_scivocab_uncased/vocab.txt', 'type': 'bert-pretrained', 'use_starting_offsets': True}, 'token_characters': {'min_padding_length': 3, 'type': 'characters'}}} and extras set()
2019-04-06 23:48:42,222 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_indexer.TokenIndexer from params {'do_lowercase': 'true', 'pretrained_model': 'scibert_scivocab_uncased/vocab.txt', 'type': 'bert-pretrained', 'use_starting_offsets': True} and extras set()
2019-04-06 23:48:42,222 - INFO - allennlp.common.params - dataset_reader.token_indexers.bert.type = bert-pretrained
2019-04-06 23:48:42,223 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.wordpiece_indexer.PretrainedBertIndexer from params {'do_lowercase': 'true', 'pretrained_model': 'scibert_scivocab_uncased/vocab.txt', 'use_starting_offsets': True} and extras set()
2019-04-06 23:48:42,223 - INFO - allennlp.common.params - dataset_reader.token_indexers.bert.pretrained_model = scibert_scivocab_uncased/vocab.txt
2019-04-06 23:48:42,223 - INFO - allennlp.common.params - dataset_reader.token_indexers.bert.use_starting_offsets = True
2019-04-06 23:48:42,223 - INFO - allennlp.common.params - dataset_reader.token_indexers.bert.do_lowercase = true
2019-04-06 23:48:42,223 - INFO - allennlp.common.params - dataset_reader.token_indexers.bert.never_lowercase = None
2019-04-06 23:48:42,223 - INFO - allennlp.common.params - dataset_reader.token_indexers.bert.max_pieces = 512
2019-04-06 23:48:42,224 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file scibert_scivocab_uncased/vocab.txt
2019-04-06 23:48:42,279 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_indexer.TokenIndexer from params {'min_padding_length': 3, 'type': 'characters'} and extras set()
2019-04-06 23:48:42,280 - INFO - allennlp.common.params - dataset_reader.token_indexers.token_characters.type = characters
2019-04-06 23:48:42,280 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_characters_indexer.TokenCharactersIndexer from params {'min_padding_length': 3} and extras set()
2019-04-06 23:48:42,280 - INFO - allennlp.common.params - dataset_reader.token_indexers.token_characters.namespace = token_characters
2019-04-06 23:48:42,280 - INFO - allennlp.common.params - dataset_reader.token_indexers.token_characters.start_tokens = None
2019-04-06 23:48:42,280 - INFO - allennlp.common.params - dataset_reader.token_indexers.token_characters.end_tokens = None
2019-04-06 23:48:42,281 - INFO - allennlp.common.params - dataset_reader.token_indexers.token_characters.min_padding_length = 3
2019-04-06 23:48:42,281 - INFO - allennlp.common.params - dataset_reader.tag_label = ner
2019-04-06 23:48:42,281 - INFO - allennlp.common.params - dataset_reader.feature_labels = ()
2019-04-06 23:48:42,281 - INFO - allennlp.common.params - dataset_reader.lazy = False
2019-04-06 23:48:42,281 - INFO - allennlp.common.params - dataset_reader.coding_scheme = BIOUL
2019-04-06 23:48:42,281 - INFO - allennlp.common.params - dataset_reader.label_namespace = labels
2019-04-06 23:48:42,281 - INFO - allennlp.common.params - validation_dataset_reader = None
2019-04-06 23:48:42,282 - INFO - allennlp.common.params - train_data_path = data/ner/NCBI-disease/train.txt
2019-04-06 23:48:42,282 - INFO - allennlp.training.util - Reading training data from data/ner/NCBI-disease/train.txt
0it [00:00, ?it/s]2019-04-06 23:48:42,282 - INFO - allennlp.data.dataset_readers.conll2003 - Reading instances from lines in file at: data/ner/NCBI-disease/train.txt
5424it [00:01, 4663.50it/s]
2019-04-06 23:48:43,446 - INFO - allennlp.common.params - validation_data_path = data/ner/NCBI-disease/dev.txt
2019-04-06 23:48:43,446 - INFO - allennlp.training.util - Reading validation data from data/ner/NCBI-disease/dev.txt
0it [00:00, ?it/s]2019-04-06 23:48:43,446 - INFO - allennlp.data.dataset_readers.conll2003 - Reading instances from lines in file at: data/ner/NCBI-disease/dev.txt
923it [00:00, 3032.06it/s]
2019-04-06 23:48:43,751 - INFO - allennlp.common.params - test_data_path = data/ner/NCBI-disease/test.txt
2019-04-06 23:48:43,751 - INFO - allennlp.training.util - Reading test data from data/ner/NCBI-disease/test.txt
0it [00:00, ?it/s]2019-04-06 23:48:43,751 - INFO - allennlp.data.dataset_readers.conll2003 - Reading instances from lines in file at: data/ner/NCBI-disease/test.txt
940it [00:00, 5391.95it/s]
2019-04-06 23:48:43,926 - INFO - allennlp.training.trainer - From dataset instances, test, validation, train will be considered for vocabulary creation.
2019-04-06 23:48:43,926 - INFO - allennlp.common.params - vocabulary.type = None
2019-04-06 23:48:43,926 - INFO - allennlp.common.params - vocabulary.extend = False
2019-04-06 23:48:43,926 - INFO - allennlp.common.params - vocabulary.directory_path = None
2019-04-06 23:48:43,926 - INFO - allennlp.common.params - vocabulary.min_count = None
2019-04-06 23:48:43,926 - INFO - allennlp.common.params - vocabulary.max_vocab_size = None
2019-04-06 23:48:43,927 - INFO - allennlp.common.params - vocabulary.non_padded_namespaces = ('*tags', '*labels')
2019-04-06 23:48:43,927 - INFO - allennlp.common.params - vocabulary.min_pretrained_embeddings = None
2019-04-06 23:48:43,927 - INFO - allennlp.common.params - vocabulary.only_include_pretrained_words = False
2019-04-06 23:48:43,927 - INFO - allennlp.common.params - vocabulary.tokens_to_add = None
2019-04-06 23:48:43,927 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
7287it [00:02, 3373.96it/s]
2019-04-06 23:48:46,088 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'calculate_span_f1': True, 'constrain_crf_decoding': True, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'dropout': 0.5, 'hidden_size': 200, 'input_size': 896, 'num_layers': 2, 'type': 'lstm'}, 'include_start_end_transitions': False, 'label_encoding': 'BIOUL', 'text_field_embedder': {'allow_unmatched_keys': True, 'embedder_to_indexer_map': {'bert': ['bert', 'bert-offsets'], 'token_characters': ['token_characters']}, 'token_embedders': {'bert': {'pretrained_model': 'scibert_scivocab_uncased/weights.tar.gz', 'type': 'bert-pretrained'}, 'token_characters': {'embedding': {'embedding_dim': 16}, 'encoder': {'conv_layer_activation': 'relu', 'embedding_dim': 16, 'ngram_filter_sizes': [3], 'num_filters': 128, 'type': 'cnn'}, 'type': 'character_encoding'}}}, 'type': 'crf_tagger'} and extras {'vocab'}
2019-04-06 23:48:46,088 - INFO - allennlp.common.params - model.type = crf_tagger
2019-04-06 23:48:46,088 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.crf_tagger.CrfTagger'> from params {'calculate_span_f1': True, 'constrain_crf_decoding': True, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'dropout': 0.5, 'hidden_size': 200, 'input_size': 896, 'num_layers': 2, 'type': 'lstm'}, 'include_start_end_transitions': False, 'label_encoding': 'BIOUL', 'text_field_embedder': {'allow_unmatched_keys': True, 'embedder_to_indexer_map': {'bert': ['bert', 'bert-offsets'], 'token_characters': ['token_characters']}, 'token_embedders': {'bert': {'pretrained_model': 'scibert_scivocab_uncased/weights.tar.gz', 'type': 'bert-pretrained'}, 'token_characters': {'embedding': {'embedding_dim': 16}, 'encoder': {'conv_layer_activation': 'relu', 'embedding_dim': 16, 'ngram_filter_sizes': [3], 'num_filters': 128, 'type': 'cnn'}, 'type': 'character_encoding'}}}} and extras {'vocab'}
2019-04-06 23:48:46,089 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'allow_unmatched_keys': True, 'embedder_to_indexer_map': {'bert': ['bert', 'bert-offsets'], 'token_characters': ['token_characters']}, 'token_embedders': {'bert': {'pretrained_model': 'scibert_scivocab_uncased/weights.tar.gz', 'type': 'bert-pretrained'}, 'token_characters': {'embedding': {'embedding_dim': 16}, 'encoder': {'conv_layer_activation': 'relu', 'embedding_dim': 16, 'ngram_filter_sizes': [3], 'num_filters': 128, 'type': 'cnn'}, 'type': 'character_encoding'}}} and extras {'vocab'}
2019-04-06 23:48:46,089 - INFO - allennlp.common.params - model.text_field_embedder.type = basic
2019-04-06 23:48:46,089 - INFO - allennlp.common.params - model.text_field_embedder.allow_unmatched_keys = True
2019-04-06 23:48:46,089 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'pretrained_model': 'scibert_scivocab_uncased/weights.tar.gz', 'type': 'bert-pretrained'} and extras {'vocab'}
2019-04-06 23:48:46,090 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.type = bert-pretrained
2019-04-06 23:48:46,090 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.bert_token_embedder.PretrainedBertEmbedder'> from params {'pretrained_model': 'scibert_scivocab_uncased/weights.tar.gz'} and extras {'vocab'}
2019-04-06 23:48:46,090 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.pretrained_model = scibert_scivocab_uncased/weights.tar.gz
2019-04-06 23:48:46,090 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.requires_grad = False
2019-04-06 23:48:46,090 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.top_layer_only = False
2019-04-06 23:48:46,091 - INFO - pytorch_pretrained_bert.modeling - loading archive file scibert_scivocab_uncased/weights.tar.gz
2019-04-06 23:48:46,092 - INFO - pytorch_pretrained_bert.modeling - extracting archive file scibert_scivocab_uncased/weights.tar.gz to temp dir /tmp/tmpe17shibm
2019-04-06 23:48:52,135 - INFO - pytorch_pretrained_bert.modeling - Model config {
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "type_vocab_size": 2,
  "vocab_size": 31090
}

2019-04-06 23:48:58,080 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding': {'embedding_dim': 16}, 'encoder': {'conv_layer_activation': 'relu', 'embedding_dim': 16, 'ngram_filter_sizes': [3], 'num_filters': 128, 'type': 'cnn'}, 'type': 'character_encoding'} and extras {'vocab'}
2019-04-06 23:48:58,081 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.type = character_encoding
2019-04-06 23:48:58,081 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.num_embeddings = None
2019-04-06 23:48:58,081 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.vocab_namespace = token_characters
2019-04-06 23:48:58,081 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.embedding_dim = 16
2019-04-06 23:48:58,081 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.pretrained_file = None
2019-04-06 23:48:58,081 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.projection_dim = None
2019-04-06 23:48:58,082 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.trainable = True
2019-04-06 23:48:58,082 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.padding_index = None
2019-04-06 23:48:58,082 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.max_norm = None
2019-04-06 23:48:58,082 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.norm_type = 2.0
2019-04-06 23:48:58,082 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.scale_grad_by_freq = False
2019-04-06 23:48:58,082 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.embedding.sparse = False
2019-04-06 23:48:58,083 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder'> from params {'conv_layer_activation': 'relu', 'embedding_dim': 16, 'ngram_filter_sizes': [3], 'num_filters': 128, 'type': 'cnn'} and extras set()
2019-04-06 23:48:58,083 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.encoder.type = cnn
2019-04-06 23:48:58,083 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2vec_encoders.cnn_encoder.CnnEncoder'> from params {'conv_layer_activation': 'relu', 'embedding_dim': 16, 'ngram_filter_sizes': [3], 'num_filters': 128} and extras set()
2019-04-06 23:48:58,083 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.encoder.embedding_dim = 16
2019-04-06 23:48:58,083 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.encoder.num_filters = 128
2019-04-06 23:48:58,084 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.encoder.ngram_filter_sizes = [3]
2019-04-06 23:48:58,084 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.encoder.conv_layer_activation = relu
2019-04-06 23:48:58,084 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.encoder.output_dim = None
2019-04-06 23:48:58,087 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.token_characters.dropout = 0.0
2019-04-06 23:48:58,088 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'dropout': 0.5, 'hidden_size': 200, 'input_size': 896, 'num_layers': 2, 'type': 'lstm'} and extras {'vocab'}
2019-04-06 23:48:58,088 - INFO - allennlp.common.params - model.encoder.type = lstm
2019-04-06 23:48:58,088 - INFO - allennlp.common.params - model.encoder.batch_first = True
2019-04-06 23:48:58,089 - INFO - allennlp.common.params - model.encoder.stateful = False
2019-04-06 23:48:58,089 - INFO - allennlp.common.params - Converting Params object to dict; logging of default values will not occur when dictionary parameters are used subsequently.
2019-04-06 23:48:58,089 - INFO - allennlp.common.params - CURRENTLY DEFINED PARAMETERS: 
2019-04-06 23:48:58,089 - INFO - allennlp.common.params - model.encoder.bidirectional = True
2019-04-06 23:48:58,089 - INFO - allennlp.common.params - model.encoder.dropout = 0.5
2019-04-06 23:48:58,089 - INFO - allennlp.common.params - model.encoder.hidden_size = 200
2019-04-06 23:48:58,089 - INFO - allennlp.common.params - model.encoder.input_size = 896
2019-04-06 23:48:58,089 - INFO - allennlp.common.params - model.encoder.num_layers = 2
2019-04-06 23:48:58,089 - INFO - allennlp.common.params - model.encoder.batch_first = True
2019-04-06 23:48:58,128 - INFO - allennlp.common.params - model.label_namespace = labels
2019-04-06 23:48:58,128 - INFO - allennlp.common.params - model.label_encoding = BIOUL
2019-04-06 23:48:58,129 - INFO - allennlp.common.params - model.include_start_end_transitions = False
2019-04-06 23:48:58,129 - INFO - allennlp.common.params - model.constrain_crf_decoding = True
2019-04-06 23:48:58,129 - INFO - allennlp.common.params - model.calculate_span_f1 = True
2019-04-06 23:48:58,129 - INFO - allennlp.common.params - model.dropout = 0.5
2019-04-06 23:48:58,129 - INFO - allennlp.common.params - model.verbose_metrics = False
2019-04-06 23:48:58,131 - INFO - allennlp.nn.initializers - Initializing parameters
2019-04-06 23:48:58,132 - INFO - allennlp.nn.initializers - Done initializing parameters; the following parameters are using their default initialization from their code
2019-04-06 23:48:58,133 - INFO - allennlp.nn.initializers -    crf._constraint_mask
2019-04-06 23:48:58,133 - INFO - allennlp.nn.initializers -    crf.transitions
2019-04-06 23:48:58,133 - INFO - allennlp.nn.initializers -    encoder._module.bias_hh_l0
2019-04-06 23:48:58,133 - INFO - allennlp.nn.initializers -    encoder._module.bias_hh_l0_reverse
2019-04-06 23:48:58,133 - INFO - allennlp.nn.initializers -    encoder._module.bias_hh_l1
2019-04-06 23:48:58,133 - INFO - allennlp.nn.initializers -    encoder._module.bias_hh_l1_reverse
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.bias_ih_l0
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.bias_ih_l0_reverse
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.bias_ih_l1
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.bias_ih_l1_reverse
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.weight_hh_l0
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.weight_hh_l0_reverse
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.weight_hh_l1
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.weight_hh_l1_reverse
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.weight_ih_l0
2019-04-06 23:48:58,134 - INFO - allennlp.nn.initializers -    encoder._module.weight_ih_l0_reverse
2019-04-06 23:48:58,135 - INFO - allennlp.nn.initializers -    encoder._module.weight_ih_l1
2019-04-06 23:48:58,135 - INFO - allennlp.nn.initializers -    encoder._module.weight_ih_l1_reverse
2019-04-06 23:48:58,135 - INFO - allennlp.nn.initializers -    tag_projection_layer._module.bias
2019-04-06 23:48:58,135 - INFO - allennlp.nn.initializers -    tag_projection_layer._module.weight
2019-04-06 23:48:58,135 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.gamma
2019-04-06 23:48:58,135 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.0
2019-04-06 23:48:58,135 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.1
2019-04-06 23:48:58,135 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.10
2019-04-06 23:48:58,135 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.11
2019-04-06 23:48:58,136 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.2
2019-04-06 23:48:58,136 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.3
2019-04-06 23:48:58,136 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.4
2019-04-06 23:48:58,136 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.5
2019-04-06 23:48:58,136 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.6
2019-04-06 23:48:58,136 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.7
2019-04-06 23:48:58,136 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.8
2019-04-06 23:48:58,136 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.9
2019-04-06 23:48:58,136 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.LayerNorm.bias
2019-04-06 23:48:58,137 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.LayerNorm.weight
2019-04-06 23:48:58,137 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.position_embeddings.weight
2019-04-06 23:48:58,137 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.token_type_embeddings.weight
2019-04-06 23:48:58,137 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.word_embeddings.weight
2019-04-06 23:48:58,137 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.LayerNorm.bias
2019-04-06 23:48:58,137 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.LayerNorm.weight
2019-04-06 23:48:58,137 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.dense.bias
2019-04-06 23:48:58,137 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.dense.weight
2019-04-06 23:48:58,137 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.key.bias
2019-04-06 23:48:58,138 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.key.weight
2019-04-06 23:48:58,138 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.query.bias
2019-04-06 23:48:58,138 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.query.weight
2019-04-06 23:48:58,138 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.value.bias
2019-04-06 23:48:58,138 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.value.weight
2019-04-06 23:48:58,138 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.intermediate.dense.bias
2019-04-06 23:48:58,138 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.intermediate.dense.weight
2019-04-06 23:48:58,138 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.LayerNorm.bias
2019-04-06 23:48:58,138 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.LayerNorm.weight
2019-04-06 23:48:58,139 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.dense.bias
2019-04-06 23:48:58,139 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.dense.weight
2019-04-06 23:48:58,139 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.output.LayerNorm.bias
2019-04-06 23:48:58,139 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.output.LayerNorm.weight
2019-04-06 23:48:58,139 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.output.dense.bias
2019-04-06 23:48:58,139 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.output.dense.weight
2019-04-06 23:48:58,139 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.key.bias
2019-04-06 23:48:58,139 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.key.weight
2019-04-06 23:48:58,139 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.query.bias
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.query.weight
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.value.bias
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.value.weight
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.intermediate.dense.bias
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.intermediate.dense.weight
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.output.LayerNorm.bias
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.output.LayerNorm.weight
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.output.dense.bias
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.output.dense.weight
2019-04-06 23:48:58,140 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.output.LayerNorm.bias
2019-04-06 23:48:58,141 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.output.LayerNorm.weight
2019-04-06 23:48:58,141 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.output.dense.bias
2019-04-06 23:48:58,141 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.output.dense.weight
2019-04-06 23:48:58,141 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.key.bias
2019-04-06 23:48:58,141 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.key.weight
2019-04-06 23:48:58,141 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.query.bias
2019-04-06 23:48:58,141 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.query.weight
2019-04-06 23:48:58,141 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.value.bias
2019-04-06 23:48:58,141 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.value.weight
2019-04-06 23:48:58,142 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.intermediate.dense.bias
2019-04-06 23:48:58,142 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.intermediate.dense.weight
2019-04-06 23:48:58,142 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.output.LayerNorm.bias
2019-04-06 23:48:58,142 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.output.LayerNorm.weight
2019-04-06 23:48:58,142 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.output.dense.bias
2019-04-06 23:48:58,142 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.output.dense.weight
2019-04-06 23:48:58,142 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.output.LayerNorm.bias
2019-04-06 23:48:58,142 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.output.LayerNorm.weight
2019-04-06 23:48:58,142 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.output.dense.bias
2019-04-06 23:48:58,143 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.output.dense.weight
2019-04-06 23:48:58,143 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.key.bias
2019-04-06 23:48:58,143 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.key.weight
2019-04-06 23:48:58,143 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.query.bias
2019-04-06 23:48:58,143 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.query.weight
2019-04-06 23:48:58,143 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.value.bias
2019-04-06 23:48:58,143 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.value.weight
2019-04-06 23:48:58,143 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.intermediate.dense.bias
2019-04-06 23:48:58,143 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.intermediate.dense.weight
2019-04-06 23:48:58,144 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.output.LayerNorm.bias
2019-04-06 23:48:58,144 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.output.LayerNorm.weight
2019-04-06 23:48:58,144 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.output.dense.bias
2019-04-06 23:48:58,144 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.output.dense.weight
2019-04-06 23:48:58,144 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.output.LayerNorm.bias
2019-04-06 23:48:58,144 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.output.LayerNorm.weight
2019-04-06 23:48:58,144 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.output.dense.bias
2019-04-06 23:48:58,144 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.output.dense.weight
2019-04-06 23:48:58,144 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.key.bias
2019-04-06 23:48:58,145 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.key.weight
2019-04-06 23:48:58,145 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.query.bias
2019-04-06 23:48:58,145 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.query.weight
2019-04-06 23:48:58,145 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.value.bias
2019-04-06 23:48:58,145 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.value.weight
2019-04-06 23:48:58,145 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.intermediate.dense.bias
2019-04-06 23:48:58,145 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.intermediate.dense.weight
2019-04-06 23:48:58,145 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.output.LayerNorm.bias
2019-04-06 23:48:58,145 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.output.LayerNorm.weight
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.output.dense.bias
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.output.dense.weight
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.output.LayerNorm.bias
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.output.LayerNorm.weight
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.output.dense.bias
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.output.dense.weight
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.key.bias
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.key.weight
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.query.bias
2019-04-06 23:48:58,146 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.query.weight
2019-04-06 23:48:58,147 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.value.bias
2019-04-06 23:48:58,147 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.value.weight
2019-04-06 23:48:58,147 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.intermediate.dense.bias
2019-04-06 23:48:58,147 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.intermediate.dense.weight
2019-04-06 23:48:58,147 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.output.LayerNorm.bias
2019-04-06 23:48:58,147 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.output.LayerNorm.weight
2019-04-06 23:48:58,147 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.output.dense.bias
2019-04-06 23:48:58,147 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.output.dense.weight
2019-04-06 23:48:58,147 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.output.LayerNorm.bias
2019-04-06 23:48:58,148 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.output.LayerNorm.weight
2019-04-06 23:48:58,148 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.output.dense.bias
2019-04-06 23:48:58,148 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.output.dense.weight
2019-04-06 23:48:58,148 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.key.bias
2019-04-06 23:48:58,148 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.key.weight
2019-04-06 23:48:58,148 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.query.bias
2019-04-06 23:48:58,148 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.query.weight
2019-04-06 23:48:58,148 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.value.bias
2019-04-06 23:48:58,148 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.value.weight
2019-04-06 23:48:58,149 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.intermediate.dense.bias
2019-04-06 23:48:58,149 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.intermediate.dense.weight
2019-04-06 23:48:58,149 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.output.LayerNorm.bias
2019-04-06 23:48:58,149 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.output.LayerNorm.weight
2019-04-06 23:48:58,149 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.output.dense.bias
2019-04-06 23:48:58,149 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.output.dense.weight
2019-04-06 23:48:58,149 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.output.LayerNorm.bias
2019-04-06 23:48:58,149 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.output.LayerNorm.weight
2019-04-06 23:48:58,149 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.output.dense.bias
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.output.dense.weight
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.key.bias
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.key.weight
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.query.bias
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.query.weight
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.value.bias
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.value.weight
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.intermediate.dense.bias
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.intermediate.dense.weight
2019-04-06 23:48:58,150 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.output.LayerNorm.bias
2019-04-06 23:48:58,151 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.output.LayerNorm.weight
2019-04-06 23:48:58,151 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.output.dense.bias
2019-04-06 23:48:58,151 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.output.dense.weight
2019-04-06 23:48:58,151 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.output.LayerNorm.bias
2019-04-06 23:48:58,151 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.output.LayerNorm.weight
2019-04-06 23:48:58,151 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.output.dense.bias
2019-04-06 23:48:58,151 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.output.dense.weight
2019-04-06 23:48:58,151 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.key.bias
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.key.weight
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.query.bias
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.query.weight
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.value.bias
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.value.weight
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.intermediate.dense.bias
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.intermediate.dense.weight
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.output.LayerNorm.bias
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.output.LayerNorm.weight
2019-04-06 23:48:58,152 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.output.dense.bias
2019-04-06 23:48:58,153 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.output.dense.weight
2019-04-06 23:48:58,153 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.output.LayerNorm.bias
2019-04-06 23:48:58,153 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.output.LayerNorm.weight
2019-04-06 23:48:58,153 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.output.dense.bias
2019-04-06 23:48:58,153 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.output.dense.weight
2019-04-06 23:48:58,153 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.key.bias
2019-04-06 23:48:58,153 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.key.weight
2019-04-06 23:48:58,153 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.query.bias
2019-04-06 23:48:58,153 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.query.weight
2019-04-06 23:48:58,154 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.value.bias
2019-04-06 23:48:58,154 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.value.weight
2019-04-06 23:48:58,154 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.intermediate.dense.bias
2019-04-06 23:48:58,154 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.intermediate.dense.weight
2019-04-06 23:48:58,154 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.output.LayerNorm.bias
2019-04-06 23:48:58,154 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.output.LayerNorm.weight
2019-04-06 23:48:58,154 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.output.dense.bias
2019-04-06 23:48:58,154 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.output.dense.weight
2019-04-06 23:48:58,154 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.output.LayerNorm.bias
2019-04-06 23:48:58,155 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.output.LayerNorm.weight
2019-04-06 23:48:58,155 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.output.dense.bias
2019-04-06 23:48:58,155 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.output.dense.weight
2019-04-06 23:48:58,155 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.key.bias
2019-04-06 23:48:58,155 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.key.weight
2019-04-06 23:48:58,155 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.query.bias
2019-04-06 23:48:58,155 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.query.weight
2019-04-06 23:48:58,155 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.value.bias
2019-04-06 23:48:58,155 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.value.weight
2019-04-06 23:48:58,156 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.intermediate.dense.bias
2019-04-06 23:48:58,156 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.intermediate.dense.weight
2019-04-06 23:48:58,156 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.output.LayerNorm.bias
2019-04-06 23:48:58,156 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.output.LayerNorm.weight
2019-04-06 23:48:58,156 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.output.dense.bias
2019-04-06 23:48:58,156 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.output.dense.weight
2019-04-06 23:48:58,156 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.output.LayerNorm.bias
2019-04-06 23:48:58,156 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.output.LayerNorm.weight
2019-04-06 23:48:58,156 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.output.dense.bias
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.output.dense.weight
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.key.bias
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.key.weight
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.query.bias
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.query.weight
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.value.bias
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.value.weight
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.intermediate.dense.bias
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.intermediate.dense.weight
2019-04-06 23:48:58,157 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.output.LayerNorm.bias
2019-04-06 23:48:58,158 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.output.LayerNorm.weight
2019-04-06 23:48:58,158 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.output.dense.bias
2019-04-06 23:48:58,158 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.output.dense.weight
2019-04-06 23:48:58,158 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.pooler.dense.bias
2019-04-06 23:48:58,158 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.pooler.dense.weight
2019-04-06 23:48:58,158 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_token_characters._embedding._module.weight
2019-04-06 23:48:58,158 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_token_characters._encoder._module.conv_layer_0.bias
2019-04-06 23:48:58,158 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_token_characters._encoder._module.conv_layer_0.weight
2019-04-06 23:48:58,161 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 64, 'sorting_keys': [['text', 'num_tokens']], 'type': 'bucket'} and extras set()
2019-04-06 23:48:58,162 - INFO - allennlp.common.params - iterator.type = bucket
2019-04-06 23:48:58,162 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.bucket_iterator.BucketIterator'> from params {'batch_size': 64, 'sorting_keys': [['text', 'num_tokens']]} and extras set()
2019-04-06 23:48:58,162 - INFO - allennlp.common.params - iterator.sorting_keys = [['text', 'num_tokens']]
2019-04-06 23:48:58,162 - INFO - allennlp.common.params - iterator.padding_noise = 0.1
2019-04-06 23:48:58,162 - INFO - allennlp.common.params - iterator.biggest_batch_first = False
2019-04-06 23:48:58,163 - INFO - allennlp.common.params - iterator.batch_size = 64
2019-04-06 23:48:58,163 - INFO - allennlp.common.params - iterator.instances_per_epoch = None
2019-04-06 23:48:58,163 - INFO - allennlp.common.params - iterator.max_instances_in_memory = None
2019-04-06 23:48:58,163 - INFO - allennlp.common.params - iterator.cache_instances = False
2019-04-06 23:48:58,163 - INFO - allennlp.common.params - iterator.track_epoch = False
2019-04-06 23:48:58,163 - INFO - allennlp.common.params - iterator.maximum_samples_per_batch = None
2019-04-06 23:48:58,163 - INFO - allennlp.common.params - validation_iterator = None
2019-04-06 23:48:58,164 - INFO - allennlp.common.params - trainer.no_grad = ()
2019-04-06 23:48:58,166 - INFO - allennlp.training.trainer - Following parameters are Frozen  (without gradient):
2019-04-06 23:48:58,167 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.embeddings.word_embeddings.weight
2019-04-06 23:48:58,167 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.embeddings.position_embeddings.weight
2019-04-06 23:48:58,167 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.embeddings.token_type_embeddings.weight
2019-04-06 23:48:58,167 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.embeddings.LayerNorm.weight
2019-04-06 23:48:58,167 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.embeddings.LayerNorm.bias
2019-04-06 23:48:58,167 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.query.weight
2019-04-06 23:48:58,167 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.query.bias
2019-04-06 23:48:58,167 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.key.weight
2019-04-06 23:48:58,167 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.key.bias
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.value.weight
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.value.bias
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.dense.weight
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.dense.bias
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.LayerNorm.weight
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.LayerNorm.bias
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.intermediate.dense.weight
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.intermediate.dense.bias
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.dense.weight
2019-04-06 23:48:58,168 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.dense.bias
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.LayerNorm.weight
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.LayerNorm.bias
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.query.weight
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.query.bias
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.key.weight
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.key.bias
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.value.weight
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.self.value.bias
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.output.dense.weight
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.output.dense.bias
2019-04-06 23:48:58,169 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.output.LayerNorm.weight
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.attention.output.LayerNorm.bias
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.intermediate.dense.weight
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.intermediate.dense.bias
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.output.dense.weight
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.output.dense.bias
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.output.LayerNorm.weight
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.1.output.LayerNorm.bias
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.query.weight
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.query.bias
2019-04-06 23:48:58,170 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.key.weight
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.key.bias
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.value.weight
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.self.value.bias
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.output.dense.weight
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.output.dense.bias
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.output.LayerNorm.weight
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.attention.output.LayerNorm.bias
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.intermediate.dense.weight
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.intermediate.dense.bias
2019-04-06 23:48:58,171 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.output.dense.weight
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.output.dense.bias
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.output.LayerNorm.weight
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.2.output.LayerNorm.bias
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.query.weight
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.query.bias
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.key.weight
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.key.bias
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.value.weight
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.self.value.bias
2019-04-06 23:48:58,172 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.output.dense.weight
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.output.dense.bias
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.output.LayerNorm.weight
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.attention.output.LayerNorm.bias
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.intermediate.dense.weight
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.intermediate.dense.bias
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.output.dense.weight
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.output.dense.bias
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.output.LayerNorm.weight
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.3.output.LayerNorm.bias
2019-04-06 23:48:58,173 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.query.weight
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.query.bias
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.key.weight
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.key.bias
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.value.weight
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.self.value.bias
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.output.dense.weight
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.output.dense.bias
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.output.LayerNorm.weight
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.attention.output.LayerNorm.bias
2019-04-06 23:48:58,174 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.intermediate.dense.weight
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.intermediate.dense.bias
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.output.dense.weight
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.output.dense.bias
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.output.LayerNorm.weight
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.4.output.LayerNorm.bias
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.query.weight
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.query.bias
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.key.weight
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.key.bias
2019-04-06 23:48:58,175 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.value.weight
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.self.value.bias
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.output.dense.weight
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.output.dense.bias
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.output.LayerNorm.weight
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.attention.output.LayerNorm.bias
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.intermediate.dense.weight
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.intermediate.dense.bias
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.output.dense.weight
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.output.dense.bias
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.output.LayerNorm.weight
2019-04-06 23:48:58,176 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.5.output.LayerNorm.bias
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.query.weight
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.query.bias
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.key.weight
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.key.bias
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.value.weight
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.self.value.bias
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.output.dense.weight
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.output.dense.bias
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.output.LayerNorm.weight
2019-04-06 23:48:58,177 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.attention.output.LayerNorm.bias
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.intermediate.dense.weight
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.intermediate.dense.bias
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.output.dense.weight
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.output.dense.bias
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.output.LayerNorm.weight
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.6.output.LayerNorm.bias
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.query.weight
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.query.bias
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.key.weight
2019-04-06 23:48:58,178 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.key.bias
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.value.weight
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.self.value.bias
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.output.dense.weight
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.output.dense.bias
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.output.LayerNorm.weight
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.attention.output.LayerNorm.bias
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.intermediate.dense.weight
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.intermediate.dense.bias
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.output.dense.weight
2019-04-06 23:48:58,179 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.output.dense.bias
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.output.LayerNorm.weight
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.7.output.LayerNorm.bias
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.query.weight
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.query.bias
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.key.weight
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.key.bias
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.value.weight
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.self.value.bias
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.output.dense.weight
2019-04-06 23:48:58,180 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.output.dense.bias
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.output.LayerNorm.weight
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.attention.output.LayerNorm.bias
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.intermediate.dense.weight
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.intermediate.dense.bias
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.output.dense.weight
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.output.dense.bias
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.output.LayerNorm.weight
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.8.output.LayerNorm.bias
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.query.weight
2019-04-06 23:48:58,181 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.query.bias
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.key.weight
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.key.bias
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.value.weight
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.self.value.bias
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.output.dense.weight
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.output.dense.bias
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.output.LayerNorm.weight
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.attention.output.LayerNorm.bias
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.intermediate.dense.weight
2019-04-06 23:48:58,182 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.intermediate.dense.bias
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.output.dense.weight
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.output.dense.bias
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.output.LayerNorm.weight
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.9.output.LayerNorm.bias
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.query.weight
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.query.bias
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.key.weight
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.key.bias
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.value.weight
2019-04-06 23:48:58,183 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.self.value.bias
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.output.dense.weight
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.output.dense.bias
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.output.LayerNorm.weight
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.attention.output.LayerNorm.bias
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.intermediate.dense.weight
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.intermediate.dense.bias
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.output.dense.weight
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.output.dense.bias
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.output.LayerNorm.weight
2019-04-06 23:48:58,184 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.10.output.LayerNorm.bias
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.query.weight
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.query.bias
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.key.weight
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.key.bias
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.value.weight
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.self.value.bias
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.output.dense.weight
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.output.dense.bias
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.output.LayerNorm.weight
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.attention.output.LayerNorm.bias
2019-04-06 23:48:58,185 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.intermediate.dense.weight
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.intermediate.dense.bias
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.output.dense.weight
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.output.dense.bias
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.output.LayerNorm.weight
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.encoder.layer.11.output.LayerNorm.bias
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.pooler.dense.weight
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert.bert_model.pooler.dense.bias
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - crf._constraint_mask
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - Following parameters are Tunable (with gradient):
2019-04-06 23:48:58,186 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.gamma
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.0
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.1
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.2
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.3
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.4
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.5
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.6
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.7
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.8
2019-04-06 23:48:58,187 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.9
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.10
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_bert._scalar_mix.scalar_parameters.11
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_token_characters._embedding._module.weight
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_token_characters._encoder._module.conv_layer_0.weight
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - text_field_embedder.token_embedder_token_characters._encoder._module.conv_layer_0.bias
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - encoder._module.weight_ih_l0
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - encoder._module.weight_hh_l0
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - encoder._module.bias_ih_l0
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - encoder._module.bias_hh_l0
2019-04-06 23:48:58,188 - INFO - allennlp.training.trainer - encoder._module.weight_ih_l0_reverse
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.weight_hh_l0_reverse
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.bias_ih_l0_reverse
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.bias_hh_l0_reverse
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.weight_ih_l1
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.weight_hh_l1
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.bias_ih_l1
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.bias_hh_l1
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.weight_ih_l1_reverse
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.weight_hh_l1_reverse
2019-04-06 23:48:58,189 - INFO - allennlp.training.trainer - encoder._module.bias_ih_l1_reverse
2019-04-06 23:48:58,190 - INFO - allennlp.training.trainer - encoder._module.bias_hh_l1_reverse
2019-04-06 23:48:58,190 - INFO - allennlp.training.trainer - tag_projection_layer._module.weight
2019-04-06 23:48:58,190 - INFO - allennlp.training.trainer - tag_projection_layer._module.bias
2019-04-06 23:48:58,190 - INFO - allennlp.training.trainer - crf.transitions
2019-04-06 23:48:58,190 - INFO - allennlp.common.params - trainer.patience = 15
2019-04-06 23:48:58,190 - INFO - allennlp.common.params - trainer.validation_metric = +f1-measure-overall
2019-04-06 23:48:58,190 - INFO - allennlp.common.params - trainer.shuffle = True
2019-04-06 23:48:58,190 - INFO - allennlp.common.params - trainer.num_epochs = 75
2019-04-06 23:48:58,190 - INFO - allennlp.common.params - trainer.cuda_device = 0
2019-04-06 23:48:58,190 - INFO - allennlp.common.params - trainer.grad_norm = 5
2019-04-06 23:48:58,191 - INFO - allennlp.common.params - trainer.grad_clipping = None
2019-04-06 23:48:58,191 - INFO - allennlp.common.params - trainer.learning_rate_scheduler = None
2019-04-06 23:48:58,191 - INFO - allennlp.common.params - trainer.momentum_scheduler = None
2019-04-06 23:49:02,117 - INFO - allennlp.common.params - trainer.optimizer.type = adam
2019-04-06 23:49:02,117 - INFO - allennlp.common.params - trainer.optimizer.parameter_groups = None
2019-04-06 23:49:02,117 - INFO - allennlp.training.optimizers - Number of trainable parameters: 2729707
2019-04-06 23:49:02,117 - INFO - allennlp.common.params - trainer.optimizer.infer_type_and_cast = True
2019-04-06 23:49:02,117 - INFO - allennlp.common.params - Converting Params object to dict; logging of default values will not occur when dictionary parameters are used subsequently.
2019-04-06 23:49:02,118 - INFO - allennlp.common.params - CURRENTLY DEFINED PARAMETERS: 
2019-04-06 23:49:02,118 - INFO - allennlp.common.params - trainer.optimizer.lr = 0.001
2019-04-06 23:49:02,118 - INFO - allennlp.common.params - trainer.num_serialized_models_to_keep = 3
2019-04-06 23:49:02,118 - INFO - allennlp.common.params - trainer.keep_serialized_model_every_num_seconds = None
2019-04-06 23:49:02,118 - INFO - allennlp.common.params - trainer.model_save_interval = None
2019-04-06 23:49:02,118 - INFO - allennlp.common.params - trainer.summary_interval = 100
2019-04-06 23:49:02,118 - INFO - allennlp.common.params - trainer.histogram_interval = None
2019-04-06 23:49:02,118 - INFO - allennlp.common.params - trainer.should_log_parameter_statistics = True
2019-04-06 23:49:02,119 - INFO - allennlp.common.params - trainer.should_log_learning_rate = False
2019-04-06 23:49:02,119 - INFO - allennlp.common.params - trainer.log_batch_size_period = None
2019-04-06 23:49:02,178 - INFO - allennlp.training.trainer - Beginning training.
2019-04-06 23:49:02,178 - INFO - allennlp.training.trainer - Epoch 0/74
2019-04-06 23:49:02,178 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 3314.252
2019-04-06 23:49:02,270 - INFO - allennlp.training.trainer - GPU 0 memory usage MB: 1338
2019-04-06 23:49:02,270 - INFO - allennlp.training.trainer - GPU 1 memory usage MB: 8439
2019-04-06 23:49:02,274 - INFO - allennlp.training.trainer - Training
  0%|          | 0/85 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/khw/.conda/envs/allen/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/khw/.conda/envs/allen/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/run.py", line 21, in <module>
    run()
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/run.py", line 18, in run
    main(prog="allennlp")
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 101, in main
    args.func(args)
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/commands/train.py", line 103, in train_model_from_args
    args.force)
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/commands/train.py", line 136, in train_model_from_file
    return train_model(params, serialization_dir, file_friendly_logging, recover, force)
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/commands/train.py", line 204, in train_model
    metrics = trainer.train()
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/training/trainer.py", line 480, in train
    train_metrics = self._train_epoch(epoch)
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/training/trainer.py", line 315, in _train_epoch
    for batch_group in train_generator_tqdm:
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1022, in __iter__
    for obj in iterable:
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/common/util.py", line 105, in <lambda>
    return iter(lambda: list(islice(iterator, 0, group_size)), [])
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/data/iterators/data_iterator.py", line 144, in __call__
    for batch in batches:
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/data/iterators/bucket_iterator.py", line 117, in _create_batches
    self._padding_noise)
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/data/iterators/bucket_iterator.py", line 37, in sort_by_padding
    for (field_name, padding_key) in sorting_keys],
  File "/home/khw/.conda/envs/allen/lib/python3.6/site-packages/allennlp/data/iterators/bucket_iterator.py", line 37, in <listcomp>
    for (field_name, padding_key) in sorting_keys],
KeyError: 'text'

haowenke commented 5 years ago

It seems that the code works fine for parsing and text classification, while the same error (KeyError: 'text') occurs in the ner and pico tasks.
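For anyone reading the traceback: the failing line in `bucket_iterator.py` looks up each `(field_name, padding_key)` pair from `sorting_keys` in the instance's padding lengths. Below is a minimal sketch of that lookup in plain Python, with no AllenNLP import. It assumes the tagging reader names its text field `tokens` while the iterator config asks for `text`; the function and the sample data are illustrative stand-ins, not the library's actual code.

```python
# Illustrative stand-in for the lookup in the traceback; not AllenNLP's code.
def sort_by_padding(instances, sorting_keys):
    """Sort instances by the padding lengths named in sorting_keys."""
    keyed = []
    for padding_lengths in instances:
        # Same shape as the failing list comprehension: a missing field
        # name raises KeyError here.
        key = [padding_lengths[field_name][padding_key]
               for field_name, padding_key in sorting_keys]
        keyed.append((key, padding_lengths))
    keyed.sort(key=lambda pair: pair[0])
    return [instance for _, instance in keyed]


# Hypothetical padding lengths for a reader whose text field is "tokens".
instances = [
    {"tokens": {"num_tokens": 7}, "tags": {"num_tokens": 7}},
    {"tokens": {"num_tokens": 3}, "tags": {"num_tokens": 3}},
]

# Works: the field name matches what the reader produced.
ordered = sort_by_padding(instances, [("tokens", "num_tokens")])

# Fails with KeyError: 'text', because no field is called "text".
try:
    sort_by_padding(instances, [("text", "num_tokens")])
except KeyError as err:
    print("KeyError:", err)
```

If this is the cause, the `sorting_keys` entry in the iterator config would need to name the field the dataset reader actually emits.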

ibeltagy commented 5 years ago

Just pushed a fix. Please let us know if you have other issues. Thanks

purn3ndu commented 5 years ago

Thanks, @ibeltagy. I was stuck on the same problem, and the fix solved it. However, I then ran into another issue:

2019-04-08 18:49:05,620 - INFO - allennlp.training.trainer - Beginning training.
2019-04-08 18:49:05,620 - INFO - allennlp.training.trainer - Epoch 0/74
2019-04-08 18:49:05,620 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 3521.184
2019-04-08 18:49:05,781 - INFO - allennlp.training.trainer - GPU 0 memory usage MB: 2018
2019-04-08 18:49:05,781 - INFO - allennlp.training.trainer - GPU 1 memory usage MB: 1
2019-04-08 18:49:05,781 - INFO - allennlp.training.trainer - GPU 2 memory usage MB: 1
2019-04-08 18:49:05,781 - INFO - allennlp.training.trainer - GPU 3 memory usage MB: 1
2019-04-08 18:49:05,785 - INFO - allennlp.training.trainer - Training
  0%|          | 0/62 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.6/site-packages/allennlp/run.py", line 21, in <module>
    run()
  File "/opt/conda/lib/python3.6/site-packages/allennlp/run.py", line 18, in run
    main(prog="allennlp")
  File "/opt/conda/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 101, in main
    args.func(args)
  File "/opt/conda/lib/python3.6/site-packages/allennlp/commands/train.py", line 103, in train_model_from_args
    args.force)
  File "/opt/conda/lib/python3.6/site-packages/allennlp/commands/train.py", line 136, in train_model_from_file
    return train_model(params, serialization_dir, file_friendly_logging, recover, force)
  File "/opt/conda/lib/python3.6/site-packages/allennlp/commands/train.py", line 204, in train_model
    metrics = trainer.train()
  File "/opt/conda/lib/python3.6/site-packages/allennlp/training/trainer.py", line 480, in train
    train_metrics = self._train_epoch(epoch)
  File "/opt/conda/lib/python3.6/site-packages/allennlp/training/trainer.py", line 322, in _train_epoch
    loss = self.batch_loss(batch_group, for_training=True)
  File "/opt/conda/lib/python3.6/site-packages/allennlp/training/trainer.py", line 263, in batch_loss
    output_dict = self.model(**batch)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/allennlp/models/crf_tagger.py", line 182, in forward
    embedded_text_input = self.text_field_embedder(tokens)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/allennlp/modules/text_field_embedders/basic_text_field_embedder.py", line 110, in forward
    token_vectors = embedder(*tensors)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/allennlp/modules/token_embedders/bert_token_embedder.py", line 91, in forward
    attention_mask=util.combine_initial_dims(input_mask))
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 711, in forward
    embedding_output = self.embeddings(input_ids, token_type_ids)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/pytorch_pretrained_bert/modeling.py", line 262, in forward
    embeddings = self.LayerNorm(embeddings)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/normalization/fused_layer_norm.py", line 149, in forward
    input, self.weight, self.bias)
  File "/opt/conda/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/normalization/fused_layer_norm.py", line 21, in forward
    input_, self.normalized_shape, weight_, bias_, self.eps)
RuntimeError: Undefined backend is not a valid device type (backendToDeviceType at /opt/conda/lib/python3.6/site-packages/torch/include/c10/core/Backend.h:141)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f446433c505 in /opt/conda/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x983b (0x7f44171a183b in /opt/conda/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #2: layer_norm_affine(at::Tensor, c10::ArrayRef<long>, at::Tensor, at::Tensor, double) + 0x58 (0x7f44171a43e8 in /opt/conda/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0x19583 (0x7f44171b1583 in /opt/conda/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x1969e (0x7f44171b169e in /opt/conda/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #5: <unknown function> + 0x168f2 (0x7f44171ae8f2 in /opt/conda/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/fused_layer_norm_cuda.cpython-36m-x86_64-linux-gnu.so)
<omitting python frames>
frame #13: THPFunction_do_forward(THPFunction*, _object*) + 0x189 (0x7f448e54d129 in /opt/conda/lib/python3.6/site-packages/torch/lib/libtorch_python.so)

Any idea on how to fix this?

purn3ndu commented 5 years ago

Never mind, the fix mentioned here fixed it. :) Thanks again.
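
For context, the failing frames in the traceback above pass through apex's `fused_layer_norm`. `pytorch_pretrained_bert` uses apex's `FusedLayerNorm` when apex can be imported and falls back to its own LayerNorm otherwise, so one thing worth checking is whether apex is present in the environment. The linked fix is not reproduced in this thread; the snippet below is only a diagnostic sketch, and `module_available` is a hypothetical helper name.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


# If this prints True, BERT's LayerNorm will be backed by apex's fused kernel.
print("apex installed:", module_available("apex"))
```

Whether removing apex or changing the device configuration is the right remedy depends on the setup; this check only tells you which LayerNorm implementation is in play.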