Update AllenNLP for PyTorch 1.0

The main new warning is:
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
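
For reference, the replacement that the warning recommends can be sketched as follows. This is a minimal illustration, not code from AllenNLP: the names `source`, `detached_copy`, and `trainable_copy` are made up for the example, and which call sites in the codebase need the change is not shown here.

```python
import torch

# Hypothetical source tensor standing in for whatever was being copied.
source = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Pattern that triggers the warning:
#     copy = source.new_tensor(source)

# Recommended form: a copy with its own storage that is detached
# from the autograd graph.
detached_copy = source.clone().detach()

# Recommended form when the copy should itself be a trainable leaf tensor.
trainable_copy = source.clone().detach().requires_grad_(True)

# The copy does not share memory with `source`, so an in-place edit
# leaves the original values untouched.
detached_copy[0] = 100.0
```

Which of the two forms is appropriate depends on whether the copy is meant to receive gradients; `clone().detach()` alone gives a tensor with `requires_grad=False`.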
And here are the failures:
allennlp/tests/nn/util_test.py ............................FF...F.... 
allennlp/tests/models/biaffine_dependency_parser_test.py ...F..
allennlp/tests/models/atis_semantic_parser_test.py .F
allennlp/tests/training/optimizer_test.py ..F

=============================================================================================================================== FAILURES ================================================================================================================================
____________________________________________________________________________________________________ AtisSemanticParserTest.test_atis_model_can_train_save_and_load _____________________________________________________________________________________________________

self = <allennlp.tests.models.atis_semantic_parser_test.AtisSemanticParserTest testMethod=test_atis_model_can_train_save_and_load>

    @flaky
    def test_atis_model_can_train_save_and_load(self):
>       self.ensure_model_can_train_save_and_load(self.param_file)

allennlp/tests/models/atis_semantic_parser_test.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
allennlp/common/testing/model_test_case.py:106: in ensure_model_can_train_save_and_load
    self.check_model_computes_gradients_correctly(model, model_batch, gradients_to_ignore)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

model = AtisSemanticParser(
  (_utterance_embedder): BasicTextFieldEmbedder(
    (token_embedder_tokens): Embedding()
  )
  (_...s=160, out_features=10, bias=True)
    (_decoder_cell): LSTM(80, 80, num_layers=2)
    (_dropout): Dropout(p=0.5)
  )
)
model_batch = {'actions': [[ProductionRule(rule='agg -> [agg_func, "(", col, ")"]', is_global_rule=True, rule_id=tensor([0]), nonter...    [[1174],
         [1103],
         [ 769],
         ...,
         [ 302],
         [ 335],
         [ 432]]]), ...}
params_to_ignore = None

    @staticmethod
    def check_model_computes_gradients_correctly(model: Model,
                                                 model_batch: Dict[str, Union[Any, Dict[str, Any]]],
                                                 params_to_ignore: Set[str] = None):
        print("Checking gradients")
        model.zero_grad()
        result = model(**model_batch)
        result["loss"].backward()
        has_zero_or_none_grads = {}
        for name, parameter in model.named_parameters():
            zeros = torch.zeros(parameter.size())
            if params_to_ignore and name in params_to_ignore:
                continue
            if parameter.requires_grad:

                if parameter.grad is None:
                    has_zero_or_none_grads[name] = "No gradient computed (i.e parameter.grad is None)"

                elif parameter.grad.is_sparse or parameter.grad.data.is_sparse:
                    pass

                # Some parameters will only be partially updated,
                # like embeddings, so we just check that any gradient is non-zero.
                elif (parameter.grad.cpu() == zeros).all():
                    has_zero_or_none_grads[name] = f"zeros with shape ({tuple(parameter.grad.size())})"
            else:
                assert parameter.grad is None

        if has_zero_or_none_grads:
            for name, grad in has_zero_or_none_grads.items():
                print(f"Parameter: {name} had incorrect gradient: {grad}")
>           raise Exception("Incorrect gradients found. See stdout for more info.")
E           Exception: Incorrect gradients found. See stdout for more info.

allennlp/common/testing/model_test_case.py:198: Exception
------------------------------------------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------------------------------------------
Checking gradients
Parameter: _entity_type_decoder_embedding.weight had incorrect gradient: No gradient computed (i.e parameter.grad is None)
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
13it [00:00, 26.42it/s]
100%|██████████| 13/13 [00:00<00:00, 2905.11it/s]
0it [00:00, ?it/s]
13it [00:00, 25.48it/s]

0it [00:00, ?it/s]
13it [00:00, 38.07it/s]

0it [00:00, ?it/s]
26it [00:00, 3036.30it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 210.5952 ||: 100%|##########| 4/4 [00:03<00:00,  1.04it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 156.3982 ||: 100%|##########| 4/4 [00:01<00:00,  2.31it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 160.4649 ||: 100%|##########| 4/4 [00:03<00:00,  1.06it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 164.9378 ||: 100%|##########| 4/4 [00:01<00:00,  2.30it/s]

0it [00:00, ?it/s]
13it [00:00, 25.30it/s]

0it [00:00, ?it/s]
13it [00:00, 25.96it/s]

--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:00:53 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:00:53 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:00:53 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:00:53 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : /home/michael/allennlp/allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:00:54 - INFO - allennlp.commands.train - Using a separate dataset reader to load validation and test data.
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'keep_if_unparseable': True, 'type': 'atis'} and extras {}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'keep_if_unparseable': True} and extras {}
23:00:54 - INFO - allennlp.commands.train - Reading training data from allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - allennlp.commands.train - Reading validation data from allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:00:55 - INFO - allennlp.commands.train - From dataset instances, train, validation will be considered for vocabulary creation.
23:00:55 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:00:55 - INFO - allennlp.commands.train - Following parameters are Frozen  (without gradient):
23:00:55 - INFO - allennlp.commands.train - Following parameters are Tunable (with gradient):
23:00:55 - INFO - allennlp.commands.train - _first_action_embedding
23:00:55 - INFO - allennlp.commands.train - _first_attended_utterance
23:00:55 - INFO - allennlp.commands.train - _utterance_embedder.token_embedder_tokens.weight
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _action_embedder.weight
23:00:55 - INFO - allennlp.commands.train - _output_action_embedder.weight
23:00:55 - INFO - allennlp.commands.train - _entity_type_decoder_embedding.weight
23:00:55 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.weight
23:00:55 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.bias
23:00:55 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.weight
23:00:55 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.bias
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l1
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l1
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l1
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l1
23:00:55 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class 'allennlp.training.trainer.Trainer'>
23:00:55 - INFO - allennlp.training.optimizers - Number of trainable parameters: 139600
23:00:55 - DEBUG - allennlp.common.registrable - instantiating registered subclass sgd of <class 'allennlp.training.optimizers.Optimizer'>
23:00:55 - INFO - allennlp.training.trainer - Beginning training.
23:00:55 - INFO - allennlp.training.trainer - Epoch 0/1
23:00:55 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 601.524
23:00:55 - INFO - allennlp.training.trainer - Training
23:00:55 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:00:55 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:00:56 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 110}}
23:00:56 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:00:57 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:00:57 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:00:58 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:00:58 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:00:59 - INFO - allennlp.training.trainer - Validating
23:00:59 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:00:59 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:00:59 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:00:59 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:00 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:00 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:00 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:00 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:01 - INFO - allennlp.training.trainer -                       Training |  Validation
23:01:01 - INFO - allennlp.training.trainer - denotation_acc    |     0.000  |     0.000
23:01:01 - INFO - allennlp.training.trainer - valid_sql_query   |     0.000  |     0.000
23:01:01 - INFO - allennlp.training.trainer - loss              |   210.595  |   156.398
23:01:01 - INFO - allennlp.training.trainer - action_similarity |     0.000  |     0.000
23:01:01 - INFO - allennlp.training.trainer - exact_match       |     0.000  |     0.000
23:01:01 - INFO - allennlp.training.trainer - Best validation performance so far. Copying weights to '/tmp/allennlp_testsjckp_bka/save_and_load_test/best.th'.
23:01:01 - INFO - allennlp.training.trainer - Epoch duration: 00:00:05
23:01:01 - INFO - allennlp.training.trainer - Estimated training time remaining: 0:00:05
23:01:01 - INFO - allennlp.training.trainer - Epoch 1/1
23:01:01 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 638.16
23:01:01 - INFO - allennlp.training.trainer - Training
23:01:01 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 202}}
23:01:01 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:02 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:01:02 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:03 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:03 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:04 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:04 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:04 - INFO - allennlp.training.trainer - Validating
23:01:04 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:04 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:05 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:05 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:06 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:06 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:06 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:06 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:06 - INFO - allennlp.training.trainer -                       Training |  Validation
23:01:06 - INFO - allennlp.training.trainer - denotation_acc    |     0.000  |     0.000
23:01:06 - INFO - allennlp.training.trainer - valid_sql_query   |     0.000  |     0.000
23:01:06 - INFO - allennlp.training.trainer - loss              |   160.465  |   164.938
23:01:06 - INFO - allennlp.training.trainer - action_similarity |     0.000  |     0.000
23:01:06 - INFO - allennlp.training.trainer - exact_match       |     0.000  |     0.000
23:01:06 - INFO - allennlp.training.trainer - Epoch duration: 00:00:05
23:01:06 - INFO - allennlp.models.archival - archiving weights and vocabulary to /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:06 - INFO - allennlp.commands.train - Loading the best epoch weights.
23:01:06 - INFO - allennlp.common.util - Metrics: {
  "training_duration": "00:00:11",
  "training_start_epoch": 0,
  "training_epochs": 1,
  "epoch": 1,
  "training_exact_match": 0,
  "training_denotation_acc": 0,
  "training_valid_sql_query": 0,
  "training_action_similarity": 0,
  "training_loss": 160.4649143218994,
  "validation_exact_match": 0.0,
  "validation_denotation_acc": 0.0,
  "validation_valid_sql_query": 0.0,
  "validation_action_similarity": 0.0,
  "validation_loss": 164.93781661987305,
  "best_epoch": 0,
  "best_validation_exact_match": 0.0,
  "best_validation_denotation_acc": 0.0,
  "best_validation_valid_sql_query": 0.0,
  "best_validation_action_similarity": 0.0,
  "best_validation_loss": 156.3982391357422
}
23:01:06 - INFO - allennlp.models.archival - loading archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz from cache at /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:06 - INFO - allennlp.models.archival - extracting archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz to temp dir /tmp/tmp2o3wao_0
23:01:06 - DEBUG - allennlp.common.registrable - instantiating registered subclass atis_parser of <class 'allennlp.models.model.Model'>
23:01:06 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class 'allennlp.data.vocabulary.Vocabulary'>
23:01:06 - INFO - allennlp.data.vocabulary - Loading token dictionary from /tmp/tmp2o3wao_0/vocabulary.
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:07 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:07 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:07 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:07 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:08 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:08 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
------------------------------------------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------------------------------------------
Checking gradients
Parameter: _entity_type_decoder_embedding.weight had incorrect gradient: No gradient computed (i.e parameter.grad is None)
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
13it [00:00, 36.74it/s]
100%|██████████| 13/13 [00:00<00:00, 2902.32it/s]
0it [00:00, ?it/s]
13it [00:00, 23.84it/s]

0it [00:00, ?it/s]
13it [00:00, 37.47it/s]

0it [00:00, ?it/s]
26it [00:00, 2951.98it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 210.5952 ||: 100%|##########| 4/4 [00:03<00:00,  1.07it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 156.3982 ||: 100%|##########| 4/4 [00:01<00:00,  2.09it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 160.4649 ||: 100%|##########| 4/4 [00:04<00:00,  1.04s/it]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 164.9378 ||: 100%|##########| 4/4 [00:01<00:00,  2.09it/s]

0it [00:00, ?it/s]
13it [00:00, 39.00it/s]

0it [00:00, ?it/s]
13it [00:00, 24.27it/s]

--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:01:09 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:01:09 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:01:09 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:01:09 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : /home/michael/allennlp/allennlp/tests/fixtures/data/atis/sample.json
23:01:10 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:01:10 - INFO - allennlp.commands.train - Using a separate dataset reader to load validation and test data.
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'keep_if_unparseable': True, 'type': 'atis'} and extras {}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'keep_if_unparseable': True} and extras {}
23:01:10 - INFO - allennlp.commands.train - Reading training data from allennlp/tests/fixtures/data/atis/sample.json
23:01:10 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:10 - INFO - allennlp.commands.train - Reading validation data from allennlp/tests/fixtures/data/atis/sample.json
23:01:11 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:11 - INFO - allennlp.commands.train - From dataset instances, train, validation will be considered for vocabulary creation.
23:01:11 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:11 - INFO - allennlp.commands.train - Following parameters are Frozen  (without gradient):
23:01:11 - INFO - allennlp.commands.train - Following parameters are Tunable (with gradient):
23:01:11 - INFO - allennlp.commands.train - _first_action_embedding
23:01:11 - INFO - allennlp.commands.train - _first_attended_utterance
23:01:11 - INFO - allennlp.commands.train - _utterance_embedder.token_embedder_tokens.weight
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _action_embedder.weight
23:01:11 - INFO - allennlp.commands.train - _output_action_embedder.weight
23:01:11 - INFO - allennlp.commands.train - _entity_type_decoder_embedding.weight
23:01:11 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.weight
23:01:11 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.bias
23:01:11 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.weight
23:01:11 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.bias
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l1
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l1
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l1
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l1
23:01:11 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class 'allennlp.training.trainer.Trainer'>
23:01:11 - INFO - allennlp.training.optimizers - Number of trainable parameters: 139600
23:01:11 - DEBUG - allennlp.common.registrable - instantiating registered subclass sgd of <class 'allennlp.training.optimizers.Optimizer'>
23:01:11 - INFO - allennlp.training.trainer - Beginning training.
23:01:11 - INFO - allennlp.training.trainer - Epoch 0/1
23:01:11 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 684.044
23:01:11 - INFO - allennlp.training.trainer - Training
23:01:11 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:01:11 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:13 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 110}}
23:01:13 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:13 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:13 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:14 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:14 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:15 - INFO - allennlp.training.trainer - Validating
23:01:15 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:15 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:16 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:16 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:16 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:16 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:17 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:17 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:17 - INFO - allennlp.training.trainer -                       Training |  Validation
23:01:17 - INFO - allennlp.training.trainer - denotation_acc    |     0.000  |     0.000
23:01:17 - INFO - allennlp.training.trainer - valid_sql_query   |     0.000  |     0.000
23:01:17 - INFO - allennlp.training.trainer - loss              |   210.595  |   156.398
23:01:17 - INFO - allennlp.training.trainer - action_similarity |     0.000  |     0.000
23:01:17 - INFO - allennlp.training.trainer - exact_match       |     0.000  |     0.000
23:01:17 - INFO - allennlp.training.trainer - Best validation performance so far. Copying weights to '/tmp/allennlp_testsjckp_bka/save_and_load_test/best.th'.
23:01:17 - INFO - allennlp.training.trainer - Epoch duration: 00:00:05
23:01:17 - INFO - allennlp.training.trainer - Estimated training time remaining: 0:00:05
23:01:17 - INFO - allennlp.training.trainer - Epoch 1/1
23:01:17 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 698.0
23:01:17 - INFO - allennlp.training.trainer - Training
23:01:17 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 202}}
23:01:17 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:18 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:01:18 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:20 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:20 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:21 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:21 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:21 - INFO - allennlp.training.trainer - Validating
23:01:21 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:21 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:22 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:22 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:22 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:22 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:23 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:23 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:23 - INFO - allennlp.training.trainer -                       Training |  Validation
23:01:23 - INFO - allennlp.training.trainer - denotation_acc    |     0.000  |     0.000
23:01:23 - INFO - allennlp.training.trainer - valid_sql_query   |     0.000  |     0.000
23:01:23 - INFO - allennlp.training.trainer - loss              |   160.465  |   164.938
23:01:23 - INFO - allennlp.training.trainer - action_similarity |     0.000  |     0.000
23:01:23 - INFO - allennlp.training.trainer - exact_match       |     0.000  |     0.000
23:01:23 - INFO - allennlp.training.trainer - Epoch duration: 00:00:06
23:01:23 - INFO - allennlp.models.archival - archiving weights and vocabulary to /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:23 - INFO - allennlp.commands.train - Loading the best epoch weights.
23:01:23 - INFO - allennlp.common.util - Metrics: {
  "training_duration": "00:00:11",
  "training_start_epoch": 0,
  "training_epochs": 1,
  "epoch": 1,
  "training_exact_match": 0,
  "training_denotation_acc": 0,
  "training_valid_sql_query": 0,
  "training_action_similarity": 0,
  "training_loss": 160.4649257659912,
  "validation_exact_match": 0.0,
  "validation_denotation_acc": 0.0,
  "validation_valid_sql_query": 0.0,
  "validation_action_similarity": 0.0,
  "validation_loss": 164.93781661987305,
  "best_epoch": 0,
  "best_validation_exact_match": 0.0,
  "best_validation_denotation_acc": 0.0,
  "best_validation_valid_sql_query": 0.0,
  "best_validation_action_similarity": 0.0,
  "best_validation_loss": 156.39822578430176
}
23:01:23 - INFO - allennlp.models.archival - loading archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz from cache at /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:23 - INFO - allennlp.models.archival - extracting archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz to temp dir /tmp/tmp_ax2l5vy
23:01:23 - DEBUG - allennlp.common.registrable - instantiating registered subclass atis_parser of <class 'allennlp.models.model.Model'>
23:01:23 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class 'allennlp.data.vocabulary.Vocabulary'>
23:01:23 - INFO - allennlp.data.vocabulary - Loading token dictionary from /tmp/tmp_ax2l5vy/vocabulary.
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:23 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:24 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:24 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:24 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:24 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:24 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
_____________________________________________________________________________________ BiaffineDependencyParserTest.test_mst_decodes_arc_labels_with_respect_to_unconstrained_scores _____________________________________________________________________________________

self = <allennlp.tests.models.biaffine_dependency_parser_test.BiaffineDependencyParserTest testMethod=test_mst_decodes_arc_labels_with_respect_to_unconstrained_scores>

    def test_mst_decodes_arc_labels_with_respect_to_unconstrained_scores(self):
        energy = torch.Tensor([[0, 2, 1],
                               [10, 0, 0.5],
                               [9, 0.2, 0]]).view(1, 1, 3, 3).expand(1, 2, 3, 3).contiguous()
        # Make the score for the root label for arcs to the root token be higher - it
        # will be masked for the MST, but we want to make sure that the tags are with
        # respect to the unmasked tensor. If the masking was incorrect, we would decode all
        # zeros as the labels, because torch takes the first index in the case that all the
        # values are equal, which would be the case if the labels were calculated from
        # the masked score.
        energy[:, 1, 0, :] = 3
        length = torch.LongTensor([3])
        heads, tags = self.model._run_mst_decoding(energy, length) # pylint: disable=protected-access
        assert heads.tolist()[0] == [0, 0, 1]
>       assert tags.tolist()[0] == [0, 1, 0]
E       AssertionError: assert [0, 1, 1] == [0, 1, 0]
E         At index 2 diff: 1 != 0
E         Use -v to get the full diff

allennlp/tests/models/biaffine_dependency_parser_test.py:73: AssertionError
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
4it [00:00, 1511.05it/s]
100%|██████████| 4/4 [00:00<00:00, 28876.45it/s]
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:01:27 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'type': 'universal_dependencies'} and extras {}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.universal_dependencies.UniversalDependenciesDatasetReader'> from params {} and extras {}
23:01:27 - INFO - allennlp.data.dataset_readers.universal_dependencies - Reading UD instances from conllu dataset at: /home/michael/allennlp/allennlp/tests/fixtures/data/dependencies.conllu
23:01:27 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'arc_representation_dim': 3, 'encoder': {'hidden_size': 4, 'input_size': 2, 'num_layers': 1, 'type': 'lstm'}, 'tag_representation_dim': 3, 'text_field_embedder': {'tokens': {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'}}, 'type': 'biaffine_parser'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.biaffine_dependency_parser.BiaffineDependencyParser'> from params {'arc_representation_dim': 3, 'encoder': {'hidden_size': 4, 'input_size': 2, 'num_layers': 1, 'type': 'lstm'}, 'tag_representation_dim': 3, 'text_field_embedder': {'tokens': {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'hidden_size': 4, 'input_size': 2, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - DEBUG - allennlp.common.registrable - instantiating registered subclass elu of <class 'allennlp.nn.activations.Activation'>
23:01:27 - DEBUG - allennlp.common.registrable - instantiating registered subclass linear of <class 'allennlp.nn.activations.Activation'>
23:01:27 - DEBUG - allennlp.common.registrable - instantiating registered subclass elu of <class 'allennlp.nn.activations.Activation'>
23:01:27 - INFO - allennlp.models.biaffine_dependency_parser - Found POS tags correspoding to the following punctuation : {'PUNCT': 3}. Ignoring words with these POS tags for evaluation.
______________________________________________________________________________________________ TestNnUtil.test_sequence_cross_entropy_with_logits_averages_batch_correctly ______________________________________________________________________________________________

self = <allennlp.tests.nn.util_test.TestNnUtil testMethod=test_sequence_cross_entropy_with_logits_averages_batch_correctly>

    def test_sequence_cross_entropy_with_logits_averages_batch_correctly(self):
        # test batch average is the same as dividing the batch averaged
        # loss by the number of batches containing any non-padded tokens.
        tensor = torch.rand([5, 7, 4])
        tensor[0, 3:, :] = 0
        tensor[1, 4:, :] = 0
        tensor[2, 2:, :] = 0
        tensor[3, :, :] = 0
        weights = (tensor != 0.0)[:, :, 0].long().squeeze(-1)
        targets = torch.LongTensor(numpy.random.randint(0, 3, [5, 7]))
        targets *= weights

        loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights)

        vector_loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights, average=None)
        # Batch has one completely padded row, so divide by 4.
>       assert loss.data.numpy() == vector_loss.data.sum() / 4
E       TypeError: eq() received an invalid combination of arguments - got (numpy.ndarray), but expected one of:
E        * (Tensor other)
E             didn't match because some of the arguments have invalid types: (!numpy.ndarray!)
E        * (Number other)
E             didn't match because some of the arguments have invalid types: (!numpy.ndarray!)

allennlp/tests/nn/util_test.py:518: TypeError
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:14 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
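
One way to sidestep this `TypeError` might be to turn both sides into plain Python numbers before comparing, so that neither `numpy.ndarray` nor `Tensor.eq` is handed the other type. A minimal sketch with stand-in tensors (the values are made up for illustration and are not the output of `sequence_cross_entropy_with_logits`):

```python
import torch
from numpy.testing import assert_almost_equal

# Hypothetical per-batch losses; one row is fully padded, so its loss is 0.
vector_loss = torch.tensor([0.5, 1.5, 1.0, 0.0, 1.0])
# 0-dim tensor standing in for the batch-averaged loss.
loss = vector_loss.sum() / 4

# .item() converts a 0-dim tensor to a Python float on both sides,
# so the comparison never mixes numpy arrays with tensors.
assert_almost_equal(loss.item(), vector_loss.sum().item() / 4)
```

Using `assert_almost_equal` rather than `==` also avoids depending on exact floating-point equality.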
______________________________________________________________________________________________ TestNnUtil.test_sequence_cross_entropy_with_logits_averages_token_correctly ______________________________________________________________________________________________

self = <allennlp.tests.nn.util_test.TestNnUtil testMethod=test_sequence_cross_entropy_with_logits_averages_token_correctly>

    def test_sequence_cross_entropy_with_logits_averages_token_correctly(self):
        # test token average is the same as multiplying the per-batch loss
        # with the per-batch weights and dividing by the total weight
        tensor = torch.rand([5, 7, 4])
        tensor[0, 3:, :] = 0
        tensor[1, 4:, :] = 0
        tensor[2, 2:, :] = 0
        tensor[3, :, :] = 0
        weights = (tensor != 0.0)[:, :, 0].long().squeeze(-1)
        targets = torch.LongTensor(numpy.random.randint(0, 3, [5, 7]))
        targets *= weights

        loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights, average="token")

        vector_loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights, batch_average=False)
        total_token_loss = (vector_loss * weights.float().sum(dim=-1)).sum()
        average_token_loss = (total_token_loss / weights.float().sum()).detach()
>       assert_almost_equal(loss.detach()[0], average_token_loss[0])
E       IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

allennlp/tests/nn/util_test.py:537: IndexError
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:14 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
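
The `IndexError` suggests that both losses are now 0-dim tensors, so `[0]` no longer applies and `.item()` is the replacement the message itself recommends. A small sketch of the same arithmetic as the test, with hypothetical losses and weights in place of the real ones:

```python
import torch
from numpy.testing import assert_almost_equal

# Hypothetical per-batch losses and padding mask, for illustration only.
vector_loss = torch.tensor([1.0, 2.0, 3.0])
weights = torch.tensor([[1, 1, 0], [1, 0, 0], [1, 1, 1]])

tokens_per_row = weights.float().sum(dim=-1)            # tensor([2., 1., 3.])
total_token_loss = (vector_loss * tokens_per_row).sum()
average_token_loss = (total_token_loss / weights.float().sum()).detach()

# The result is 0-dim, so read it with .item() instead of indexing with [0].
assert average_token_loss.dim() == 0
assert_almost_equal(average_token_loss.item(), 13.0 / 6.0, decimal=6)
```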
____________________________________________________________________________________________________________________ TestNnUtil.test_viterbi_decode _____________________________________________________________________________________________________________________

self = <allennlp.tests.nn.util_test.TestNnUtil testMethod=test_viterbi_decode>

    def test_viterbi_decode(self):
        # Test Viterbi decoding is equal to greedy decoding with no pairwise potentials.
        sequence_logits = torch.nn.functional.softmax(torch.rand([5, 9]), dim=-1)
        transition_matrix = torch.zeros([9, 9])
        indices, _ = util.viterbi_decode(sequence_logits.data, transition_matrix)
        _, argmax_indices = torch.max(sequence_logits, 1)
        assert indices == argmax_indices.data.squeeze().tolist()

        # Test that pairwise potentials effect the sequence correctly and that
        # viterbi_decode can handle -inf values.
        sequence_logits = torch.FloatTensor([[0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4]])
        # The same tags shouldn't appear sequentially.
        transition_matrix = torch.zeros([5, 5])
        for i in range(5):
            transition_matrix[i, i] = float("-inf")
        indices, _ = util.viterbi_decode(sequence_logits, transition_matrix)
>       assert indices == [4, 3, 4, 3, 4, 3]
E       AssertionError: assert [3, 4, 3, 4, 3, 4] == [4, 3, 4, 3, 4, 3]
E         At index 0 diff: 3 != 4
E         Use -v to get the full diff

allennlp/tests/nn/util_test.py:408: AssertionError
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:14 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
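
The two sequences in this assertion appear to score the same, which would make the expected result depend on how ties are broken rather than on the decoder being wrong. The brute-force check below uses the logits and transition constraint from the test, scored in plain Python instead of `util.viterbi_decode`, and lists every best path:

```python
from itertools import product

# Same per-step scores as the failing test: six steps, five tags.
logits = [[0, 0, 0, 3, 4]] * 6
num_tags = 5

def path_score(path):
    # Transitions cost 0 between different tags and -inf for a repeated tag.
    if any(a == b for a, b in zip(path, path[1:])):
        return float("-inf")
    return sum(logits[step][tag] for step, tag in enumerate(path))

scores = {path: path_score(path) for path in product(range(num_tags), repeat=6)}
best = max(scores.values())
best_paths = sorted(path for path, score in scores.items() if score == best)
print(best, best_paths)
```

If the tie is the cause, one option would be to make the expected path unique, for example by using an odd number of steps or raising one logit; that is a suggestion, not something this PR already does.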
_______________________________________________________________________________________________ TestDenseSparseAdam.test_can_optimise_model_with_dense_and_sparse_params ________________________________________________________________________________________________

self = <allennlp.tests.training.optimizer_test.TestDenseSparseAdam testMethod=test_can_optimise_model_with_dense_and_sparse_params>

    def test_can_optimise_model_with_dense_and_sparse_params(self):
        optimizer_params = Params({
                "type": "dense_sparse_adam"
        })
        parameters = [[n, p] for n, p in self.model.named_parameters() if p.requires_grad]
        optimizer = Optimizer.from_params(parameters, optimizer_params)
        iterator = BasicIterator(2)
        iterator.index_with(self.vocab)
>       Trainer(self.model, optimizer, iterator, self.instances).train()

allennlp/tests/training/optimizer_test.py:107: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
allennlp/training/trainer.py:752: in train
    train_metrics = self._train_epoch(epoch)
allennlp/training/trainer.py:522: in _train_epoch
    self.optimizer.step()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = DenseSparseAdam (
Parameter Group 0
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.001
), closure = None

    def step(self, closure=None):
        """
        Performs a single optimization step.

        Parameters
        ----------
        closure : ``callable``, optional.
            A closure that reevaluates the model and returns the loss.
        """
        loss = None
        if closure is not None:
            loss = closure()

        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                grad = p.grad.data

                state = self.state[p]

                # State initialization
                if len(state) == 0:
                    state['step'] = 0
                    # Exponential moving average of gradient values
                    state['exp_avg'] = torch.zeros_like(p.data)
                    # Exponential moving average of squared gradient values
                    state['exp_avg_sq'] = torch.zeros_like(p.data)

                state['step'] += 1

                exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']
                beta1, beta2 = group['betas']

                if grad.is_sparse:
                    grad = grad.coalesce()  # the update is non-linear so indices must be unique
                    grad_indices = grad._indices()
                    grad_values = grad._values()
                    size = grad.size()

                    def make_sparse(values):
                        constructor = grad.new
                        if grad_indices.dim() == 0 or values.dim() == 0:
                            return constructor().resize_as_(grad)
                        return constructor(grad_indices, values, size)

                    # Decay the first and second moment running average coefficient
                    #      old <- b * old + (1 - b) * new
                    # <==> old += (1 - b) * (new - old)
>                   old_exp_avg_values = exp_avg._sparse_mask(grad)._values()
E                   AttributeError: 'Tensor' object has no attribute '_sparse_mask'

allennlp/training/optimizers.py:224: AttributeError
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
4it [00:00, 1393.92it/s]
100%|██████████| 4/4 [00:00<00:00, 49784.02it/s]
  0%|          | 0/2 [00:00<?, ?it/s]
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:35 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:21:35 - INFO - allennlp.data.dataset_readers.sequence_tagging - Reading instances from lines in file at: /home/michael/allennlp/allennlp/tests/fixtures/data/sequence_tagging.tsv
23:21:35 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.simple_tagger.SimpleTagger'> from params {'text_field_embedder': {'tokens': {'type': 'embedding', 'embedding_dim': 5, 'sparse': True}}, 'encoder': {'type': 'lstm', 'input_size': 5, 'hidden_size': 7, 'num_layers': 2}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'type': 'embedding', 'embedding_dim': 5, 'sparse': True}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'type': 'embedding', 'embedding_dim': 5, 'sparse': True} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'type': 'lstm', 'input_size': 5, 'hidden_size': 7, 'num_layers': 2} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.training.optimizers - Number of trainable parameters: 901
23:21:35 - DEBUG - allennlp.common.registrable - instantiating registered subclass dense_sparse_adam of <class 'allennlp.training.optimizers.Optimizer'>
23:21:35 - INFO - allennlp.training.trainer - Beginning training.
23:21:35 - INFO - allennlp.training.trainer - Epoch 0/19
23:21:35 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 1657.16
23:21:35 - INFO - allennlp.training.trainer - Training
23:21:35 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'tokens': {'num_tokens': 4}, 'tags': {'num_tokens': 4}}
23:21:35 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 2
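
The private `_sparse_mask` seems to be available under the public name `Tensor.sparse_mask` in PyTorch 1.0, so the failing line in `DenseSparseAdam.step` could probably call that instead. A minimal sketch, assuming an embedding-style sparse gradient (the shapes and values are invented for illustration):

```python
import torch

# Dense optimizer state, standing in for state['exp_avg'].
exp_avg = torch.arange(12, dtype=torch.float).reshape(4, 3)

# Hypothetical sparse gradient touching rows 0 and 2 only.
indices = torch.tensor([[0, 2]])
values = torch.ones(2, 3)
grad = torch.sparse_coo_tensor(indices, values, (4, 3)).coalesce()

# Keep only the entries of exp_avg at the gradient's indices.
old_exp_avg_values = exp_avg.sparse_mask(grad).coalesce().values()
print(old_exp_avg_values)
```

The extra `.coalesce()` is only there so the public `.values()` accessor can be used in the sketch; the optimizer itself reads `_values()` directly.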
update AllenNLP for pytorch 1.0 #1803