allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0

update AllenNLP for pytorch 1.0 #1803

Closed by joelgrus 5 years ago

joelgrus commented 6 years ago

Motivation

PyTorch 1.0 should be released soon (at the PyTorch dev conference in early October?).

It's almost certain that it will require changes to the AllenNLP library, just as PyTorch 0.3 -> PyTorch 0.4 did.

This is just a placeholder so that we keep track of the fact that this work will have to be done in Q4.

Success Criteria

A new version (0.7? 1.0?) of AllenNLP that works with PyTorch 1.0.

schmmd commented 6 years ago

See also: https://github.com/allenai/allennlp/issues/1878

schmmd commented 6 years ago

PyTorch 1.0 will be released by early December. Currently a preview is available that is used by Facebook in production.

schmmd commented 5 years ago

Commit 82bbee7 with the PyTorch nightly build:

==== 6 failed, 1068 passed, 18 skipped, 745 warnings in 1314.30 seconds ====

versus (from our CI):

==== 1056 passed, 15 skipped, 21 deselected, 373 warnings in 304.54 seconds ====
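
To compare the two runs outcome by outcome, the summary lines can be parsed with a few lines of Python. This is only a rough sketch: `parse_summary` and `total_tests` are hypothetical helpers written for this comparison, not part of pytest or AllenNLP, and the regex assumes the summary format shown above.

```python
import re

# The two summary lines quoted above, copied verbatim.
NIGHTLY = "==== 6 failed, 1068 passed, 18 skipped, 745 warnings in 1314.30 seconds ===="
CI = "==== 1056 passed, 15 skipped, 21 deselected, 373 warnings in 304.54 seconds ===="


def parse_summary(line):
    """Return {outcome: count} plus the wall-clock time from a pytest summary line."""
    # Each "<number> <outcome>" pair is followed by a comma or by " in <time>".
    counts = {outcome: int(number)
              for number, outcome in re.findall(r"(\d+) (\w+)(?=,| in )", line)}
    seconds = re.search(r"in ([\d.]+) seconds", line)
    counts["seconds"] = float(seconds.group(1)) if seconds else None
    return counts


def total_tests(counts):
    """Sum every test outcome, ignoring the warning count and the timing."""
    return sum(value for key, value in counts.items()
               if key not in ("warnings", "seconds"))


nightly, ci = parse_summary(NIGHTLY), parse_summary(CI)
for key in sorted(set(nightly) | set(ci)):
    print(f"{key:>10}: nightly={nightly.get(key, 0)} ci={ci.get(key, 0)}")
print("total tests:", total_tests(nightly), "vs", total_tests(ci))
```

Both runs account for 1092 tests in total; the nightly run has no deselected tests, while the CI run deselects 21.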

schmmd commented 5 years ago

The main new warning is:

UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
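
For reference, the change this warning asks for can be sketched as below. This is a minimal illustration rather than AllenNLP code: the tensor names (`source`, `old_copy`, `new_copy`, `trainable_copy`) are made up for the example, and it assumes a PyTorch version that has `Tensor.new_tensor`, `clone()`, `detach()` and `requires_grad_()`.

```python
import torch

source = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Old pattern: copy-constructing from an existing tensor.
# This is the call that emits the UserWarning quoted above.
old_copy = source.new_tensor(source)

# Recommended pattern: an independent copy that is cut off from the graph.
new_copy = source.clone().detach()

# Recommended pattern when the copy should itself be trainable.
trainable_copy = source.clone().detach().requires_grad_(True)

print(torch.equal(old_copy, new_copy))   # same values either way
print(new_copy.requires_grad)            # detached copy does not track gradients
print(trainable_copy.requires_grad)      # explicitly re-enabled
```

Neither copy shares memory with `source`, so the swap should be behavior-preserving wherever `new_tensor` was used purely to copy an existing tensor.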

And here are the failures:

allennlp/tests/nn/util_test.py ............................FF...F.... 
allennlp/tests/models/biaffine_dependency_parser_test.py ...F..
allennlp/tests/models/atis_semantic_parser_test.py .F
allennlp/tests/training/optimizer_test.py ..F

=============================================================================================================================== FAILURES ================================================================================================================================
____________________________________________________________________________________________________ AtisSemanticParserTest.test_atis_model_can_train_save_and_load _____________________________________________________________________________________________________

self = <allennlp.tests.models.atis_semantic_parser_test.AtisSemanticParserTest testMethod=test_atis_model_can_train_save_and_load>

    @flaky
    def test_atis_model_can_train_save_and_load(self):
>       self.ensure_model_can_train_save_and_load(self.param_file)

allennlp/tests/models/atis_semantic_parser_test.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
allennlp/common/testing/model_test_case.py:106: in ensure_model_can_train_save_and_load
    self.check_model_computes_gradients_correctly(model, model_batch, gradients_to_ignore)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

model = AtisSemanticParser(
  (_utterance_embedder): BasicTextFieldEmbedder(
    (token_embedder_tokens): Embedding()
  )
  (_...s=160, out_features=10, bias=True)
    (_decoder_cell): LSTM(80, 80, num_layers=2)
    (_dropout): Dropout(p=0.5)
  )
)
model_batch = {'actions': [[ProductionRule(rule='agg -> [agg_func, "(", col, ")"]', is_global_rule=True, rule_id=tensor([0]), nonter...    [[1174],
         [1103],
         [ 769],
         ...,
         [ 302],
         [ 335],
         [ 432]]]), ...}
params_to_ignore = None

    @staticmethod
    def check_model_computes_gradients_correctly(model: Model,
                                                 model_batch: Dict[str, Union[Any, Dict[str, Any]]],
                                                 params_to_ignore: Set[str] = None):
        print("Checking gradients")
        model.zero_grad()
        result = model(**model_batch)
        result["loss"].backward()
        has_zero_or_none_grads = {}
        for name, parameter in model.named_parameters():
            zeros = torch.zeros(parameter.size())
            if params_to_ignore and name in params_to_ignore:
                continue
            if parameter.requires_grad:

                if parameter.grad is None:
                    has_zero_or_none_grads[name] = "No gradient computed (i.e parameter.grad is None)"

                elif parameter.grad.is_sparse or parameter.grad.data.is_sparse:
                    pass

                # Some parameters will only be partially updated,
                # like embeddings, so we just check that any gradient is non-zero.
                elif (parameter.grad.cpu() == zeros).all():
                    has_zero_or_none_grads[name] = f"zeros with shape ({tuple(parameter.grad.size())})"
            else:
                assert parameter.grad is None

        if has_zero_or_none_grads:
            for name, grad in has_zero_or_none_grads.items():
                print(f"Parameter: {name} had incorrect gradient: {grad}")
>           raise Exception("Incorrect gradients found. See stdout for more info.")
E           Exception: Incorrect gradients found. See stdout for more info.

allennlp/common/testing/model_test_case.py:198: Exception
------------------------------------------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------------------------------------------
Checking gradients
Parameter: _entity_type_decoder_embedding.weight had incorrect gradient: No gradient computed (i.e parameter.grad is None)
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
13it [00:00, 26.42it/s]
100%|██████████| 13/13 [00:00<00:00, 2905.11it/s]
0it [00:00, ?it/s]
13it [00:00, 25.48it/s]

0it [00:00, ?it/s]
13it [00:00, 38.07it/s]

0it [00:00, ?it/s]
26it [00:00, 3036.30it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 210.5952 ||: 100%|##########| 4/4 [00:03<00:00,  1.04it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 156.3982 ||: 100%|##########| 4/4 [00:01<00:00,  2.31it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 160.4649 ||: 100%|##########| 4/4 [00:03<00:00,  1.06it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 164.9378 ||: 100%|##########| 4/4 [00:01<00:00,  2.30it/s]

0it [00:00, ?it/s]
13it [00:00, 25.30it/s]

0it [00:00, ?it/s]
13it [00:00, 25.96it/s]

--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:00:53 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:00:53 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:00:53 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:00:53 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : /home/michael/allennlp/allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:00:54 - INFO - allennlp.commands.train - Using a separate dataset reader to load validation and test data.
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'keep_if_unparseable': True, 'type': 'atis'} and extras {}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'keep_if_unparseable': True} and extras {}
23:00:54 - INFO - allennlp.commands.train - Reading training data from allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - allennlp.commands.train - Reading validation data from allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:00:55 - INFO - allennlp.commands.train - From dataset instances, train, validation will be considered for vocabulary creation.
23:00:55 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:00:55 - INFO - allennlp.commands.train - Following parameters are Frozen  (without gradient):
23:00:55 - INFO - allennlp.commands.train - Following parameters are Tunable (with gradient):
23:00:55 - INFO - allennlp.commands.train - _first_action_embedding
23:00:55 - INFO - allennlp.commands.train - _first_attended_utterance
23:00:55 - INFO - allennlp.commands.train - _utterance_embedder.token_embedder_tokens.weight
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _action_embedder.weight
23:00:55 - INFO - allennlp.commands.train - _output_action_embedder.weight
23:00:55 - INFO - allennlp.commands.train - _entity_type_decoder_embedding.weight
23:00:55 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.weight
23:00:55 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.bias
23:00:55 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.weight
23:00:55 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.bias
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l1
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l1
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l1
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l1
23:00:55 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class 'allennlp.training.trainer.Trainer'>
23:00:55 - INFO - allennlp.training.optimizers - Number of trainable parameters: 139600
23:00:55 - DEBUG - allennlp.common.registrable - instantiating registered subclass sgd of <class 'allennlp.training.optimizers.Optimizer'>
23:00:55 - INFO - allennlp.training.trainer - Beginning training.
23:00:55 - INFO - allennlp.training.trainer - Epoch 0/1
23:00:55 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 601.524
23:00:55 - INFO - allennlp.training.trainer - Training
23:00:55 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:00:55 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:00:56 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 110}}
23:00:56 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:00:57 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:00:57 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:00:58 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:00:58 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:00:59 - INFO - allennlp.training.trainer - Validating
23:00:59 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:00:59 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:00:59 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:00:59 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:00 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:00 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:00 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:00 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:01 - INFO - allennlp.training.trainer -                       Training |  Validation
23:01:01 - INFO - allennlp.training.trainer - denotation_acc    |     0.000  |     0.000
23:01:01 - INFO - allennlp.training.trainer - valid_sql_query   |     0.000  |     0.000
23:01:01 - INFO - allennlp.training.trainer - loss              |   210.595  |   156.398
23:01:01 - INFO - allennlp.training.trainer - action_similarity |     0.000  |     0.000
23:01:01 - INFO - allennlp.training.trainer - exact_match       |     0.000  |     0.000
23:01:01 - INFO - allennlp.training.trainer - Best validation performance so far. Copying weights to '/tmp/allennlp_testsjckp_bka/save_and_load_test/best.th'.
23:01:01 - INFO - allennlp.training.trainer - Epoch duration: 00:00:05
23:01:01 - INFO - allennlp.training.trainer - Estimated training time remaining: 0:00:05
23:01:01 - INFO - allennlp.training.trainer - Epoch 1/1
23:01:01 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 638.16
23:01:01 - INFO - allennlp.training.trainer - Training
23:01:01 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 202}}
23:01:01 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:02 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:01:02 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:03 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:03 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:04 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:04 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:04 - INFO - allennlp.training.trainer - Validating
23:01:04 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:04 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:05 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:05 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:06 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:06 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:06 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:06 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:06 - INFO - allennlp.training.trainer -                       Training |  Validation
23:01:06 - INFO - allennlp.training.trainer - denotation_acc    |     0.000  |     0.000
23:01:06 - INFO - allennlp.training.trainer - valid_sql_query   |     0.000  |     0.000
23:01:06 - INFO - allennlp.training.trainer - loss              |   160.465  |   164.938
23:01:06 - INFO - allennlp.training.trainer - action_similarity |     0.000  |     0.000
23:01:06 - INFO - allennlp.training.trainer - exact_match       |     0.000  |     0.000
23:01:06 - INFO - allennlp.training.trainer - Epoch duration: 00:00:05
23:01:06 - INFO - allennlp.models.archival - archiving weights and vocabulary to /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:06 - INFO - allennlp.commands.train - Loading the best epoch weights.
23:01:06 - INFO - allennlp.common.util - Metrics: {
  "training_duration": "00:00:11",
  "training_start_epoch": 0,
  "training_epochs": 1,
  "epoch": 1,
  "training_exact_match": 0,
  "training_denotation_acc": 0,
  "training_valid_sql_query": 0,
  "training_action_similarity": 0,
  "training_loss": 160.4649143218994,
  "validation_exact_match": 0.0,
  "validation_denotation_acc": 0.0,
  "validation_valid_sql_query": 0.0,
  "validation_action_similarity": 0.0,
  "validation_loss": 164.93781661987305,
  "best_epoch": 0,
  "best_validation_exact_match": 0.0,
  "best_validation_denotation_acc": 0.0,
  "best_validation_valid_sql_query": 0.0,
  "best_validation_action_similarity": 0.0,
  "best_validation_loss": 156.3982391357422
}
23:01:06 - INFO - allennlp.models.archival - loading archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz from cache at /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:06 - INFO - allennlp.models.archival - extracting archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz to temp dir /tmp/tmp2o3wao_0
23:01:06 - DEBUG - allennlp.common.registrable - instantiating registered subclass atis_parser of <class 'allennlp.models.model.Model'>
23:01:06 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class 'allennlp.data.vocabulary.Vocabulary'>
23:01:06 - INFO - allennlp.data.vocabulary - Loading token dictionary from /tmp/tmp2o3wao_0/vocabulary.
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e2a4978>}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:07 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:07 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:07 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:07 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:08 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:08 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
------------------------------------------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------------------------------------------
Checking gradients
Parameter: _entity_type_decoder_embedding.weight had incorrect gradient: No gradient computed (i.e parameter.grad is None)
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
13it [00:00, 36.74it/s]
100%|██████████| 13/13 [00:00<00:00, 2902.32it/s]
0it [00:00, ?it/s]
13it [00:00, 23.84it/s]

0it [00:00, ?it/s]
13it [00:00, 37.47it/s]

0it [00:00, ?it/s]
26it [00:00, 2951.98it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 210.5952 ||: 100%|##########| 4/4 [00:03<00:00,  1.07it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 156.3982 ||: 100%|##########| 4/4 [00:01<00:00,  2.09it/s]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 160.4649 ||: 100%|##########| 4/4 [00:04<00:00,  1.04s/it]

  0%|          | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 164.9378 ||: 100%|##########| 4/4 [00:01<00:00,  2.09it/s]

0it [00:00, ?it/s]
13it [00:00, 39.00it/s]

0it [00:00, ?it/s]
13it [00:00, 24.27it/s]

--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:01:09 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:01:09 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:01:09 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:01:09 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : /home/michael/allennlp/allennlp/tests/fixtures/data/atis/sample.json
23:01:10 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:01:10 - INFO - allennlp.commands.train - Using a separate dataset reader to load validation and test data.
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'keep_if_unparseable': True, 'type': 'atis'} and extras {}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'keep_if_unparseable': True} and extras {}
23:01:10 - INFO - allennlp.commands.train - Reading training data from allennlp/tests/fixtures/data/atis/sample.json
23:01:10 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:10 - INFO - allennlp.commands.train - Reading validation data from allennlp/tests/fixtures/data/atis/sample.json
23:01:11 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:11 - INFO - allennlp.commands.train - From dataset instances, train, validation will be considered for vocabulary creation.
23:01:11 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:11 - INFO - allennlp.commands.train - Following parameters are Frozen  (without gradient):
23:01:11 - INFO - allennlp.commands.train - Following parameters are Tunable (with gradient):
23:01:11 - INFO - allennlp.commands.train - _first_action_embedding
23:01:11 - INFO - allennlp.commands.train - _first_attended_utterance
23:01:11 - INFO - allennlp.commands.train - _utterance_embedder.token_embedder_tokens.weight
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _action_embedder.weight
23:01:11 - INFO - allennlp.commands.train - _output_action_embedder.weight
23:01:11 - INFO - allennlp.commands.train - _entity_type_decoder_embedding.weight
23:01:11 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.weight
23:01:11 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.bias
23:01:11 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.weight
23:01:11 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.bias
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l1
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l1
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l1
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l1
23:01:11 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class 'allennlp.training.trainer.Trainer'>
23:01:11 - INFO - allennlp.training.optimizers - Number of trainable parameters: 139600
23:01:11 - DEBUG - allennlp.common.registrable - instantiating registered subclass sgd of <class 'allennlp.training.optimizers.Optimizer'>
23:01:11 - INFO - allennlp.training.trainer - Beginning training.
23:01:11 - INFO - allennlp.training.trainer - Epoch 0/1
23:01:11 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 684.044
23:01:11 - INFO - allennlp.training.trainer - Training
23:01:11 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:01:11 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:13 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 110}}
23:01:13 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:13 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:13 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:14 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:14 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:15 - INFO - allennlp.training.trainer - Validating
23:01:15 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:15 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:16 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:16 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:16 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:16 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:17 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:17 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:17 - INFO - allennlp.training.trainer -                       Training |  Validation
23:01:17 - INFO - allennlp.training.trainer - denotation_acc    |     0.000  |     0.000
23:01:17 - INFO - allennlp.training.trainer - valid_sql_query   |     0.000  |     0.000
23:01:17 - INFO - allennlp.training.trainer - loss              |   210.595  |   156.398
23:01:17 - INFO - allennlp.training.trainer - action_similarity |     0.000  |     0.000
23:01:17 - INFO - allennlp.training.trainer - exact_match       |     0.000  |     0.000
23:01:17 - INFO - allennlp.training.trainer - Best validation performance so far. Copying weights to '/tmp/allennlp_testsjckp_bka/save_and_load_test/best.th'.
23:01:17 - INFO - allennlp.training.trainer - Epoch duration: 00:00:05
23:01:17 - INFO - allennlp.training.trainer - Estimated training time remaining: 0:00:05
23:01:17 - INFO - allennlp.training.trainer - Epoch 1/1
23:01:17 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 698.0
23:01:17 - INFO - allennlp.training.trainer - Training
23:01:17 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 202}}
23:01:17 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:18 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:01:18 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:20 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:20 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:21 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:21 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:21 - INFO - allennlp.training.trainer - Validating
23:01:21 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:21 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:22 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:22 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:22 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:22 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:23 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:23 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 1
23:01:23 - INFO - allennlp.training.trainer -                       Training |  Validation
23:01:23 - INFO - allennlp.training.trainer - denotation_acc    |     0.000  |     0.000
23:01:23 - INFO - allennlp.training.trainer - valid_sql_query   |     0.000  |     0.000
23:01:23 - INFO - allennlp.training.trainer - loss              |   160.465  |   164.938
23:01:23 - INFO - allennlp.training.trainer - action_similarity |     0.000  |     0.000
23:01:23 - INFO - allennlp.training.trainer - exact_match       |     0.000  |     0.000
23:01:23 - INFO - allennlp.training.trainer - Epoch duration: 00:00:06
23:01:23 - INFO - allennlp.models.archival - archiving weights and vocabulary to /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:23 - INFO - allennlp.commands.train - Loading the best epoch weights.
23:01:23 - INFO - allennlp.common.util - Metrics: {
  "training_duration": "00:00:11",
  "training_start_epoch": 0,
  "training_epochs": 1,
  "epoch": 1,
  "training_exact_match": 0,
  "training_denotation_acc": 0,
  "training_valid_sql_query": 0,
  "training_action_similarity": 0,
  "training_loss": 160.4649257659912,
  "validation_exact_match": 0.0,
  "validation_denotation_acc": 0.0,
  "validation_valid_sql_query": 0.0,
  "validation_action_similarity": 0.0,
  "validation_loss": 164.93781661987305,
  "best_epoch": 0,
  "best_validation_exact_match": 0.0,
  "best_validation_denotation_acc": 0.0,
  "best_validation_valid_sql_query": 0.0,
  "best_validation_action_similarity": 0.0,
  "best_validation_loss": 156.39822578430176
}
23:01:23 - INFO - allennlp.models.archival - loading archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz from cache at /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:23 - INFO - allennlp.models.archival - extracting archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz to temp dir /tmp/tmp_ax2l5vy
23:01:23 - DEBUG - allennlp.common.registrable - instantiating registered subclass atis_parser of <class 'allennlp.models.model.Model'>
23:01:23 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class 'allennlp.data.vocabulary.Vocabulary'>
23:01:23 - INFO - allennlp.data.vocabulary - Loading token dictionary from /tmp/tmp_ax2l5vy/vocabulary.
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db', 'type': 'atis'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.semantic_parsing.atis.AtisDatasetReader'> from params {'database_file': 'https://s3-us-west-2.amazonaws.com/allennlp/datasets/atis/atis.db'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.basic_iterator.BasicIterator'> from params {'batch_size': 4} and extras {}
23:01:23 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:24 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:24 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
23:01:24 - INFO - allennlp.data.dataset_readers.semantic_parsing.atis - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:24 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:24 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 4
_____________________________________________________________________________________ BiaffineDependencyParserTest.test_mst_decodes_arc_labels_with_respect_to_unconstrained_scores _____________________________________________________________________________________

self = <allennlp.tests.models.biaffine_dependency_parser_test.BiaffineDependencyParserTest testMethod=test_mst_decodes_arc_labels_with_respect_to_unconstrained_scores>

    def test_mst_decodes_arc_labels_with_respect_to_unconstrained_scores(self):
        energy = torch.Tensor([[0, 2, 1],
                               [10, 0, 0.5],
                               [9, 0.2, 0]]).view(1, 1, 3, 3).expand(1, 2, 3, 3).contiguous()
        # Make the score for the root label for arcs to the root token be higher - it
        # will be masked for the MST, but we want to make sure that the tags are with
        # respect to the unmasked tensor. If the masking was incorrect, we would decode all
        # zeros as the labels, because torch takes the first index in the case that all the
        # values are equal, which would be the case if the labels were calculated from
        # the masked score.
        energy[:, 1, 0, :] = 3
        length = torch.LongTensor([3])
        heads, tags = self.model._run_mst_decoding(energy, length) # pylint: disable=protected-access
        assert heads.tolist()[0] == [0, 0, 1]
>       assert tags.tolist()[0] == [0, 1, 0]
E       AssertionError: assert [0, 1, 1] == [0, 1, 0]
E         At index 2 diff: 1 != 0
E         Use -v to get the full diff

allennlp/tests/models/biaffine_dependency_parser_test.py:73: AssertionError
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
4it [00:00, 1511.05it/s]
100%|██████████| 4/4 [00:00<00:00, 28876.45it/s]
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:01:27 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'type': 'universal_dependencies'} and extras {}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.universal_dependencies.UniversalDependenciesDatasetReader'> from params {} and extras {}
23:01:27 - INFO - allennlp.data.dataset_readers.universal_dependencies - Reading UD instances from conllu dataset at: /home/michael/allennlp/allennlp/tests/fixtures/data/dependencies.conllu
23:01:27 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'arc_representation_dim': 3, 'encoder': {'hidden_size': 4, 'input_size': 2, 'num_layers': 1, 'type': 'lstm'}, 'tag_representation_dim': 3, 'text_field_embedder': {'tokens': {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'}}, 'type': 'biaffine_parser'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.biaffine_dependency_parser.BiaffineDependencyParser'> from params {'arc_representation_dim': 3, 'encoder': {'hidden_size': 4, 'input_size': 2, 'num_layers': 1, 'type': 'lstm'}, 'tag_representation_dim': 3, 'text_field_embedder': {'tokens': {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'hidden_size': 4, 'input_size': 2, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f6e342b70>}
23:01:27 - DEBUG - allennlp.common.registrable - instantiating registered subclass elu of <class 'allennlp.nn.activations.Activation'>
23:01:27 - DEBUG - allennlp.common.registrable - instantiating registered subclass linear of <class 'allennlp.nn.activations.Activation'>
23:01:27 - DEBUG - allennlp.common.registrable - instantiating registered subclass elu of <class 'allennlp.nn.activations.Activation'>
23:01:27 - INFO - allennlp.models.biaffine_dependency_parser - Found POS tags correspoding to the following punctuation : {'PUNCT': 3}. Ignoring words with these POS tags for evaluation.
______________________________________________________________________________________________ TestNnUtil.test_sequence_cross_entropy_with_logits_averages_batch_correctly ______________________________________________________________________________________________

self = <allennlp.tests.nn.util_test.TestNnUtil testMethod=test_sequence_cross_entropy_with_logits_averages_batch_correctly>

    def test_sequence_cross_entropy_with_logits_averages_batch_correctly(self):
        # test batch average is the same as dividing the batch averaged
        # loss by the number of batches containing any non-padded tokens.
        tensor = torch.rand([5, 7, 4])
        tensor[0, 3:, :] = 0
        tensor[1, 4:, :] = 0
        tensor[2, 2:, :] = 0
        tensor[3, :, :] = 0
        weights = (tensor != 0.0)[:, :, 0].long().squeeze(-1)
        targets = torch.LongTensor(numpy.random.randint(0, 3, [5, 7]))
        targets *= weights

        loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights)

        vector_loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights, average=None)
        # Batch has one completely padded row, so divide by 4.
>       assert loss.data.numpy() == vector_loss.data.sum() / 4
E       TypeError: eq() received an invalid combination of arguments - got (numpy.ndarray), but expected one of:
E        * (Tensor other)
E             didn't match because some of the arguments have invalid types: (!numpy.ndarray!)
E        * (Number other)
E             didn't match because some of the arguments have invalid types: (!numpy.ndarray!)

allennlp/tests/nn/util_test.py:518: TypeError
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:14 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
______________________________________________________________________________________________ TestNnUtil.test_sequence_cross_entropy_with_logits_averages_token_correctly ______________________________________________________________________________________________

self = <allennlp.tests.nn.util_test.TestNnUtil testMethod=test_sequence_cross_entropy_with_logits_averages_token_correctly>

    def test_sequence_cross_entropy_with_logits_averages_token_correctly(self):
        # test token average is the same as multiplying the per-batch loss
        # with the per-batch weights and dividing by the total weight
        tensor = torch.rand([5, 7, 4])
        tensor[0, 3:, :] = 0
        tensor[1, 4:, :] = 0
        tensor[2, 2:, :] = 0
        tensor[3, :, :] = 0
        weights = (tensor != 0.0)[:, :, 0].long().squeeze(-1)
        targets = torch.LongTensor(numpy.random.randint(0, 3, [5, 7]))
        targets *= weights

        loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights, average="token")

        vector_loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights, batch_average=False)
        total_token_loss = (vector_loss * weights.float().sum(dim=-1)).sum()
        average_token_loss = (total_token_loss / weights.float().sum()).detach()
>       assert_almost_equal(loss.detach()[0], average_token_loss[0])
E       IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

allennlp/tests/nn/util_test.py:537: IndexError
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:14 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
____________________________________________________________________________________________________________________ TestNnUtil.test_viterbi_decode _____________________________________________________________________________________________________________________

self = <allennlp.tests.nn.util_test.TestNnUtil testMethod=test_viterbi_decode>

    def test_viterbi_decode(self):
        # Test Viterbi decoding is equal to greedy decoding with no pairwise potentials.
        sequence_logits = torch.nn.functional.softmax(torch.rand([5, 9]), dim=-1)
        transition_matrix = torch.zeros([9, 9])
        indices, _ = util.viterbi_decode(sequence_logits.data, transition_matrix)
        _, argmax_indices = torch.max(sequence_logits, 1)
        assert indices == argmax_indices.data.squeeze().tolist()

        # Test that pairwise potentials effect the sequence correctly and that
        # viterbi_decode can handle -inf values.
        sequence_logits = torch.FloatTensor([[0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4],
                                             [0, 0, 0, 3, 4]])
        # The same tags shouldn't appear sequentially.
        transition_matrix = torch.zeros([5, 5])
        for i in range(5):
            transition_matrix[i, i] = float("-inf")
        indices, _ = util.viterbi_decode(sequence_logits, transition_matrix)
>       assert indices == [4, 3, 4, 3, 4, 3]
E       AssertionError: assert [3, 4, 3, 4, 3, 4] == [4, 3, 4, 3, 4, 3]
E         At index 0 diff: 3 != 4
E         Use -v to get the full diff

allennlp/tests/nn/util_test.py:408: AssertionError
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:14 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
_______________________________________________________________________________________________ TestDenseSparseAdam.test_can_optimise_model_with_dense_and_sparse_params ________________________________________________________________________________________________

self = <allennlp.tests.training.optimizer_test.TestDenseSparseAdam testMethod=test_can_optimise_model_with_dense_and_sparse_params>

    def test_can_optimise_model_with_dense_and_sparse_params(self):
        optimizer_params = Params({
                "type": "dense_sparse_adam"
        })
        parameters = [[n, p] for n, p in self.model.named_parameters() if p.requires_grad]
        optimizer = Optimizer.from_params(parameters, optimizer_params)
        iterator = BasicIterator(2)
        iterator.index_with(self.vocab)
>       Trainer(self.model, optimizer, iterator, self.instances).train()

allennlp/tests/training/optimizer_test.py:107: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
allennlp/training/trainer.py:752: in train
    train_metrics = self._train_epoch(epoch)
allennlp/training/trainer.py:522: in _train_epoch
    self.optimizer.step()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = DenseSparseAdam (
Parameter Group 0
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.001
), closure = None

    def step(self, closure=None):
        """
        Performs a single optimization step.

        Parameters
        ----------
        closure : ``callable``, optional.
            A closure that reevaluates the model and returns the loss.
        """
        loss = None
        if closure is not None:
            loss = closure()

        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                grad = p.grad.data

                state = self.state[p]

                # State initialization
                if len(state) == 0:
                    state['step'] = 0
                    # Exponential moving average of gradient values
                    state['exp_avg'] = torch.zeros_like(p.data)
                    # Exponential moving average of squared gradient values
                    state['exp_avg_sq'] = torch.zeros_like(p.data)

                state['step'] += 1

                exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']
                beta1, beta2 = group['betas']

                if grad.is_sparse:
                    grad = grad.coalesce()  # the update is non-linear so indices must be unique
                    grad_indices = grad._indices()
                    grad_values = grad._values()
                    size = grad.size()

                    def make_sparse(values):
                        constructor = grad.new
                        if grad_indices.dim() == 0 or values.dim() == 0:
                            return constructor().resize_as_(grad)
                        return constructor(grad_indices, values, size)

                    # Decay the first and second moment running average coefficient
                    #      old <- b * old + (1 - b) * new
                    # <==> old += (1 - b) * (new - old)
>                   old_exp_avg_values = exp_avg._sparse_mask(grad)._values()
E                   AttributeError: 'Tensor' object has no attribute '_sparse_mask'

allennlp/training/optimizers.py:224: AttributeError
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
4it [00:00, 1393.92it/s]
100%|██████████| 4/4 [00:00<00:00, 49784.02it/s]
  0%|          | 0/2 [00:00<?, ?it/s]
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:35 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:21:35 - INFO - allennlp.data.dataset_readers.sequence_tagging - Reading instances from lines in file at: /home/michael/allennlp/allennlp/tests/fixtures/data/sequence_tagging.tsv
23:21:35 - INFO - allennlp.data.vocabulary - Fitting token dictionary from dataset.
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.simple_tagger.SimpleTagger'> from params {'text_field_embedder': {'tokens': {'type': 'embedding', 'embedding_dim': 5, 'sparse': True}}, 'encoder': {'type': 'lstm', 'input_size': 5, 'hidden_size': 7, 'num_layers': 2}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'type': 'embedding', 'embedding_dim': 5, 'sparse': True}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'type': 'embedding', 'embedding_dim': 5, 'sparse': True} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'type': 'lstm', 'input_size': 5, 'hidden_size': 7, 'num_layers': 2} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.training.optimizers - Number of trainable parameters: 901
23:21:35 - DEBUG - allennlp.common.registrable - instantiating registered subclass dense_sparse_adam of <class 'allennlp.training.optimizers.Optimizer'>
23:21:35 - INFO - allennlp.training.trainer - Beginning training.
23:21:35 - INFO - allennlp.training.trainer - Epoch 0/19
23:21:35 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 1657.16
23:21:35 - INFO - allennlp.training.trainer - Training
23:21:35 - DEBUG - allennlp.data.iterators.data_iterator - Batch padding lengths: {'tokens': {'num_tokens': 4}, 'tags': {'num_tokens': 4}}
23:21:35 - DEBUG - allennlp.data.iterators.data_iterator - Batch size: 2
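
A note on the `test_viterbi_decode` failure above: the two sequences in the assertion look like they score the same, so the result may depend only on how ties are broken inside `max`. The sketch below is a brute-force check in plain Python, not AllenNLP's `viterbi_decode`. It assumes the path score is the sum of the per-step tag scores, with the `-inf` diagonal of the transition matrix forbidding repeated adjacent tags. `step_scores`, `path_score` and `optimal_paths` are names made up for this illustration.

```python
from itertools import product

# Per-step tag scores copied from the failing test: six steps, five tags.
step_scores = [[0, 0, 0, 3, 4]] * 6


def path_score(path):
    """Sum of tag scores, or -inf if the same tag appears on adjacent steps."""
    if any(a == b for a, b in zip(path, path[1:])):
        return float("-inf")
    return sum(step_scores[step][tag] for step, tag in enumerate(path))


# Enumerate all 5**6 tag sequences and keep every one that reaches the best score.
all_paths = list(product(range(5), repeat=6))
best = max(path_score(p) for p in all_paths)
optimal_paths = sorted(list(p) for p in all_paths if path_score(p) == best)

print(best, optimal_paths)
```

If both `[4, 3, 4, 3, 4, 3]` and `[3, 4, 3, 4, 3, 4]` come out as optimal, the test is asserting on a tie, and either answer is a valid decoding. The `BiaffineDependencyParserTest` failure may be a similar case, since its own comment relies on torch returning the first index when all values are equal.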
schmmd commented 5 years ago

Fixed in https://github.com/allenai/allennlp/pull/2165
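
For anyone hitting the same `IndexError: invalid index of a 0-dim tensor` or the `eq()` `TypeError` in their own tests, here is a minimal sketch of the usual workaround. It is not taken from the PR above; `loss` and `expected` are stand-in values for illustration. The idea is to convert 0-dim tensors with `.item()` before indexing or comparing, as the error message itself suggests.

```python
import math

import torch

# A reduced loss is a 0-dim tensor, so loss[0] raises IndexError.
loss = torch.tensor([1.0, 4.0]).mean()
expected = torch.tensor([5.0, 5.0]).sum() / 4

assert loss.dim() == 0

# .item() converts a 0-dim tensor to a plain Python number.
value = loss.item()
expected_value = expected.item()

# Compare Python floats rather than mixing numpy arrays and tensors with ==.
assert math.isclose(value, expected_value)
print(value)
```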