Closed by joelgrus 5 years ago
PyTorch 1.0 is due to be released by early December. A preview is currently available, and Facebook already uses it in production.
Test suite at 82bbee7 run against the PyTorch nightly (1.0.0.dev20181108):
==== 6 failed, 1068 passed, 18 skipped, 745 warnings in 1314.30 seconds ====
versus the summary from our CI:
==== 1056 passed, 15 skipped, 21 deselected, 373 warnings in 304.54 seconds ====
The main new warning is:
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than tensor.new_tensor(sourceTensor).
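For anyone cleaning these warnings up, here is a minimal sketch of the two patterns the warning recommends. The `copy_tensor` helper and the example values are hypothetical, made up for illustration; they are not taken from the AllenNLP code base.

```python
import torch


def copy_tensor(source: torch.Tensor, trainable: bool = False) -> torch.Tensor:
    """Hypothetical helper: copy `source` the way the warning recommends."""
    # clone() allocates new storage, detach() cuts the copy off from autograd.
    copied = source.clone().detach()
    if trainable:
        # Turn the copy into a leaf tensor that tracks its own gradients.
        copied.requires_grad_(True)
    return copied


source = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

plain_copy = copy_tensor(source)
trainable_copy = copy_tensor(source, trainable=True)

# Writing to the copy must not change the original values.
plain_copy[0] = 100.0
```

Both copies own their storage, so the in-place write at the end leaves `source` unchanged. Call sites that currently go through `tensor.new_tensor(sourceTensor)` could probably switch to one of these two forms, depending on whether the copy needs gradients.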
And here are the failures:
allennlp/tests/nn/ ............................FF...F....
allennlp/tests/models/ ...F..
allennlp/tests/models/ .F
allennlp/tests/training/ ..F
=============================================================================================================================== FAILURES ================================================================================================================================
____________________________________________________________________________________________________ AtisSemanticParserTest.test_atis_model_can_train_save_and_load _____________________________________________________________________________________________________
self = <allennlp.tests.models.atis_semantic_parser_test.AtisSemanticParserTest testMethod=test_atis_model_can_train_save_and_load>
def test_atis_model_can_train_save_and_load(self):
> self.ensure_model_can_train_save_and_load(self.param_file)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
allennlp/common/testing/ in ensure_model_can_train_save_and_load
self.check_model_computes_gradients_correctly(model, model_batch, gradients_to_ignore)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
model = AtisSemanticParser(
(_utterance_embedder): BasicTextFieldEmbedder(
(token_embedder_tokens): Embedding()
(_...s=160, out_features=10, bias=True)
(_decoder_cell): LSTM(80, 80, num_layers=2)
(_dropout): Dropout(p=0.5)
model_batch = {'actions': [[ProductionRule(rule='agg -> [agg_func, "(", col, ")"]', is_global_rule=True, rule_id=tensor([0]), nonter... [[1174],
[ 769],
[ 302],
[ 335],
[ 432]]]), ...}
params_to_ignore = None
def check_model_computes_gradients_correctly(model: Model,
model_batch: Dict[str, Union[Any, Dict[str, Any]]],
params_to_ignore: Set[str] = None):
print("Checking gradients")
result = model(**model_batch)
has_zero_or_none_grads = {}
for name, parameter in model.named_parameters():
zeros = torch.zeros(parameter.size())
if params_to_ignore and name in params_to_ignore:
if parameter.requires_grad:
if parameter.grad is None:
has_zero_or_none_grads[name] = "No gradient computed (i.e parameter.grad is None)"
elif parameter.grad.is_sparse or
# Some parameters will only be partially updated,
# like embeddings, so we just check that any gradient is non-zero.
elif (parameter.grad.cpu() == zeros).all():
has_zero_or_none_grads[name] = f"zeros with shape ({tuple(parameter.grad.size())})"
assert parameter.grad is None
if has_zero_or_none_grads:
for name, grad in has_zero_or_none_grads.items():
print(f"Parameter: {name} had incorrect gradient: {grad}")
> raise Exception("Incorrect gradients found. See stdout for more info.")
E Exception: Incorrect gradients found. See stdout for more info.
allennlp/common/testing/ Exception
------------------------------------------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------------------------------------------
Checking gradients
Parameter: _entity_type_decoder_embedding.weight had incorrect gradient: No gradient computed (i.e parameter.grad is None)
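A note on the symptom above: `parameter.grad is None` after `backward()` usually means the parameter never took part in the graph that produced the loss. The sketch below reproduces only that symptom with a hypothetical toy model (`TinyModel`, `params_without_grad` and the sizes are all made up). It is not a diagnosis of why `_entity_type_decoder_embedding.weight` gets no gradient under the nightly.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical model with two embeddings; forward() only uses one."""

    def __init__(self):
        super().__init__()
        self.used = nn.Embedding(5, 3)
        self.unused = nn.Embedding(5, 3)

    def forward(self, ids):
        # `self.unused` never contributes to the returned loss.
        return self.used(ids).sum()


def params_without_grad(model, loss):
    """Backpropagate `loss`, then list trainable parameters whose .grad is None."""
    model.zero_grad()
    loss.backward()
    return [name for name, parameter in model.named_parameters()
            if parameter.requires_grad and parameter.grad is None]


model = TinyModel()
missing = params_without_grad(model, model(torch.tensor([0, 1, 2])))
print(missing)  # ['unused.weight']
```

If the ATIS failure has the same shape, the thing to look for would be a code path where the embedding's output is dropped or replaced before it reaches the loss, but that is a guess until someone steps through the model on the nightly.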
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
13it [00:00, 26.42it/s]
100%|██████████| 13/13 [00:00<00:00, 2905.11it/s]
0it [00:00, ?it/s]
13it [00:00, 25.48it/s]
0it [00:00, ?it/s]
13it [00:00, 38.07it/s]
0it [00:00, ?it/s]
26it [00:00, 3036.30it/s]
0%| | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 210.5952 ||: 100%|##########| 4/4 [00:03<00:00, 1.04it/s]
0%| | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 156.3982 ||: 100%|##########| 4/4 [00:01<00:00, 2.31it/s]
0%| | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 160.4649 ||: 100%|##########| 4/4 [00:03<00:00, 1.06it/s]
0%| | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 164.9378 ||: 100%|##########| 4/4 [00:01<00:00, 2.30it/s]
0it [00:00, ?it/s]
13it [00:00, 25.30it/s]
0it [00:00, ?it/s]
13it [00:00, 25.96it/s]
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:00:53 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:00:53 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'type': 'atis'} and extras {}
23:00:53 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': ''} and extras {}
23:00:53 - INFO - - Reading ATIS instances from dataset at : /home/michael/allennlp/allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - - Fitting token dictionary from dataset.
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': < object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': < object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': < object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': < object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': < object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': < object at 0x7f2fd6c49518>}
23:00:54 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'type': 'atis'} and extras {}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': ''} and extras {}
23:00:54 - INFO - allennlp.commands.train - Using a separate dataset reader to load validation and test data.
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'keep_if_unparseable': True, 'type': 'atis'} and extras {}
23:00:54 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'keep_if_unparseable': True} and extras {}
23:00:54 - INFO - allennlp.commands.train - Reading training data from allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - allennlp.commands.train - Reading validation data from allennlp/tests/fixtures/data/atis/sample.json
23:00:54 - INFO - - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:00:55 - INFO - allennlp.commands.train - From dataset instances, train, validation will be considered for vocabulary creation.
23:00:55 - INFO - - Fitting token dictionary from dataset.
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': < object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': < object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': < object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': < object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': < object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': < object at 0x7f2f6dbbc278>}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:00:55 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4} and extras {}
23:00:55 - INFO - allennlp.commands.train - Following parameters are Frozen (without gradient):
23:00:55 - INFO - allennlp.commands.train - Following parameters are Tunable (with gradient):
23:00:55 - INFO - allennlp.commands.train - _first_action_embedding
23:00:55 - INFO - allennlp.commands.train - _first_attended_utterance
23:00:55 - INFO - allennlp.commands.train - _utterance_embedder.token_embedder_tokens.weight
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0_reverse
23:00:55 - INFO - allennlp.commands.train - _action_embedder.weight
23:00:55 - INFO - allennlp.commands.train - _output_action_embedder.weight
23:00:55 - INFO - allennlp.commands.train - _entity_type_decoder_embedding.weight
23:00:55 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.weight
23:00:55 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.bias
23:00:55 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.weight
23:00:55 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.bias
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l0
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l1
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l1
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l1
23:00:55 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l1
23:00:55 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class ''>
23:00:55 - INFO - - Number of trainable parameters: 139600
23:00:55 - DEBUG - allennlp.common.registrable - instantiating registered subclass sgd of <class ''>
23:00:55 - INFO - - Beginning training.
23:00:55 - INFO - - Epoch 0/1
23:00:55 - INFO - - Peak CPU memory usage MB: 601.524
23:00:55 - INFO - - Training
23:00:55 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:00:55 - DEBUG - - Batch size: 4
23:00:56 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 110}}
23:00:56 - DEBUG - - Batch size: 4
23:00:57 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:00:57 - DEBUG - - Batch size: 4
23:00:58 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:00:58 - DEBUG - - Batch size: 1
23:00:59 - INFO - - Validating
23:00:59 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:00:59 - DEBUG - - Batch size: 4
23:00:59 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:00:59 - DEBUG - - Batch size: 4
23:01:00 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:00 - DEBUG - - Batch size: 4
23:01:00 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:00 - DEBUG - - Batch size: 1
23:01:01 - INFO - - Training | Validation
23:01:01 - INFO - - denotation_acc | 0.000 | 0.000
23:01:01 - INFO - - valid_sql_query | 0.000 | 0.000
23:01:01 - INFO - - loss | 210.595 | 156.398
23:01:01 - INFO - - action_similarity | 0.000 | 0.000
23:01:01 - INFO - - exact_match | 0.000 | 0.000
23:01:01 - INFO - - Best validation performance so far. Copying weights to '/tmp/allennlp_testsjckp_bka/save_and_load_test/'.
23:01:01 - INFO - - Epoch duration: 00:00:05
23:01:01 - INFO - - Estimated training time remaining: 0:00:05
23:01:01 - INFO - - Epoch 1/1
23:01:01 - INFO - - Peak CPU memory usage MB: 638.16
23:01:01 - INFO - - Training
23:01:01 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 202}}
23:01:01 - DEBUG - - Batch size: 4
23:01:02 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:01:02 - DEBUG - - Batch size: 4
23:01:03 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:03 - DEBUG - - Batch size: 4
23:01:04 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:04 - DEBUG - - Batch size: 1
23:01:04 - INFO - - Validating
23:01:04 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:04 - DEBUG - - Batch size: 4
23:01:05 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:05 - DEBUG - - Batch size: 4
23:01:06 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:06 - DEBUG - - Batch size: 4
23:01:06 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:06 - DEBUG - - Batch size: 1
23:01:06 - INFO - - Training | Validation
23:01:06 - INFO - - denotation_acc | 0.000 | 0.000
23:01:06 - INFO - - valid_sql_query | 0.000 | 0.000
23:01:06 - INFO - - loss | 160.465 | 164.938
23:01:06 - INFO - - action_similarity | 0.000 | 0.000
23:01:06 - INFO - - exact_match | 0.000 | 0.000
23:01:06 - INFO - - Epoch duration: 00:00:05
23:01:06 - INFO - allennlp.models.archival - archiving weights and vocabulary to /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:06 - INFO - allennlp.commands.train - Loading the best epoch weights.
23:01:06 - INFO - allennlp.common.util - Metrics: {
"training_duration": "00:00:11",
"training_start_epoch": 0,
"training_epochs": 1,
"epoch": 1,
"training_exact_match": 0,
"training_denotation_acc": 0,
"training_valid_sql_query": 0,
"training_action_similarity": 0,
"training_loss": 160.4649143218994,
"validation_exact_match": 0.0,
"validation_denotation_acc": 0.0,
"validation_valid_sql_query": 0.0,
"validation_action_similarity": 0.0,
"validation_loss": 164.93781661987305,
"best_epoch": 0,
"best_validation_exact_match": 0.0,
"best_validation_denotation_acc": 0.0,
"best_validation_valid_sql_query": 0.0,
"best_validation_action_similarity": 0.0,
"best_validation_loss": 156.3982391357422
23:01:06 - INFO - allennlp.models.archival - loading archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz from cache at /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:06 - INFO - allennlp.models.archival - extracting archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz to temp dir /tmp/tmp2o3wao_0
23:01:06 - DEBUG - allennlp.common.registrable - instantiating registered subclass atis_parser of <class 'allennlp.models.model.Model'>
23:01:06 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class ''>
23:01:06 - INFO - - Loading token dictionary from /tmp/tmp2o3wao_0/vocabulary.
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': < object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': < object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': < object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': < object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': < object at 0x7f2f6e2a4978>}
23:01:06 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': < object at 0x7f2f6e2a4978>}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'type': 'atis'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': ''} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:07 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4} and extras {}
23:01:07 - INFO - - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:07 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:07 - DEBUG - - Batch size: 4
23:01:07 - INFO - - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:08 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:08 - DEBUG - - Batch size: 4
------------------------------------------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------------------------------------------
Checking gradients
Parameter: _entity_type_decoder_embedding.weight had incorrect gradient: No gradient computed (i.e parameter.grad is None)
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
13it [00:00, 36.74it/s]
100%|██████████| 13/13 [00:00<00:00, 2902.32it/s]
0it [00:00, ?it/s]
13it [00:00, 23.84it/s]
0it [00:00, ?it/s]
13it [00:00, 37.47it/s]
0it [00:00, ?it/s]
26it [00:00, 2951.98it/s]
0%| | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 210.5952 ||: 100%|##########| 4/4 [00:03<00:00, 1.07it/s]
0%| | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 156.3982 ||: 100%|##########| 4/4 [00:01<00:00, 2.09it/s]
0%| | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 160.4649 ||: 100%|##########| 4/4 [00:04<00:00, 1.04s/it]
0%| | 0/4 [00:00<?, ?it/s]
exact_match: 0.0000, denotation_acc: 0.0000, valid_sql_query: 0.0000, action_similarity: 0.0000, loss: 164.9378 ||: 100%|##########| 4/4 [00:01<00:00, 2.09it/s]
0it [00:00, ?it/s]
13it [00:00, 39.00it/s]
0it [00:00, ?it/s]
13it [00:00, 24.27it/s]
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:01:09 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:01:09 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'type': 'atis'} and extras {}
23:01:09 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': ''} and extras {}
23:01:09 - INFO - - Reading ATIS instances from dataset at : /home/michael/allennlp/allennlp/tests/fixtures/data/atis/sample.json
23:01:10 - INFO - - Fitting token dictionary from dataset.
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': < object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': < object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': < object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': < object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': < object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': < object at 0x7f2f6bf469e8>}
23:01:10 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'type': 'atis'} and extras {}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': ''} and extras {}
23:01:10 - INFO - allennlp.commands.train - Using a separate dataset reader to load validation and test data.
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'keep_if_unparseable': True, 'type': 'atis'} and extras {}
23:01:10 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'keep_if_unparseable': True} and extras {}
23:01:10 - INFO - allennlp.commands.train - Reading training data from allennlp/tests/fixtures/data/atis/sample.json
23:01:10 - INFO - - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:10 - INFO - allennlp.commands.train - Reading validation data from allennlp/tests/fixtures/data/atis/sample.json
23:01:11 - INFO - - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:11 - INFO - allennlp.commands.train - From dataset instances, train, validation will be considered for vocabulary creation.
23:01:11 - INFO - - Fitting token dictionary from dataset.
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': < object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': < object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': < object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': < object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': < object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': < object at 0x7f2f6a9e1fd0>}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:11 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4} and extras {}
23:01:11 - INFO - allennlp.commands.train - Following parameters are Frozen (without gradient):
23:01:11 - INFO - allennlp.commands.train - Following parameters are Tunable (with gradient):
23:01:11 - INFO - allennlp.commands.train - _first_action_embedding
23:01:11 - INFO - allennlp.commands.train - _first_attended_utterance
23:01:11 - INFO - allennlp.commands.train - _utterance_embedder.token_embedder_tokens.weight
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_ih_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _encoder._module.weight_hh_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_ih_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _encoder._module.bias_hh_l0_reverse
23:01:11 - INFO - allennlp.commands.train - _action_embedder.weight
23:01:11 - INFO - allennlp.commands.train - _output_action_embedder.weight
23:01:11 - INFO - allennlp.commands.train - _entity_type_decoder_embedding.weight
23:01:11 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.weight
23:01:11 - INFO - allennlp.commands.train - _transition_function._input_projection_layer.bias
23:01:11 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.weight
23:01:11 - INFO - allennlp.commands.train - _transition_function._output_projection_layer.bias
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l0
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_ih_l1
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.weight_hh_l1
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_ih_l1
23:01:11 - INFO - allennlp.commands.train - _transition_function._decoder_cell.bias_hh_l1
23:01:11 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class ''>
23:01:11 - INFO - - Number of trainable parameters: 139600
23:01:11 - DEBUG - allennlp.common.registrable - instantiating registered subclass sgd of <class ''>
23:01:11 - INFO - - Beginning training.
23:01:11 - INFO - - Epoch 0/1
23:01:11 - INFO - - Peak CPU memory usage MB: 684.044
23:01:11 - INFO - - Training
23:01:11 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:01:11 - DEBUG - - Batch size: 4
23:01:13 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 110}}
23:01:13 - DEBUG - - Batch size: 4
23:01:13 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:13 - DEBUG - - Batch size: 4
23:01:14 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:14 - DEBUG - - Batch size: 1
23:01:15 - INFO - - Validating
23:01:15 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:15 - DEBUG - - Batch size: 4
23:01:16 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:16 - DEBUG - - Batch size: 4
23:01:16 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:16 - DEBUG - - Batch size: 4
23:01:17 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:17 - DEBUG - - Batch size: 1
23:01:17 - INFO - - Training | Validation
23:01:17 - INFO - - denotation_acc | 0.000 | 0.000
23:01:17 - INFO - - valid_sql_query | 0.000 | 0.000
23:01:17 - INFO - - loss | 210.595 | 156.398
23:01:17 - INFO - - action_similarity | 0.000 | 0.000
23:01:17 - INFO - - exact_match | 0.000 | 0.000
23:01:17 - INFO - - Best validation performance so far. Copying weights to '/tmp/allennlp_testsjckp_bka/save_and_load_test/'.
23:01:17 - INFO - - Epoch duration: 00:00:05
23:01:17 - INFO - - Estimated training time remaining: 0:00:05
23:01:17 - INFO - - Epoch 1/1
23:01:17 - INFO - - Peak CPU memory usage MB: 698.0
23:01:17 - INFO - - Training
23:01:17 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 202}}
23:01:17 - DEBUG - - Batch size: 4
23:01:18 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 287}}
23:01:18 - DEBUG - - Batch size: 4
23:01:20 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:20 - DEBUG - - Batch size: 4
23:01:21 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:21 - DEBUG - - Batch size: 1
23:01:21 - INFO - - Validating
23:01:21 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:21 - DEBUG - - Batch size: 4
23:01:22 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 15}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 15}, 'target_action_sequence': {'num_fields': 203}}
23:01:22 - DEBUG - - Batch size: 4
23:01:22 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 11}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 11}, 'target_action_sequence': {'num_fields': 101}}
23:01:22 - DEBUG - - Batch size: 4
23:01:23 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 6}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 6}, 'target_action_sequence': {'num_fields': 191}}
23:01:23 - DEBUG - - Batch size: 1
23:01:23 - INFO - - Training | Validation
23:01:23 - INFO - - denotation_acc | 0.000 | 0.000
23:01:23 - INFO - - valid_sql_query | 0.000 | 0.000
23:01:23 - INFO - - loss | 160.465 | 164.938
23:01:23 - INFO - - action_similarity | 0.000 | 0.000
23:01:23 - INFO - - exact_match | 0.000 | 0.000
23:01:23 - INFO - - Epoch duration: 00:00:06
23:01:23 - INFO - allennlp.models.archival - archiving weights and vocabulary to /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:23 - INFO - allennlp.commands.train - Loading the best epoch weights.
23:01:23 - INFO - allennlp.common.util - Metrics: {
"training_duration": "00:00:11",
"training_start_epoch": 0,
"training_epochs": 1,
"epoch": 1,
"training_exact_match": 0,
"training_denotation_acc": 0,
"training_valid_sql_query": 0,
"training_action_similarity": 0,
"training_loss": 160.4649257659912,
"validation_exact_match": 0.0,
"validation_denotation_acc": 0.0,
"validation_valid_sql_query": 0.0,
"validation_action_similarity": 0.0,
"validation_loss": 164.93781661987305,
"best_epoch": 0,
"best_validation_exact_match": 0.0,
"best_validation_denotation_acc": 0.0,
"best_validation_valid_sql_query": 0.0,
"best_validation_action_similarity": 0.0,
"best_validation_loss": 156.39822578430176
23:01:23 - INFO - allennlp.models.archival - loading archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz from cache at /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz
23:01:23 - INFO - allennlp.models.archival - extracting archive file /tmp/allennlp_testsjckp_bka/save_and_load_test/model.tar.gz to temp dir /tmp/tmp_ax2l5vy
23:01:23 - DEBUG - allennlp.common.registrable - instantiating registered subclass atis_parser of <class 'allennlp.models.model.Model'>
23:01:23 - DEBUG - allennlp.common.registrable - instantiating registered subclass default of <class ''>
23:01:23 - INFO - - Loading token dictionary from /tmp/tmp_ax2l5vy/vocabulary.
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'type': 'atis_parser', 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.semantic_parsing.atis.atis_semantic_parser.AtisSemanticParser'> from params {'action_embedding_dim': 10, 'database_file': '', 'decoder_beam_search': {'beam_size': 5}, 'decoder_num_layers': 2, 'dropout': 0.5, 'encoder': {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'}, 'input_attention': {'type': 'dot_product'}, 'max_decoding_steps': 10, 'utterance_embedder': {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': < object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 20, 'trainable': True, 'type': 'embedding'} and extras {'vocab': < object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'hidden_size': 40, 'input_size': 20, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': < object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.state_machines.beam_search.BeamSearch'> from params {'beam_size': 5} and extras {'vocab': < object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.attention.Attention'> from params {'type': 'dot_product'} and extras {'vocab': < object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.attention.dot_product_attention.DotProductAttention'> from params {} and extras {'vocab': < object at 0x7f2f6b23da20>}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': '', 'type': 'atis'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'database_file': ''} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4, 'type': 'basic'} and extras {}
23:01:23 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'batch_size': 4} and extras {}
23:01:23 - INFO - - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:24 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:24 - DEBUG - - Batch size: 4
23:01:24 - INFO - - Reading ATIS instances from dataset at : allennlp/tests/fixtures/data/atis/sample.json
23:01:24 - DEBUG - - Batch padding lengths: {'utterance': {'num_tokens': 13}, 'actions': {'num_fields': 1205}, 'linking_scores': {'dimension_0': 906, 'dimension_1': 13}, 'target_action_sequence': {'num_fields': 287}}
23:01:24 - DEBUG - - Batch size: 4
_____________________________________________________________________________________ BiaffineDependencyParserTest.test_mst_decodes_arc_labels_with_respect_to_unconstrained_scores _____________________________________________________________________________________
self = <allennlp.tests.models.biaffine_dependency_parser_test.BiaffineDependencyParserTest testMethod=test_mst_decodes_arc_labels_with_respect_to_unconstrained_scores>
def test_mst_decodes_arc_labels_with_respect_to_unconstrained_scores(self):
energy = torch.Tensor([[0, 2, 1],
[10, 0, 0.5],
[9, 0.2, 0]]).view(1, 1, 3, 3).expand(1, 2, 3, 3).contiguous()
# Make the score for the root label for arcs to the root token be higher - it
# will be masked for the MST, but we want to make sure that the tags are with
# respect to the unmasked tensor. If the masking was incorrect, we would decode all
# zeros as the labels, because torch takes the first index in the case that all the
# values are equal, which would be the case if the labels were calculated from
# the masked score.
energy[:, 1, 0, :] = 3
length = torch.LongTensor([3])
heads, tags = self.model._run_mst_decoding(energy, length) # pylint: disable=protected-access
assert heads.tolist()[0] == [0, 0, 1]
> assert tags.tolist()[0] == [0, 1, 0]
E AssertionError: assert [0, 1, 1] == [0, 1, 0]
E At index 2 diff: 1 != 0
E Use -v to get the full diff
allennlp/tests/models/ AssertionError
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
4it [00:00, 1511.05it/s]
100%|██████████| 4/4 [00:00<00:00, 28876.45it/s]
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:01:27 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {'type': 'universal_dependencies'} and extras {}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class ''> from params {} and extras {}
23:01:27 - INFO - - Reading UD instances from conllu dataset at: /home/michael/allennlp/allennlp/tests/fixtures/data/dependencies.conllu
23:01:27 - INFO - - Fitting token dictionary from dataset.
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'arc_representation_dim': 3, 'encoder': {'hidden_size': 4, 'input_size': 2, 'num_layers': 1, 'type': 'lstm'}, 'tag_representation_dim': 3, 'text_field_embedder': {'tokens': {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'}}, 'type': 'biaffine_parser'} and extras {'vocab': < object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.biaffine_dependency_parser.BiaffineDependencyParser'> from params {'arc_representation_dim': 3, 'encoder': {'hidden_size': 4, 'input_size': 2, 'num_layers': 1, 'type': 'lstm'}, 'tag_representation_dim': 3, 'text_field_embedder': {'tokens': {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'}}} and extras {'vocab': < object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'}} and extras {'vocab': < object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 2, 'trainable': True, 'type': 'embedding'} and extras {'vocab': < object at 0x7f2f6e342b70>}
23:01:27 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'hidden_size': 4, 'input_size': 2, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': < object at 0x7f2f6e342b70>}
23:01:27 - DEBUG - allennlp.common.registrable - instantiating registered subclass elu of <class 'allennlp.nn.activations.Activation'>
23:01:27 - DEBUG - allennlp.common.registrable - instantiating registered subclass linear of <class 'allennlp.nn.activations.Activation'>
23:01:27 - DEBUG - allennlp.common.registrable - instantiating registered subclass elu of <class 'allennlp.nn.activations.Activation'>
23:01:27 - INFO - allennlp.models.biaffine_dependency_parser - Found POS tags correspoding to the following punctuation : {'PUNCT': 3}. Ignoring words with these POS tags for evaluation.
______________________________________________________________________________________________ TestNnUtil.test_sequence_cross_entropy_with_logits_averages_batch_correctly ______________________________________________________________________________________________
self = <allennlp.tests.nn.util_test.TestNnUtil testMethod=test_sequence_cross_entropy_with_logits_averages_batch_correctly>
def test_sequence_cross_entropy_with_logits_averages_batch_correctly(self):
# test batch average is the same as dividing the batch averaged
# loss by the number of batches containing any non-padded tokens.
tensor = torch.rand([5, 7, 4])
tensor[0, 3:, :] = 0
tensor[1, 4:, :] = 0
tensor[2, 2:, :] = 0
tensor[3, :, :] = 0
weights = (tensor != 0.0)[:, :, 0].long().squeeze(-1)
targets = torch.LongTensor(numpy.random.randint(0, 3, [5, 7]))
targets *= weights
loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights)
vector_loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights, average=None)
# Batch has one completely padded row, so divide by 4.
> assert == / 4
E TypeError: eq() received an invalid combination of arguments - got (numpy.ndarray), but expected one of:
E * (Tensor other)
E didn't match because some of the arguments have invalid types: (!numpy.ndarray!)
E * (Number other)
E didn't match because some of the arguments have invalid types: (!numpy.ndarray!)
allennlp/tests/nn/ TypeError
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:14 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
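A note on the `eq()` TypeError above: the operands of the failing assertion are not fully visible in this output, but the message indicates a tensor being compared with a `numpy.ndarray`, which this nightly rejects. One possible workaround is to bring both sides to numpy before comparing. A minimal sketch, where `to_numpy` is a hypothetical helper and the values are made up for illustration:

```python
import numpy
import torch


def to_numpy(value):
    """Return a numpy array for a torch tensor or any array-like input.

    Hypothetical helper for illustration; it is not part of AllenNLP.
    """
    if isinstance(value, torch.Tensor):
        return value.detach().cpu().numpy()
    return numpy.asarray(value)


# Made-up values standing in for the scalar loss and the per-batch losses.
loss = torch.tensor(2.5)
vector_loss = torch.tensor([1.0, 2.0, 3.0, 4.0])

# Both sides are numpy arrays here, so no tensor/ndarray comparison happens.
numpy.testing.assert_allclose(to_numpy(loss), to_numpy(vector_loss.sum() / 4))
```

Using `assert_allclose` rather than `==` also avoids depending on exact floating-point equality.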
______________________________________________________________________________________________ TestNnUtil.test_sequence_cross_entropy_with_logits_averages_token_correctly ______________________________________________________________________________________________
self = <allennlp.tests.nn.util_test.TestNnUtil testMethod=test_sequence_cross_entropy_with_logits_averages_token_correctly>
def test_sequence_cross_entropy_with_logits_averages_token_correctly(self):
# test token average is the same as multiplying the per-batch loss
# with the per-batch weights and dividing by the total weight
tensor = torch.rand([5, 7, 4])
tensor[0, 3:, :] = 0
tensor[1, 4:, :] = 0
tensor[2, 2:, :] = 0
tensor[3, :, :] = 0
weights = (tensor != 0.0)[:, :, 0].long().squeeze(-1)
targets = torch.LongTensor(numpy.random.randint(0, 3, [5, 7]))
targets *= weights
loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights, average="token")
vector_loss = util.sequence_cross_entropy_with_logits(tensor, targets, weights, batch_average=False)
total_token_loss = (vector_loss * weights.float().sum(dim=-1)).sum()
average_token_loss = (total_token_loss / weights.float().sum()).detach()
> assert_almost_equal(loss.detach()[0], average_token_loss[0])
E IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
allennlp/tests/nn/ IndexError
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:14 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
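For the `IndexError` above, the error message itself points at the fix: a 0-dim tensor can no longer be indexed with `[0]`, and `tensor.item()` should be used instead. A small sketch of that pattern, with `scalar_value` as a hypothetical helper name and made-up values:

```python
import torch


def scalar_value(tensor):
    """Convert a 0-dim (or single-element) tensor to a Python number."""
    return tensor.detach().item()


# A reduction such as mean() returns a 0-dim tensor, so indexing it with [0] fails.
loss = torch.tensor([1.0, 2.0, 3.0]).mean()
assert loss.dim() == 0

print(scalar_value(loss))
```

In the test this would mean comparing `loss.detach().item()` with `average_token_loss.item()` rather than indexing either tensor.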
____________________________________________________________________________________________________________________ TestNnUtil.test_viterbi_decode _____________________________________________________________________________________________________________________
self = <allennlp.tests.nn.util_test.TestNnUtil testMethod=test_viterbi_decode>
def test_viterbi_decode(self):
# Test Viterbi decoding is equal to greedy decoding with no pairwise potentials.
sequence_logits = torch.nn.functional.softmax(torch.rand([5, 9]), dim=-1)
transition_matrix = torch.zeros([9, 9])
indices, _ = util.viterbi_decode(, transition_matrix)
_, argmax_indices = torch.max(sequence_logits, 1)
assert indices ==
# Test that pairwise potentials effect the sequence correctly and that
# viterbi_decode can handle -inf values.
sequence_logits = torch.FloatTensor([[0, 0, 0, 3, 4],
[0, 0, 0, 3, 4],
[0, 0, 0, 3, 4],
[0, 0, 0, 3, 4],
[0, 0, 0, 3, 4],
[0, 0, 0, 3, 4]])
# The same tags shouldn't appear sequentially.
transition_matrix = torch.zeros([5, 5])
for i in range(5):
transition_matrix[i, i] = float("-inf")
indices, _ = util.viterbi_decode(sequence_logits, transition_matrix)
> assert indices == [4, 3, 4, 3, 4, 3]
E AssertionError: assert [3, 4, 3, 4, 3, 4] == [4, 3, 4, 3, 4, 3]
E At index 0 diff: 3 != 4
E Use -v to get the full diff
allennlp/tests/nn/ AssertionError
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:14 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
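On the `test_viterbi_decode` failure: the expected and the observed sequences appear to be equally good under the test's scores, so the difference probably comes down to how ties are broken rather than to a wrong decode. A quick check in plain Python, assuming a path's score is the sum of its tag logits plus the pairwise transition scores (the helper `path_score` is hypothetical):

```python
import itertools

NEG_INF = float("-inf")


def path_score(path, logits, transitions):
    """Sum of per-step tag scores plus transition scores between neighbours."""
    score = sum(logits[step][tag] for step, tag in enumerate(path))
    score += sum(transitions[prev][nxt] for prev, nxt in zip(path, path[1:]))
    return score


# Same values as in the failing test: six steps, five tags.
logits = [[0, 0, 0, 3, 4] for _ in range(6)]
# The same tag may not appear twice in a row.
transitions = [[NEG_INF if i == j else 0.0 for j in range(5)] for i in range(5)]

expected = [4, 3, 4, 3, 4, 3]
observed = [3, 4, 3, 4, 3, 4]

# Brute force over all 5**6 paths to find the best achievable score.
best = max(path_score(p, logits, transitions)
           for p in itertools.product(range(5), repeat=6))

print(path_score(expected, logits, transitions),
      path_score(observed, logits, transitions),
      best)
```

If all three numbers agree, the assertion is testing tie-breaking order, and the test may need inputs with a unique best path.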
_______________________________________________________________________________________________ TestDenseSparseAdam.test_can_optimise_model_with_dense_and_sparse_params ________________________________________________________________________________________________
self = < testMethod=test_can_optimise_model_with_dense_and_sparse_params>
def test_can_optimise_model_with_dense_and_sparse_params(self):
optimizer_params = Params({
"type": "dense_sparse_adam"
parameters = [[n, p] for n, p in self.model.named_parameters() if p.requires_grad]
optimizer = Optimizer.from_params(parameters, optimizer_params)
iterator = BasicIterator(2)
> Trainer(self.model, optimizer, iterator, self.instances).train()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
allennlp/training/ in train
train_metrics = self._train_epoch(epoch)
allennlp/training/ in _train_epoch
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = DenseSparseAdam (
Parameter Group 0
betas: (0.9, 0.999)
eps: 1e-08
lr: 0.001
), closure = None
def step(self, closure=None):
Performs a single optimization step.
closure : ``callable``, optional.
A closure that reevaluates the model and returns the loss.
loss = None
if closure is not None:
loss = closure()
for group in self.param_groups:
for p in group['params']:
if p.grad is None:
grad =
state = self.state[p]
# State initialization
if len(state) == 0:
state['step'] = 0
# Exponential moving average of gradient values
state['exp_avg'] = torch.zeros_like(
# Exponential moving average of squared gradient values
state['exp_avg_sq'] = torch.zeros_like(
state['step'] += 1
exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']
beta1, beta2 = group['betas']
if grad.is_sparse:
grad = grad.coalesce() # the update is non-linear so indices must be unique
grad_indices = grad._indices()
grad_values = grad._values()
size = grad.size()
def make_sparse(values):
constructor =
if grad_indices.dim() == 0 or values.dim() == 0:
return constructor().resize_as_(grad)
return constructor(grad_indices, values, size)
# Decay the first and second moment running average coefficient
# old <- b * old + (1 - b) * new
# <==> old += (1 - b) * (new - old)
> old_exp_avg_values = exp_avg._sparse_mask(grad)._values()
E AttributeError: 'Tensor' object has no attribute '_sparse_mask'
allennlp/training/ AttributeError
------------------------------------------------------------------------------------------------------------------------- Captured stderr call --------------------------------------------------------------------------------------------------------------------------
4it [00:00, 1393.92it/s]
100%|██████████| 4/4 [00:00<00:00, 49784.02it/s]
0%| | 0/2 [00:00<?, ?it/s]
--------------------------------------------------------------------------------------------------------------------------- Captured log call ---------------------------------------------------------------------------------------------------------------------------
23:21:35 - INFO - allennlp.common.checks - Pytorch version: 1.0.0.dev20181108
23:21:35 - INFO - - Reading instances from lines in file at: /home/michael/allennlp/allennlp/tests/fixtures/data/sequence_tagging.tsv
23:21:35 - INFO - - Fitting token dictionary from dataset.
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.simple_tagger.SimpleTagger'> from params {'text_field_embedder': {'tokens': {'type': 'embedding', 'embedding_dim': 5, 'sparse': True}}, 'encoder': {'type': 'lstm', 'input_size': 5, 'hidden_size': 7, 'num_layers': 2}} and extras {'vocab': < object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'tokens': {'type': 'embedding', 'embedding_dim': 5, 'sparse': True}} and extras {'vocab': < object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'type': 'embedding', 'embedding_dim': 5, 'sparse': True} and extras {'vocab': < object at 0x7f2f3bc45f60>}
23:21:35 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'type': 'lstm', 'input_size': 5, 'hidden_size': 7, 'num_layers': 2} and extras {'vocab': < object at 0x7f2f3bc45f60>}
23:21:35 - INFO - - Number of trainable parameters: 901
23:21:35 - DEBUG - allennlp.common.registrable - instantiating registered subclass dense_sparse_adam of <class ''>
23:21:35 - INFO - - Beginning training.
23:21:35 - INFO - - Epoch 0/19
23:21:35 - INFO - - Peak CPU memory usage MB: 1657.16
23:21:35 - INFO - - Training
23:21:35 - DEBUG - - Batch padding lengths: {'tokens': {'num_tokens': 4}, 'tags': {'num_tokens': 4}}
23:21:35 - DEBUG - - Batch size: 2
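Regarding the `_sparse_mask` AttributeError in `DenseSparseAdam`: as far as I know, PyTorch 1.0 exposes this method publicly as `sparse_mask`. The sketch below assumes that rename and falls back to the old private name, so it should work on either side of the change. `masked_values` is a hypothetical helper, and the tensors are made up for illustration:

```python
import torch


def masked_values(dense, sparse_grad):
    """Values of `dense` at the indices of a coalesced sparse gradient."""
    # Assumption: newer builds provide `sparse_mask`, older ones `_sparse_mask`.
    mask_fn = getattr(dense, "sparse_mask", None) or getattr(dense, "_sparse_mask")
    return mask_fn(sparse_grad)._values()


# A sparse "gradient" touching rows 0 and 2 of a 3x2 parameter.
indices = torch.tensor([[0, 2]])
values = torch.tensor([[1.0, 1.0], [2.0, 2.0]])
grad = torch.sparse_coo_tensor(indices, values, (3, 2)).coalesce()

# A dense running average with the same shape as the parameter.
exp_avg = torch.arange(6, dtype=torch.float).view(3, 2)

print(masked_values(exp_avg, grad))
```

In the optimizer this would replace the `exp_avg._sparse_mask(grad)._values()` call shown in the traceback.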
PyTorch 1.0 should be released soon (possibly at the PyTorch dev conference in early October).
It's almost certain that it will require changes to the AllenNLP library, just as the PyTorch 0.3 -> 0.4 upgrade did.
This is just a placeholder so that we keep track of the fact that this work will have to be done in Q4.
Success Criteria
A new version (0.7? 1.0?) of AllenNLP that works with PyTorch 1.0.