BiMPM model implementation

hzeng-otterai commented 6 years ago

The BiMPM algorithm (https://arxiv.org/pdf/1702.03814) is useful for paraphrasing and inference tasks. I currently have an implementation of BiMPM in allennlp and I have tested it for a while. It would also be interesting to see how it compares to ESIM. If anyone is interested I can refine my code and provide a PR.

matt-gardner commented 6 years ago

Yeah, that'd be great, please submit a PR!

schmmd commented 6 years ago

@handsomezebra we'd very much appreciate a PR. But there is some potential overhead to getting a PR merged into AllenNLP. At the very least we would like to link out to your model from https://allennlp.org/models. Do you have a public GitHub repo where it's working currently?

hzeng-otterai commented 6 years ago

Yes, I currently have a model that reproduces the paper's results on the Quora dataset, but I need some more time to integrate it with ELMo. I will send you the links to the trained model once it is ready. Thanks!

hzeng-otterai commented 6 years ago

Hi, I tried to integrate the BiMPM model with ELMo, but couldn't get better accuracy. I only have one 1080Ti/12GB machine, so I am not able to run many experiments. So I am going to post my code and model without ELMo here and see if anyone can help me get an improvement with ELMo.

My code is hosted at https://github.com/handsomezebra/nlp. I trained BiMPM models on two datasets, Quora Paraphrase and SNLI. The evaluation results are:

Quora: 89.01% accuracy (using experiments/quora_bimpm_word_char.json as config file)
SNLI: 87.11% accuracy (using experiments/snli_bimpm_word_char.json as config file).

To evaluate my pretrained model, first clone the code and cd into the cloned directory, and then run the following commands:

allennlp evaluate https://s3-us-west-1.amazonaws.com/handsomezebra/public/quora_bimpm.tar.gz --evaluation-data-file \(https://s3-us-west-1.amazonaws.com/handsomezebra/public/Quora_question_pair_partition.zip\)#Quora_question_pair_partition/test.tsv --include-package hznlp --cuda-device 0

allennlp evaluate https://s3-us-west-1.amazonaws.com/handsomezebra/public/snli_bimpm.tar.gz --evaluation-data-file https://s3-us-west-2.amazonaws.com/allennlp/datasets/snli/snli_1.0_test.jsonl --include-package hznlp --cuda-device 0

Next I am going to provide a PR for BiMPM if that's OK.

Thanks!

schmmd commented 6 years ago

@handsomezebra thanks for the pointers! I'll try to recreate your results this week, and once I do I will add your model to https://allennlp.org/models.

schmmd commented 6 years ago

@handsomezebra we'd love a PR. Are you one of the authors of https://arxiv.org/pdf/1702.03814?

hzeng-otterai commented 6 years ago

No, I am not. I think it would be good to let the authors review and comment once the PR is ready.

kalyangvs commented 6 years ago

Hi, I too tried to integrate the BiMPM model (SNLI task) with ELMo, but couldn't even replicate the paper's scores.

When I used the Elmo class, the accuracy was far below the paper's and training was too slow for me to experiment (I didn't do any tuning).
When I used the elmo embedder command to write the static embeddings to files (3 representations for each word, which comes to gigabytes of files), combined them with a scalar mix (a linear layer learns the weights; I suppose even an average would be sufficient), and used these to train the model, the results didn't improve.

So could you specify the hyperparameters required by the model, or point out where I could have gone wrong?

Note: the model implementation is from https://github.com/galsang/BIMPM-pytorch, which is a standard PyTorch implementation that did replicate the paper's scores.
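
For readers following along, the scalar mix mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `scalar_mix`, the toy 2-dimensional vectors, and the fixed weights are hypothetical, and in a real model the weights and the scale `gamma` would be learned parameters rather than constants. With all weights equal, the mix reduces to the simple average suggested in the comment.

```python
import math
from typing import List, Sequence


def scalar_mix(
    layers: Sequence[Sequence[float]],
    weights: Sequence[float],
    gamma: float = 1.0,
) -> List[float]:
    """Combine per-layer vectors for one token.

    Computes gamma * sum_k softmax(weights)[k] * layers[k].
    `layers` holds one vector per ELMo layer (3 in the comment above).
    """
    if len(layers) != len(weights):
        raise ValueError("need exactly one weight per layer")
    # Numerically stable softmax over the per-layer scalar weights.
    max_w = max(weights)
    exps = [math.exp(w - max_w) for w in weights]
    total = sum(exps)
    probs = [e / total for e in exps]
    dim = len(layers[0])
    return [
        gamma * sum(p * layer[i] for p, layer in zip(probs, layers))
        for i in range(dim)
    ]


# Hypothetical toy input: 3 layer representations of a single token.
toy_layers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# Equal weights make the mix a plain average of the three layers.
averaged = scalar_mix(toy_layers, [0.0, 0.0, 0.0])
print(averaged)
```

Normalizing the weights with a softmax keeps the combination a weighted average of the layers, so the downstream encoder sees inputs on a comparable scale however the raw weights drift during training.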

hzeng-otterai commented 6 years ago

@gvskalyan I didn't try BiMPM + ELMo on SNLI, but in my several experiments with BiMPM + ELMo on Quora I got accuracy between 87% and 88% (my best accuracy on Quora without ELMo is 89.01%). The GPU memory consumption is high because both BiMPM and ELMo use a lot of memory, so I had to reduce the batch size to 16 or lower. If you can post your code somewhere, I can take a look.

schmmd commented 6 years ago

I tried repeating these commands today and confirmed the improvement on SNLI:

$ pip install git+git://github.com/allenai/allennlp.git@bf760b040d913674591613e3e6faa2cb859121c6

$ allennlp evaluate https://s3-us-west-1.amazonaws.com/handsomezebra/public/snli_bimpm.tar.gz https://s3-us-west-2.amazonaws.com/allennlp/datasets/snli/snli_1.0_test.jsonl --include-package hznlp --cuda-device -1
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192, got 176
  return f(*args, **kwds)
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192, got 176
  return f(*args, **kwds)
2018-08-07 13:16:00,140 - INFO - allennlp.models.archival - loading archive file https://s3-us-west-1.amazonaws.com/handsomezebra/public/snli_bimpm.tar.gz from cache at /Users/michael/.allennlp/datasets/8cc03bd7659c196cc196ad07092c36a9179b982fbe9ebc0ce8c3ee3b9b7a330c.caca2156047805f9d5a03827e96b26d38bfd231348f3ca9c1275ed649eaa3fb0
2018-08-07 13:16:00,142 - INFO - allennlp.models.archival - extracting archive file /Users/michael/.allennlp/datasets/8cc03bd7659c196cc196ad07092c36a9179b982fbe9ebc0ce8c3ee3b9b7a330c.caca2156047805f9d5a03827e96b26d38bfd231348f3ca9c1275ed649eaa3fb0 to temp dir /var/folders/v5/tlk1sh3d0l94fcmz10vtzmyh0000gn/T/tmpplvg7f2b
2018-08-07 13:16:00,692 - INFO - allennlp.common.registrable - instantiating registered subclass bimpm of <class 'allennlp.models.model.Model'>
2018-08-07 13:16:00,692 - INFO - allennlp.data.vocabulary - Loading token dictionary from /var/folders/v5/tlk1sh3d0l94fcmz10vtzmyh0000gn/T/tmpplvg7f2b/vocabulary.
2018-08-07 13:16:00,729 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'aggregator': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 100, 'input_size': 264, 'num_layers': 2, 'type': 'lstm'}, 'classifier_feedforward': {'activations': ['relu', 'linear'], 'dropout': [0.1, 0], 'hidden_dims': [200, 3], 'input_dim': 400, 'num_layers': 2}, 'dropout': 0.1, 'encoder1': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'}, 'encoder2': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'}, 'initializer': [['.*linear_layers.*weight', {'type': 'xavier_normal'}], ['.*linear_layers.*bias', {'type': 'constant', 'val': 0}], ['.*weight_ih.*', {'type': 'xavier_normal'}], ['.*weight_hh.*', {'type': 'orthogonal'}], ['.*bias.*', {'type': 'constant', 'val': 0}], ['.*matcher.*params.*', {'type': 'kaiming_normal'}]], 'matcher_bw1': {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10}, 'matcher_bw2': {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10}, 'matcher_fw1': {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10}, 'matcher_fw2': {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10}, 'text_field_embedder': {'token_characters': {'embedding': {'embedding_dim': 20, 'padding_index': 0}, 'encoder': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'}, 'type': 'character_encoding'}, 'tokens': {'embedding_dim': 300, 'padding_index': 0, 'trainable': False, 'type': 'embedding'}}, 'type': 'bimpm', 'word_matcher': {'hidden_dim': 400, 'is_forward': True, 'num_perspective': 10, 'wo_full_match': True}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
2018-08-07 13:16:00,730 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.bimpm.BiMPM'> from params {'aggregator': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 100, 'input_size': 264, 'num_layers': 2, 'type': 'lstm'}, 'classifier_feedforward': {'activations': ['relu', 'linear'], 'dropout': [0.1, 0], 'hidden_dims': [200, 3], 'input_dim': 400, 'num_layers': 2}, 'dropout': 0.1, 'encoder1': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'}, 'encoder2': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'}, 'initializer': [['.*linear_layers.*weight', {'type': 'xavier_normal'}], ['.*linear_layers.*bias', {'type': 'constant', 'val': 0}], ['.*weight_ih.*', {'type': 'xavier_normal'}], ['.*weight_hh.*', {'type': 'orthogonal'}], ['.*bias.*', {'type': 'constant', 'val': 0}], ['.*matcher.*params.*', {'type': 'kaiming_normal'}]], 'matcher_bw1': {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10}, 'matcher_bw2': {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10}, 'matcher_fw1': {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10}, 'matcher_fw2': {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10}, 'text_field_embedder': {'token_characters': {'embedding': {'embedding_dim': 20, 'padding_index': 0}, 'encoder': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'}, 'type': 'character_encoding'}, 'tokens': {'embedding_dim': 300, 'padding_index': 0, 'trainable': False, 'type': 'embedding'}}, 'word_matcher': {'hidden_dim': 400, 'is_forward': True, 'num_perspective': 10, 'wo_full_match': True}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'>
2018-08-07 13:16:00,730 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'token_characters': {'embedding': {'embedding_dim': 20, 'padding_index': 0}, 'encoder': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'}, 'type': 'character_encoding'}, 'tokens': {'embedding_dim': 300, 'padding_index': 0, 'trainable': False, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
2018-08-07 13:16:00,731 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding': {'embedding_dim': 20, 'padding_index': 0}, 'encoder': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'}, 'type': 'character_encoding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
2018-08-07 13:16:00,732 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder'> from params {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'} and extras {}
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.1 and num_layers=1
  "num_layers={}".format(dropout, num_layers))
2018-08-07 13:16:00,733 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 300, 'padding_index': 0, 'trainable': False, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:16:00,841 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 400, 'is_forward': True, 'num_perspective': 10, 'wo_full_match': True} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'>
2018-08-07 13:16:00,841 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:16:00,850 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:16:00,851 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'>
2018-08-07 13:16:00,852 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:16:00,861 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:16:00,862 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder'>
2018-08-07 13:16:00,862 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder'> from params {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 100, 'input_size': 264, 'num_layers': 2, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x10eea8be0>}
<class 'allennlp.modules.feedforward.FeedForward'>
2018-08-07 13:16:00,867 - INFO - allennlp.common.registrable - instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
2018-08-07 13:16:00,867 - INFO - allennlp.common.registrable - instantiating registered subclass linear of <class 'allennlp.nn.activations.Activation'>
<class 'allennlp.nn.initializers.InitializerApplicator'>
2018-08-07 13:16:00,869 - INFO - allennlp.common.registrable - instantiating registered subclass xavier_normal of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:16:00,869 - INFO - allennlp.common.registrable - instantiating registered subclass constant of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:16:00,869 - INFO - allennlp.common.registrable - instantiating registered subclass xavier_normal of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:16:00,870 - INFO - allennlp.common.registrable - instantiating registered subclass orthogonal of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:16:00,870 - INFO - allennlp.common.registrable - instantiating registered subclass constant of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:16:00,870 - INFO - allennlp.common.registrable - instantiating registered subclass kaiming_normal of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:16:00,989 - INFO - allennlp.common.checks - Pytorch version: 0.4.0
2018-08-07 13:16:00,989 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'token_indexers': {'token_characters': {'type': 'characters'}, 'tokens': {'lowercase_tokens': False, 'type': 'single_id'}}, 'type': 'snli'} and extras {}
2018-08-07 13:16:00,990 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.snli.SnliReader'> from params {'token_indexers': {'token_characters': {'type': 'characters'}, 'tokens': {'lowercase_tokens': False, 'type': 'single_id'}}} and extras {}
2018-08-07 13:16:00,990 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_indexer.TokenIndexer from params {'type': 'characters'} and extras {}
2018-08-07 13:16:00,990 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_characters_indexer.TokenCharactersIndexer from params {} and extras {}
2018-08-07 13:16:00,990 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_indexer.TokenIndexer from params {'lowercase_tokens': False, 'type': 'single_id'} and extras {}
2018-08-07 13:16:00,990 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.single_id_token_indexer.SingleIdTokenIndexer from params {'lowercase_tokens': False} and extras {}
2018-08-07 13:16:01,347 - INFO - allennlp.commands.evaluate - Reading evaluation data from https://s3-us-west-2.amazonaws.com/allennlp/datasets/snli/snli_1.0_test.jsonl
0it [00:00, ?it/s]2018-08-07 13:16:01,455 - INFO - allennlp.data.dataset_readers.snli - Reading SNLI instances from jsonl dataset at: /Users/michael/.allennlp/datasets/219774d715c7588fae25ccd6e54107fa016eaa87369c95f86e9de5e4f8e8b0b7.02752124e6570073d50876f7f66a32191d9787dcb992c56e79f558cebf5faf92
9824it [00:06, 1484.55it/s]
2018-08-07 13:16:07,966 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.data_iterator.DataIterator'> from params {'batch_size': 16, 'padding_noise': 0.1, 'sorting_keys': [['premise', 'num_tokens'], ['hypothesis', 'num_tokens']], 'type': 'bucket'} and extras {}
2018-08-07 13:16:07,966 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.iterators.bucket_iterator.BucketIterator'> from params {'batch_size': 16, 'padding_noise': 0.1, 'sorting_keys': [['premise', 'num_tokens'], ['hypothesis', 'num_tokens']]} and extras {}
2018-08-07 13:21:59,813 - INFO - allennlp.commands.evaluate - Finished evaluating.
2018-08-07 13:21:59,813 - INFO - allennlp.commands.evaluate - Metrics:
2018-08-07 13:21:59,814 - INFO - allennlp.commands.evaluate - accuracy: 0.8711319218241043
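
As a quick sanity check on the log above, the reported accuracy can be converted back into a count of correct predictions. The sketch below assumes that the accuracy is the plain fraction of correct predictions over the 9824 instances the reader reports loading; the log does not print the raw counts, so the count is derived rather than observed.

```python
# Values copied from the evaluation log above.
reported_accuracy = 0.8711319218241043
num_instances = 9824  # from the "9824it" progress line

# Assumption: accuracy = correct / total, so the implied count of
# correct predictions is the nearest integer to accuracy * total.
implied_correct = round(reported_accuracy * num_instances)
recomputed_accuracy = implied_correct / num_instances

print(implied_correct)                  # 8558
print(f"{recomputed_accuracy:.2%}")     # 87.11%
```

The recomputed value agrees with the 87.11% SNLI accuracy quoted earlier in the thread.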

I wasn't able to run the evaluation on Quora.

$ allennlp evaluate https://s3-us-west-1.amazonaws.com/handsomezebra/public/quora_bimpm.tar.gz https://s3-us-west-1.amazonaws.com/handsomezebra/public/Quora_question_pair_partition.zip --include-package hznlp --cuda-device -1
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192, got 176
  return f(*args, **kwds)
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192, got 176
  return f(*args, **kwds)
2018-08-07 13:24:45,634 - INFO - allennlp.common.file_utils - https://s3-us-west-1.amazonaws.com/handsomezebra/public/quora_bimpm.tar.gz not found in cache, downloading to /var/folders/v5/tlk1sh3d0l94fcmz10vtzmyh0000gn/T/tmpbgli5w9i
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 153993824/153993824 [05:35<00:00, 459139.50B/s]
2018-08-07 13:30:21,275 - INFO - allennlp.common.file_utils - copying /var/folders/v5/tlk1sh3d0l94fcmz10vtzmyh0000gn/T/tmpbgli5w9i to cache at /Users/michael/.allennlp/datasets/8993af486a8bbdd6daf257e4f3195b1f0ac82639d344d2f123f4aa3f563f8259.c8d8d072a2ec7c0b3830ec4d872edd3f1ae5f406c95a01274da3a51125cf5a4a
2018-08-07 13:30:21,537 - INFO - allennlp.common.file_utils - creating metadata file for /Users/michael/.allennlp/datasets/8993af486a8bbdd6daf257e4f3195b1f0ac82639d344d2f123f4aa3f563f8259.c8d8d072a2ec7c0b3830ec4d872edd3f1ae5f406c95a01274da3a51125cf5a4a
2018-08-07 13:30:21,538 - INFO - allennlp.common.file_utils - removing temp file /var/folders/v5/tlk1sh3d0l94fcmz10vtzmyh0000gn/T/tmpbgli5w9i
2018-08-07 13:30:21,552 - INFO - allennlp.models.archival - loading archive file https://s3-us-west-1.amazonaws.com/handsomezebra/public/quora_bimpm.tar.gz from cache at /Users/michael/.allennlp/datasets/8993af486a8bbdd6daf257e4f3195b1f0ac82639d344d2f123f4aa3f563f8259.c8d8d072a2ec7c0b3830ec4d872edd3f1ae5f406c95a01274da3a51125cf5a4a
2018-08-07 13:30:21,553 - INFO - allennlp.models.archival - extracting archive file /Users/michael/.allennlp/datasets/8993af486a8bbdd6daf257e4f3195b1f0ac82639d344d2f123f4aa3f563f8259.c8d8d072a2ec7c0b3830ec4d872edd3f1ae5f406c95a01274da3a51125cf5a4a to temp dir /var/folders/v5/tlk1sh3d0l94fcmz10vtzmyh0000gn/T/tmp0wf5w0y3
2018-08-07 13:30:23,046 - INFO - allennlp.common.registrable - instantiating registered subclass bimpm of <class 'allennlp.models.model.Model'>
2018-08-07 13:30:23,047 - INFO - allennlp.data.vocabulary - Loading token dictionary from /var/folders/v5/tlk1sh3d0l94fcmz10vtzmyh0000gn/T/tmp0wf5w0y3/vocabulary.
2018-08-07 13:30:23,205 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'aggregator': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 100, 'input_size': 264, 'num_layers': 2, 'type': 'lstm'}, 'classifier_feedforward': {'activations': ['relu', 'linear'], 'dropout': [0.1, 0], 'hidden_dims': [200, 2], 'input_dim': 400, 'num_layers': 2}, 'dropout': 0.1, 'encoder1': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'}, 'encoder2': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'}, 'initializer': [['.*linear_layers.*weight', {'type': 'xavier_normal'}], ['.*linear_layers.*bias', {'type': 'constant', 'val': 0}], ['.*weight_ih.*', {'type': 'xavier_normal'}], ['.*weight_hh.*', {'type': 'orthogonal'}], ['.*bias.*', {'type': 'constant', 'val': 0}], ['.*matcher.*params.*', {'type': 'kaiming_normal'}]], 'matcher_bw1': {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10}, 'matcher_bw2': {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10}, 'matcher_fw1': {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10}, 'matcher_fw2': {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10}, 'text_field_embedder': {'token_characters': {'embedding': {'embedding_dim': 20, 'padding_index': 0}, 'encoder': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'}, 'type': 'character_encoding'}, 'tokens': {'embedding_dim': 300, 'padding_index': 0, 'trainable': False, 'type': 'embedding'}}, 'type': 'bimpm', 'word_matcher': {'hidden_dim': 400, 'is_forward': True, 'num_perspective': 10, 'wo_full_match': True}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
2018-08-07 13:30:23,206 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.bimpm.BiMPM'> from params {'aggregator': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 100, 'input_size': 264, 'num_layers': 2, 'type': 'lstm'}, 'classifier_feedforward': {'activations': ['relu', 'linear'], 'dropout': [0.1, 0], 'hidden_dims': [200, 2], 'input_dim': 400, 'num_layers': 2}, 'dropout': 0.1, 'encoder1': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'}, 'encoder2': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'}, 'initializer': [['.*linear_layers.*weight', {'type': 'xavier_normal'}], ['.*linear_layers.*bias', {'type': 'constant', 'val': 0}], ['.*weight_ih.*', {'type': 'xavier_normal'}], ['.*weight_hh.*', {'type': 'orthogonal'}], ['.*bias.*', {'type': 'constant', 'val': 0}], ['.*matcher.*params.*', {'type': 'kaiming_normal'}]], 'matcher_bw1': {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10}, 'matcher_bw2': {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10}, 'matcher_fw1': {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10}, 'matcher_fw2': {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10}, 'text_field_embedder': {'token_characters': {'embedding': {'embedding_dim': 20, 'padding_index': 0}, 'encoder': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'}, 'type': 'character_encoding'}, 'tokens': {'embedding_dim': 300, 'padding_index': 0, 'trainable': False, 'type': 'embedding'}}, 'word_matcher': {'hidden_dim': 400, 'is_forward': True, 'num_perspective': 10, 'wo_full_match': True}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'>
2018-08-07 13:30:23,206 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'token_characters': {'embedding': {'embedding_dim': 20, 'padding_index': 0}, 'encoder': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'}, 'type': 'character_encoding'}, 'tokens': {'embedding_dim': 300, 'padding_index': 0, 'trainable': False, 'type': 'embedding'}} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
2018-08-07 13:30:23,207 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding': {'embedding_dim': 20, 'padding_index': 0}, 'encoder': {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'}, 'type': 'character_encoding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
2018-08-07 13:30:23,208 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder'> from params {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 50, 'input_size': 20, 'num_layers': 1, 'type': 'gru'} and extras {}
/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.1 and num_layers=1
  "num_layers={}".format(dropout, num_layers))
2018-08-07 13:30:23,211 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'embedding_dim': 300, 'padding_index': 0, 'trainable': False, 'type': 'embedding'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:30:23,559 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 400, 'is_forward': True, 'num_perspective': 10, 'wo_full_match': True} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'>
2018-08-07 13:30:23,560 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:30:23,566 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:30:23,567 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'>
2018-08-07 13:30:23,567 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 200, 'input_size': 400, 'num_layers': 1, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:30:23,577 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 200, 'is_forward': True, 'num_perspective': 10} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'hznlp.models.matching_layer.MatchingLayer'>
2018-08-07 13:30:23,578 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.models.matching_layer.MatchingLayer'> from params {'hidden_dim': 200, 'is_forward': False, 'num_perspective': 10} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder'>
2018-08-07 13:30:23,579 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2vec_encoders.seq2vec_encoder.Seq2VecEncoder'> from params {'bidirectional': True, 'dropout': 0.1, 'hidden_size': 100, 'input_size': 264, 'num_layers': 2, 'type': 'lstm'} and extras {'vocab': <allennlp.data.vocabulary.Vocabulary object at 0x104705a58>}
<class 'allennlp.modules.feedforward.FeedForward'>
2018-08-07 13:30:23,584 - INFO - allennlp.common.registrable - instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
2018-08-07 13:30:23,584 - INFO - allennlp.common.registrable - instantiating registered subclass linear of <class 'allennlp.nn.activations.Activation'>
<class 'allennlp.nn.initializers.InitializerApplicator'>
2018-08-07 13:30:23,585 - INFO - allennlp.common.registrable - instantiating registered subclass xavier_normal of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:30:23,585 - INFO - allennlp.common.registrable - instantiating registered subclass constant of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:30:23,586 - INFO - allennlp.common.registrable - instantiating registered subclass xavier_normal of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:30:23,586 - INFO - allennlp.common.registrable - instantiating registered subclass orthogonal of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:30:23,586 - INFO - allennlp.common.registrable - instantiating registered subclass constant of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:30:23,586 - INFO - allennlp.common.registrable - instantiating registered subclass kaiming_normal of <class 'allennlp.nn.initializers.Initializer'>
2018-08-07 13:30:23,843 - INFO - allennlp.common.checks - Pytorch version: 0.4.0
2018-08-07 13:30:23,843 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'lazy': False, 'token_indexers': {'token_characters': {'type': 'characters'}, 'tokens': {'lowercase_tokens': False, 'type': 'single_id'}}, 'tokenizer': {'type': 'word', 'word_splitter': {'type': 'just_spaces'}}, 'type': 'quora_paraphrase'} and extras {}
2018-08-07 13:30:23,843 - INFO - allennlp.common.from_params - instantiating class <class 'hznlp.dataset_readers.quora_paraphrase.QuoraParaphraseDatasetReader'> from params {'lazy': False, 'token_indexers': {'token_characters': {'type': 'characters'}, 'tokens': {'lowercase_tokens': False, 'type': 'single_id'}}, 'tokenizer': {'type': 'word', 'word_splitter': {'type': 'just_spaces'}}} and extras {}
<class 'allennlp.data.tokenizers.tokenizer.Tokenizer'>
2018-08-07 13:30:23,844 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.tokenizers.tokenizer.Tokenizer'> from params {'type': 'word', 'word_splitter': {'type': 'just_spaces'}} and extras {}
2018-08-07 13:30:23,844 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.tokenizers.word_tokenizer.WordTokenizer'> from params {'word_splitter': {'type': 'just_spaces'}} and extras {}
<class 'allennlp.data.tokenizers.word_splitter.WordSplitter'>
2018-08-07 13:30:23,844 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.tokenizers.word_splitter.WordSplitter'> from params {'type': 'just_spaces'} and extras {}
2018-08-07 13:30:23,844 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.tokenizers.word_splitter.JustSpacesWordSplitter'> from params {} and extras {}
2018-08-07 13:30:23,845 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_indexer.TokenIndexer from params {'type': 'characters'} and extras {}
2018-08-07 13:30:23,845 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_characters_indexer.TokenCharactersIndexer from params {} and extras {}
2018-08-07 13:30:23,845 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_indexer.TokenIndexer from params {'lowercase_tokens': False, 'type': 'single_id'} and extras {}
2018-08-07 13:30:23,845 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.single_id_token_indexer.SingleIdTokenIndexer from params {'lowercase_tokens': False} and extras {}
2018-08-07 13:30:23,845 - INFO - allennlp.commands.evaluate - Reading evaluation data from https://s3-us-west-1.amazonaws.com/handsomezebra/public/Quora_question_pair_partition.zip
0it [00:00, ?it/s]
2018-08-07 13:30:23,845 - INFO - hznlp.dataset_readers.quora_paraphrase - Reading instances from lines in file at: https://s3-us-west-1.amazonaws.com/handsomezebra/public/Quora_question_pair_partition.zip
Traceback (most recent call last):
  File "/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/site-packages/allennlp/run.py", line 18, in <module>
    main(prog="allennlp")
  File "/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 70, in main
    args.func(args)
  File "/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/site-packages/allennlp/commands/evaluate.py", line 142, in evaluate_from_args
    instances = dataset_reader.read(evaluation_data_path)
  File "/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/site-packages/allennlp/data/dataset_readers/dataset_reader.py", line 73, in read
    instances = [instance for instance in Tqdm.tqdm(instances)]
  File "/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/site-packages/allennlp/data/dataset_readers/dataset_reader.py", line 73, in <listcomp>
    instances = [instance for instance in Tqdm.tqdm(instances)]
  File "/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/site-packages/tqdm/_tqdm.py", line 931, in __iter__
    for obj in iterable:
  File "/Users/michael/hack/other/nlp/hznlp/dataset_readers/quora_paraphrase.py", line 62, in _read
    for row in tsvin:
  File "/Users/michael/miniconda3/envs/allennlp-pip/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb1 in position 11: invalid start byte

Are you still planning on submitting a pull request @handsomezebra?

hzeng-otterai commented 6 years ago

@schmmd Thanks for trying it. Let me check what's going on in the Quora part. I am in the process of creating a PR and am working through some issues in the unit tests.

hzeng-otterai commented 6 years ago

@schmmd Please use the following command to evaluate the Quora model, since the reader reads the test file from inside the zip file.

allennlp evaluate https://s3-us-west-1.amazonaws.com/handsomezebra/public/quora_bimpm.tar.gz \(https://s3-us-west-1.amazonaws.com/handsomezebra/public/Quora_question_pair_partition.zip\)#Quora_question_pair_partition/test.tsv --include-package hznlp --cuda-device -1
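For readers wondering what the `(archive)#member` form of the data path implies, here is a minimal sketch of how a reader could split such a path and read a TSV file from inside a zip archive. This is not the code from `hznlp`; the function names `split_archive_path` and `read_tsv_rows` and the regular expression are hypothetical, and the sketch assumes the archive is already on local disk (downloading a remote URL, which in AllenNLP is usually done with `cached_path`, is left out). The traceback above looks consistent with the zip archive itself being decoded as UTF-8 text, which is what happens when the `#member` part is missing from the path.

```python
import csv
import io
import re
import zipfile
from typing import Iterator, List, Optional, Tuple

# Hypothetical pattern for the "(archive)#member" convention seen in the command above.
_ARCHIVE_MEMBER = re.compile(r"^\((?P<archive>.+)\)#(?P<member>.+)$")


def split_archive_path(path: str) -> Tuple[str, Optional[str]]:
    """Return (archive, member) for "(archive)#member", else (path, None)."""
    match = _ARCHIVE_MEMBER.match(path)
    if match is None:
        return path, None
    return match.group("archive"), match.group("member")


def read_tsv_rows(path: str) -> Iterator[List[str]]:
    """Yield tab-separated rows from a plain file or from a member of a local zip."""
    archive, member = split_archive_path(path)
    if member is None:
        # Plain text file: decode the file itself as UTF-8.
        with open(archive, encoding="utf-8", newline="") as handle:
            yield from csv.reader(handle, delimiter="\t")
    else:
        # Zip archive: open the named member and decode only that member,
        # never the compressed archive bytes.
        with zipfile.ZipFile(archive) as zf, zf.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.reader(text, delimiter="\t")
```

With this convention, a call such as `read_tsv_rows("(data.zip)#Quora_question_pair_partition/test.tsv")` reads only the named member, while a bare `data.zip` path would fall into the plain-file branch and fail on the first non-UTF-8 byte of the archive.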

hzeng-otterai commented 6 years ago

BiMPM implementation merged to master.

allenai / allennlp

BiMPM model implementation #1503