Yes, this is a known issue, unfortunately. The problem is that you're specifying 5-grams for your CNN filters, but you have no word that is over 4 characters, so the CNN filter crashes. Detecting this and fixing it on our side is actually quite hard, but it's relatively easy to fix in your configuration. Either you can change the size of the filters in your CNN, or add padding around the characters so that you'll always have at least five characters (I'd recommend the second option). You can see how we do that with BiDAF here: https://github.com/allenai/allennlp/blob/83f0c5ecaa1020371d7788f6683764ff806dbe36/training_config/bidaf.json#L9-L16
You could just do "end_tokens": [0, 0, 0, 0], making sure there's always enough 0 padding, or you could put explicit begin and end markers, like the BiDAF model does.
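For example, a character indexer with enough end padding might look roughly like this (a sketch adapted to the simple_tagger setup; the surrounding keys are assumptions, so adjust them to your own config):

"token_characters": {
    "type": "characters",
    "character_tokenizer": {
        "end_tokens": ["@@PADDING@@", "@@PADDING@@", "@@PADDING@@", "@@PADDING@@"]
    }
}

The four end tokens guarantee that every tokenized word is at least five characters long, so a 5-gram filter always has something to convolve over.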
Added these lines:
"character_tokenizer": {
"byte_encoding": "utf-8",
"end_tokens": [0, 0, 0, 0]
},
I am assuming that I need to retrain the model. The error still occurs.
Using tutorials/getting_started/simple_tagger.json.
You shouldn't have to retrain the model (unless you weren't using byte encoding before - you can do this without byte encoding by changing [0, 0, 0, 0] to ["@@PADDING@@", ...]), but you do need to update the configuration that's inside the model archive.
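One way to update that configuration, assuming the archive has the usual layout with a config.json at the top level (a sketch, not the only way to do it):

mkdir unpacked
tar -xzf model.tar.gz -C unpacked
# add the end_tokens padding to unpacked/config.json, then repack
tar -czf model.tar.gz -C unpacked .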
Hi!
I tried to add those lines but then I get this weird error if I try to retrain with byte encoding. Without byte encoding and "@@PADDING@@" everything works!
0%| | 0/330 [00:00<?, ?it/s]2017-12-04 14:17:01,985 - INFO - allennlp.training.trainer - Training
/pytorch/torch/lib/THC/THCTensorIndex.cu:321: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [33,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
[... the same "Assertion `srcIndex < srcSelectDimSize` failed" line repeats for threads [33,0,0] through [47,0,0] ...]
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/michael/Desktop/lattice/allennlp/allennlp/run.py", line 13, in <module>
main(prog="python -m allennlp.run")
File "/home/michael/Desktop/lattice/allennlp/allennlp/commands/__init__.py", line 77, in main
args.func(args)
File "/home/michael/Desktop/lattice/allennlp/allennlp/commands/train.py", line 73, in train_model_from_args
train_model_from_file(args.param_path, args.serialization_dir)
File "/home/michael/Desktop/lattice/allennlp/allennlp/commands/train.py", line 89, in train_model_from_file
return train_model(params, serialization_dir)
File "/home/michael/Desktop/lattice/allennlp/allennlp/commands/train.py", line 178, in train_model
trainer.train()
File "/home/michael/Desktop/lattice/allennlp/allennlp/training/trainer.py", line 369, in train
train_metrics = self._train_epoch(epoch)
File "/home/michael/Desktop/lattice/allennlp/allennlp/training/trainer.py", line 221, in _train_epoch
loss = self._batch_loss(batch, for_training=True)
File "/home/michael/Desktop/lattice/allennlp/allennlp/training/trainer.py", line 176, in _batch_loss
output_dict = self._forward(batch, for_training=for_training)
File "/home/michael/Desktop/lattice/allennlp/allennlp/training/trainer.py", line 410, in _forward
return self._model.forward(**tensor_batch)
File "/home/michael/Desktop/lattice/allennlp/allennlp/models/simple_tagger.py", line 99, in forward
embedded_text_input = self.text_field_embedder(tokens)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/michael/Desktop/lattice/allennlp/allennlp/modules/text_field_embedders/basic_text_field_embedder.py", line 47, in forward
token_vectors = embedder(tensor)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/michael/Desktop/lattice/allennlp/allennlp/modules/token_embedders/token_characters_encoder.py", line 36, in forward
return self._dropout(self._encoder(self._embedding(token_characters), mask))
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/michael/Desktop/lattice/allennlp/allennlp/modules/time_distributed.py", line 35, in forward
reshaped_outputs = self._module(*reshaped_inputs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/michael/Desktop/lattice/allennlp/allennlp/modules/seq2vec_encoders/cnn_encoder.py", line 103, in forward
for convolution_layer in self._convolution_layers]
File "/home/michael/Desktop/lattice/allennlp/allennlp/modules/seq2vec_encoders/cnn_encoder.py", line 103, in <listcomp>
for convolution_layer in self._convolution_layers]
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 154, in forward
self.padding, self.dilation, self.groups)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 83, in conv1d
return f(input, weight, bias)
RuntimeError: CUDNN_STATUS_NOT_INITIALIZED
If you want to use byte encoding, you also need to specify the number of embeddings in your character embedder: https://github.com/allenai/allennlp/blob/3995f70c1cb6190352eb5e063e3ba37b3121112f/training_config/bidaf.json#L31-L35
This is because you're no longer using the vocabulary to determine how many characters there are - the code will think you want an embedding matrix with zero entries, and things will fail, as you see.
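Concretely, with byte encoding the character embedder needs an explicit num_embeddings, along the lines of the linked BiDAF config (a sketch: 262 covers the 256 byte values plus a few padding/begin/end ids, and the encoder numbers here are just illustrative):

"token_characters": {
    "type": "character_encoding",
    "embedding": {
        "num_embeddings": 262,
        "embedding_dim": 16
    },
    "encoder": {
        "type": "cnn",
        "embedding_dim": 16,
        "num_filters": 100,
        "ngram_filter_sizes": [5]
    }
}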
Okay! Thanks for your help!
No problem!
Oh, for anyone stumbling upon this issue later, here's another workaround that I should have suggested first:
python -m allennlp.run predict /tmp/subject_recognition/model.tar.gz INPUT_FILE --overrides="dataset_reader.token_indexers.token_characters.character_tokenizer.end_tokens = ['@@PADDING@@', '@@PADDING@@']"
That is, the problem was that there wasn't enough padding on the characters. There's an option in the predict command to override components of the configuration file, and one of those components lets you add padding when tokenizing the characters. So you can just run the original command with the override, and it should fix the problem.
Input: {"sentence": "what is i miss you"}