Closed — devily closed this issue 6 years ago
My Chinese corpus is the CoNLL-2012 Chinese data with tokens and POS tags, for example:
各位/PN 好/VA ,/PU 欢迎/VV 您/PN 收看/VV 国际/NN 频道/NN 的/DEG 今日/NT 关注/NN 。/PU 今天/NT 在/P 北京/NR 举行/VV 的/DEC 禽流感/NN 防控/NN 国际/NN 筹资/NN 大会/NN 部长级/JJ 会议/NN 上/LC ,/PU 中国/NR 国务院/NN 总理/NN 温家宝/NR 宣布/VV ,/PU 为/P 支持/VV 全球/NN 的/DEG 禽流感/NN 防控/NN 事业/NN ,/PU 中国/NR 政府/NN 决定/VV 提供/VV 一千万/CD 美元/M ,/PU 并且/CC 迅速/AD 到位/VV 。/PU 那么/AD 现在/NT 全/DT 世界/NN 的/DEG 禽流感/NN 蔓延/VV 的/DEC 趋势/NN 如何/VA ?/PU
This is a duplicate of #1954. You can solve this by adding `min_padding_length` to your token indexer. See that issue for more detail on what's going on and how exactly to use `min_padding_length`, and feel free to ask more questions if that's not enough.
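For reference, a minimal sketch of what that fix might look like in the dataset reader's token indexer block. The filter size of 5 is an assumption, not taken from the original config — set `min_padding_length` to at least the largest ngram filter size your character CNN uses:

```jsonnet
// Sketch only: the "token_characters" indexer inside the dataset
// reader's "token_indexers". The value 5 is an assumed largest
// ngram_filter_size of the CnnEncoder; match it to your own config.
"token_characters": {
  "type": "characters",
  "min_padding_length": 5
}
```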
Question
I tried to use the simple tagger tutorial with a Chinese corpus, and a RuntimeError occurred while training the model. What should I do?
```
2018-11-13 11:44:47,132 - INFO - allennlp.common.params - CURRENTLY DEFINED PARAMETERS:
2018-11-13 11:44:47,132 - INFO - allennlp.common.params - trainer.num_serialized_models_to_keep = 20
2018-11-13 11:44:47,132 - INFO - allennlp.common.params - trainer.keep_serialized_model_every_num_seconds = None
2018-11-13 11:44:47,132 - INFO - allennlp.common.params - trainer.model_save_interval = None
2018-11-13 11:44:47,132 - INFO - allennlp.common.params - trainer.summary_interval = 100
2018-11-13 11:44:47,132 - INFO - allennlp.common.params - trainer.histogram_interval = None
2018-11-13 11:44:47,132 - INFO - allennlp.common.params - trainer.should_log_parameter_statistics = True
2018-11-13 11:44:47,133 - INFO - allennlp.common.params - trainer.should_log_learning_rate = False
2018-11-13 11:44:47,135 - INFO - allennlp.common.params - evaluate_on_test = False
2018-11-13 11:44:47,135 - INFO - allennlp.training.trainer - Beginning training.
2018-11-13 11:44:47,135 - INFO - allennlp.training.trainer - Epoch 0/39
2018-11-13 11:44:47,135 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 253.0304
accuracy: 0.2035, accuracy3: 0.4699, loss: 2.9571 ||: 26%|##5 | 36/140 [00:13<00:39, 2.63it/s]
Traceback (most recent call last):
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/run.py", line 18, in <module>
    main(prog="allennlp")
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 72, in main
    args.func(args)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/commands/train.py", line 111, in train_model_from_args
    args.force)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/commands/train.py", line 142, in train_model_from_file
    return train_model(params, serialization_dir, file_friendly_logging, recover, force)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/commands/train.py", line 346, in train_model
    metrics = trainer.train()
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/training/trainer.py", line 751, in train
    train_metrics = self._train_epoch(epoch)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/training/trainer.py", line 494, in _train_epoch
    loss = self.batch_loss(batch, for_training=True)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/training/trainer.py", line 429, in batch_loss
    output_dict = self.model(**batch)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/models/simple_tagger.py", line 98, in forward
    embedded_text_input = self.text_field_embedder(tokens)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/modules/text_field_embedders/basic_text_field_embedder.py", line 88, in forward
    token_vectors = embedder(*tensors)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/modules/token_embedders/token_characters_encoder.py", line 36, in forward
    return self._dropout(self._encoder(self._embedding(token_characters), mask))
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/modules/time_distributed.py", line 35, in forward
    reshaped_outputs = self._module(*reshaped_inputs)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/allennlp/modules/seq2vec_encoders/cnn_encoder.py", line 105, in forward
    self._activation(convolution_layer(tokens)).max(dim=2)[0]
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/xiaopei/anaconda2/envs/allennlp/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 176, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (1 x 4). Kernel size: (1 x 140428250710021). Kernel size can't be greater than actual input size at /Users/soumith/code/builder/wheel/pytorch-src/aten/src/THNN/generic/SpatialConvolutionMM.c:48
```
[INFO/MainProcess] process shutting down
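What's happening here, in a nutshell: the character-level CNN encoder convolves filters of fixed widths over each token's character sequence, and a filter cannot be wider than the (padded) sequence it slides over. Chinese tokens are often only one or two characters long, so a batch whose longest token is shorter than the largest filter hits this error. A minimal sketch of the arithmetic, assuming stride 1 and a largest ngram filter size of 5 (the filter size is an assumption; the nonsensical kernel size in the error message is just a symptom of the unchecked failure):

```python
# Sketch of why the CNN character encoder fails on short tokens.
# A 1-D convolution with stride 1 produces
#     L_out = L_in + 2 * padding - kernel_size + 1
# output positions, which must be >= 1 for the convolution to be valid.

def conv_output_length(input_length, kernel_size, padding=0):
    """Output length of a stride-1 1-D convolution."""
    return input_length + 2 * padding - kernel_size + 1

# A token padded to 4 characters, convolved with an (assumed) width-5
# filter, yields no valid output position -> RuntimeError:
print(conv_output_length(4, 5))   # -> 0: invalid
# With min_padding_length = 5, every token is padded to >= 5 chars:
print(conv_output_length(5, 5))   # -> 1: valid
```

This is why `min_padding_length` fixes the problem: it guarantees the character dimension is always at least as wide as the largest filter, regardless of the tokens in the batch.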