Closed: mrgonext closed this 6 years ago
I had the same issue, but in my case it was because the dataset was tab-separated. The data processor in the code expects a space to be the field separator. If this applies to you, in model/data_utils.py, in the overridden __iter__ method,
change line
ls = line.split(' ')
to
ls = line.split('\t')
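If you are not sure which separator your copy of the data uses, a more tolerant option is to call `split()` with no argument, which splits on any run of spaces or tabs. This is only a minimal sketch: `parse_line` is a hypothetical helper, not a function from the repo, and it assumes the word is the first field and the tag is the last field on each line.

```python
def parse_line(line):
    """Return (word, tag) for one CoNLL-style line, or None for lines to skip.

    Assumption: the word is the first field and the tag is the last one.
    """
    line = line.strip()
    # Blank lines separate sentences; -DOCSTART- lines mark document boundaries.
    if not line or line.startswith("-DOCSTART-"):
        return None
    ls = line.split()  # no argument: splits on any run of spaces or tabs
    return ls[0], ls[-1]


# The same record parses identically with either separator.
print(parse_line("EU NNP I-NP I-ORG"))
print(parse_line("EU\tNNP\tI-NP\tI-ORG"))
```

The trade-off is that `split()` also collapses repeated separators, so it would not preserve empty fields if your format relies on them.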
Thank you for your help. I've figured out the issue: my eng.testa file was bad because I had downloaded it the wrong way. I downloaded it again and it worked. Anyway, thank you.
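For anyone who hits the same thing: one way a download can go wrong is saving the GitHub web page for a file instead of its raw contents, which would explain HTML fragments ending up in tags.txt. A quick heuristic check is sketched below. `looks_like_html` is a hypothetical helper and the marker list is an assumption, not an exhaustive test.

```python
def looks_like_html(lines, max_lines=50):
    """Heuristically report whether the first lines of a file look like HTML.

    `lines` is any iterable of strings, for example an open text file.
    """
    # Assumed markers; a real CoNLL data file should contain none of these.
    markers = ("<!doctype html", "<html", "<td", "</td>", "<div")
    for i, line in enumerate(lines):
        if i >= max_lines:
            break
        lowered = line.lower()
        if any(marker in lowered for marker in markers):
            return True
    return False


# Usage (path taken from the config in this issue):
# with open("data/coNLL/eng/eng.testa", encoding="utf-8", errors="replace") as f:
#     print(looks_like_html(f))
```

If this returns True for any of the three data files, download that file again from its raw view before rebuilding the vocab.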
Hi. First, thank you for sharing the code and instructions. I'm trying to run it with CoNLL-2003, which I downloaded from https://github.com/synalp/NER/tree/master/corpus/CoNLL-2003, and I changed the config:
filename_dev = "data/coNLL/eng/eng.testa"
filename_test = "data/coNLL/eng/eng.testb"
filename_train = "data/coNLL/eng/eng.train"
Building the data looks fine:
python build_data.py
Building vocab...
Then I ran training, but got this exception:
python train.py
WARNING:tensorflow:From /SourceCode/keras/sequence_tagging/env/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
From /SourceCode/keras/sequence_tagging/env/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
/SourceCode/keras/sequence_tagging/env/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Initializing tf session
2018-07-14 11:34:02.373966: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Epoch 1 out of 15
Traceback (most recent call last):
  File "train.py", line 26, in <module>
    main()
  File "train.py", line 23, in main
    model.train(train, dev)
  File "/SourceCode/keras/sequence_tagging/model/base_model.py", line 121, in train
    score = self.run_epoch(train, dev, epoch)
  File "/SourceCode/keras/sequence_tagging/model/ner_model.py", line 278, in run_epoch
    nbatches = (len(train) + batch_size - 1) // batch_size
  File "/SourceCode/keras/sequence_tagging/model/data_utils.py", line 88, in __len__
    for _ in self:
  File "/SourceCode/keras/sequence_tagging/model/data_utils.py", line 79, in __iter__
    tag = self.processing_tag(tag)
  File "/SourceCode/keras/sequence_tagging/model/data_utils.py", line 274, in f
    raise Exception("Unknow key is not allowed. Check that "\
Exception: Unknow key is not allowed. Check that your vocab (tags?) is correct
When I open tags.txt, it looks strange:
data-line-number="30666"></td>
data-line-number="424"></td>
data-line-number="30258"></td>
data-line-number="28747"></td>
data-line-number="46862"></td>
data-line-number="50137"></td>
data-line-number="26256"></td>
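Entries like these can be caught before training by checking the tag vocabulary against the shape NER tags are expected to have. The sketch below is only an illustration: `suspicious_tags` and `TAG_PATTERN` are hypothetical names, and the pattern assumes tags are either `O` or a `B-`/`I-` prefix followed by an uppercase entity type.

```python
import re

# Assumed tag shape: "O", or B-/I- followed by an uppercase entity type
# such as PER, LOC, ORG or MISC. Adjust if your tag scheme differs.
TAG_PATTERN = re.compile(r"^(O|[BI]-[A-Z]+)$")


def suspicious_tags(tags):
    """Return the sorted, de-duplicated tags that do not match TAG_PATTERN."""
    cleaned = {tag.strip() for tag in tags}
    return sorted(tag for tag in cleaned if not TAG_PATTERN.match(tag))


# Usage, assuming tags.txt holds one tag per line:
# with open("tags.txt", encoding="utf-8") as f:
#     print(suspicious_tags(f))
```

An empty result means every tag has the expected shape. Anything else points at lines in the data files that were not parsed as word/tag pairs.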
I've attached tags.txt here: tags.txt
Could you help point me to how to resolve this issue? I'm not sure if I'm missing something.
Thank you.
My environment details: macOS High Sierra 10.13.1, Python 3.6, TensorFlow 1.7.0