UKPLab / emnlp2017-bilstm-cnn-crf

BiLSTM-CNN-CRF architecture for sequence tagging
Apache License 2.0

TypeError: unsupported operand type(s) for *: 'IndexedSlices' and 'int' #1

SeekPoint closed this issue 6 years ago

SeekPoint commented 7 years ago

python Train_NER_German.py
Using TensorFlow backend.
Generate new embeddings files for a dataset: pkl/GermEval_2014_tudarmstadt_german_50mincount.vocab.pkl
Read file: 2014_tudarmstadt_german_50mincount.vocab.gz
Added words: 81
Unknown-Tokens: 3.73%
Unknown-Tokens: 3.89%
Unknown-Tokens: 3.73%
DONE - Embeddings file saved: pkl/GermEval_2014_tudarmstadt_german_50mincount.vocab.pkl
Dataset: GermEval
['NER_IOBES', 'NER_IOB', 'NER_BIO', 'tokens', 'casing', 'characters', 'NER_class']
Label key: NER_BIO
Train Sentences: 24000
Dev Sentences: 2200
Test Sentences: 5100
BiLSTM model initialized with parameters: {'clipnorm': 1, 'optimizer': 'nadam', 'dropout': [0.25, 0.25], 'miniBatchSize': 32, 'earlyStopping': 5, 'addFeatureDimensions': 10, 'charFilterLength': 3, 'charLSTMSize': 25, 'charFilterSize': 30, 'classifier': 'CRF', 'clipvalue': 0, 'charEmbeddingsSize': 30, 'charEmbeddings': 'CNN', 'LSTM-Size': [100, 75]}
24000 train sentences
2200 dev sentences
5100 test sentences
--------- Epoch 1 -----------


Layer (type)                     Output Shape           Param #     Connected to
token_emd (Embedding)            (None, None, 100)      64854500
casing_emd (Embedding)           (None, None, 8)        64
char_emd (TimeDistributed)       (None, None, 51, 30)   2850
char_cnn (TimeDistributed)       (None, None, 51, 30)   2730
char_pooling (TimeDistributed)   (None, None, 30)       0
varLSTM_1 (Bidirectional)        (None, None, 200)      191200      merge_1[0][0]
varLSTM_2 (Bidirectional)        (None, None, 150)      165600      varLSTM_1[0][0]
hidden_layer (TimeDistributed)   (None, None, 25)       3775        varLSTM_2[0][0]
chaincrf_1 (ChainCRF)            (None, None, 25)       675         hidden_layer[0][0]

Total params: 65,221,394
Trainable params: 366,830
Non-trainable params: 64,854,564


Traceback (most recent call last):
  File "Train_NER_German.py", line 86, in <module>
    model.evaluate(50)
  File "/Users/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py", line 391, in evaluate
    self.trainModel()
  File "/Users/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py", line 107, in trainModel
    self.model.train_on_batch(nnInput, labels)
  File "/Library/Python/2.7/site-packages/keras/models.py", line 766, in train_on_batch
    class_weight=class_weight)
  File "/Library/Python/2.7/site-packages/keras/engine/training.py", line 1319, in train_on_batch
    self._make_train_function()
  File "/Library/Python/2.7/site-packages/keras/engine/training.py", line 760, in _make_train_function
    self.total_loss)
  File "/Library/Python/2.7/site-packages/keras/optimizers.py", line 562, in get_updates
    grads = self.get_gradients(loss, params)
  File "/Library/Python/2.7/site-packages/keras/optimizers.py", line 85, in get_gradients
    grads = [clip_norm(g, self.clipnorm, norm) for g in grads]
  File "/Library/Python/2.7/site-packages/keras/optimizers.py", line 14, in clip_norm
    g = K.switch(n >= c, g * c / n, g)
TypeError: unsupported operand type(s) for *: 'IndexedSlices' and 'int'
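The last frame of the traceback fails on the expression `g * c / n` inside `clip_norm`, which suggests that one gradient `g` arrived as an `IndexedSlices` object (TensorFlow's sparse gradient representation) rather than a dense tensor. The sketch below only mimics that situation in plain Python. The `IndexedSlices` class and the `scale` helper here are hypothetical stand-ins, not the TensorFlow or Keras code, and they assume the failure is simply that the type defines no `*` operator.

```python
class IndexedSlices(object):
    """Hypothetical stand-in for a sparse gradient; defines no __mul__."""

    def __init__(self, values, indices):
        self.values = values
        self.indices = indices


def scale(g, c, n):
    # Same arithmetic as the failing expression in the traceback: g * c / n
    return g * c / n


message = None
try:
    # c is the int clipping threshold (the log shows 'clipnorm': 1)
    scale(IndexedSlices([0.5], [3]), 1, 2.0)
except TypeError as err:
    message = str(err)

print(message)
```

With a dense value such as `scale(0.5, 1, 2.0)` the same helper returns normally, so the error depends on the type of the gradient, not on the clipping threshold.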

nreimers commented 7 years ago

Hi, I think the issue is with the 'charEmbeddings' setting and the selection of CNN. In my experiments, TensorFlow couldn't handle this correctly and throws this exception.

Solutions: either disable the character-based representations (charEmbeddings: None) or use Theano instead of TensorFlow as the backend for Keras.
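The two workarounds could look roughly like the sketch below. The parameter keys and values are copied from the "BiLSTM model initialized with parameters" line in the log above. How the dictionary reaches the model is not shown in this thread, so that step is left out. Setting the `KERAS_BACKEND` environment variable is the usual way to select a backend in multi-backend Keras, and it is assumed here that the installed Keras version honors it and that Theano is installed.

```python
import os

# Option B: select Theano as the Keras backend.
# This must run before `import keras` anywhere in the process.
os.environ["KERAS_BACKEND"] = "theano"

# Parameters as printed in the training log above.
params = {
    'clipnorm': 1, 'optimizer': 'nadam', 'dropout': [0.25, 0.25],
    'miniBatchSize': 32, 'earlyStopping': 5, 'addFeatureDimensions': 10,
    'charFilterLength': 3, 'charLSTMSize': 25, 'charFilterSize': 30,
    'classifier': 'CRF', 'clipvalue': 0, 'charEmbeddingsSize': 30,
    'charEmbeddings': 'CNN', 'LSTM-Size': [100, 75],
}

# Option A: disable the character-based representations.
params['charEmbeddings'] = None

print(os.environ["KERAS_BACKEND"], params['charEmbeddings'])
```

According to the reply, either option on its own should be enough. Option A gives up the character-level features, while Option B keeps them but changes the backend.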