guillaumegenthial / tf_ner

Simple and efficient TensorFlow implementations of NER models with tf.estimator and tf.data
Apache License 2.0
923 stars 275 forks

InvalidArgument `labels` contains negative values #8

Closed dutkaD closed 6 years ago

dutkaD commented 6 years ago

What could the problem be? Has anyone run into something like this?

Saving checkpoints for 0 into results/model/model.ckpt.

Traceback (most recent call last): ...

InvalidArgumentError (see above for traceback): assertion failed: [`labels` contains negative values] [Condition x >= 0 did not hold element-wise:] [x (Reshape_5:0) = ] [8 12 -1...]
     [[{{node confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert}} = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT64], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/Switch, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/data_0, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/data_1, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/data_2, confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/Assert/Switch_1)]]
guillaumegenthial commented 6 years ago

@dutkaD you probably have an unknown label in your data! (i.e. your vocab file is missing one of the labels)
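A quick way to check for that (just a sketch; the file names vocab.tags.txt and train.tags.txt are assumptions based on this repo's data layout):

```python
# Sketch: find tags that appear in the training data but are absent
# from the tag vocab. File names are assumed, adjust to your setup.
vocab_tags = set(open('vocab.tags.txt').read().split())
data_tags = set(open('train.tags.txt').read().split())
missing = data_tags - vocab_tags
print('Tags in data but not in vocab:', missing if missing else 'none')
```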

lizzy2626 commented 5 years ago

Hi @guillaumegenthial: my evaluation runs only on the training data. With your code, all words that are in the vocab file but not in GloVe become zero vectors, so I don't think the case you describe (the vocab file missing one of the labels) can happen here. Do you have any ideas?

ghost commented 5 years ago

I had the same error. Turns out I was using '0' (the digit zero) instead of 'O' (the capital letter) in the tags.
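If anyone wants to scan their tags file for that mixup, something like this works (rough sketch; train.tags.txt is an assumed file name):

```python
# Sketch: flag lines where the digit '0' appears as a tag instead of
# the letter 'O'. The file name is an assumption.
with open('train.tags.txt') as f:
    for lineno, line in enumerate(f, 1):
        if '0' in line.split():
            print(f"line {lineno}: digit '0' used as a tag (did you mean 'O'?)")
```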

lizzy2626 commented 5 years ago

> I had the same error. Turns out I was using '0' instead of 'O' in the tags.

Thanks @apohl1111, I found that too.

rashibudati commented 5 years ago

@lizzy2626, I have the same error. I have checked the vocab tags file and I have not mistyped anything, so I don't understand what the problem is. Could you help me fix this bug?

aman31kmr commented 4 years ago

@rashibudati Earlier I thought we needed to split the data into train, testa, and testb at 80%, 10%, and 10% respectively. While doing that I was getting the error mentioned above. When I copied and pasted everything into all of the files, it worked. This code doesn't perform better than the sequence_tagging repository: the metric always shows the same number irrespective of the size of the dataset we take.

dhgoratela commented 4 years ago

Hi @rashibudati, to share an update on this issue: I am using this solution for something other than NER. I found that this issue arises because of line 45 of main.py. To pad tokens so that samples of uneven length line up, @guillaumegenthial uses the label "O", which means padded tokens are tagged as "Other". If your data does not contain an "O" tag, the build_vocab.py script will not put "O" into vocab.tags.txt, so at execution time the "O" tags used for padding are treated as foreign tags. This mismatch generates the error. The author's answer about missing tags is correct. To fix this, manually add a capital "O" tag on a new line of your vocab.tags.txt after running build_vocab.py.
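For reference, a minimal sketch of that fix (assuming vocab.tags.txt is in the working directory; the membership check makes it safe to re-run):

```python
# Sketch: append the padding tag 'O' to vocab.tags.txt if it is missing.
tags = open('vocab.tags.txt').read().splitlines()
if 'O' not in tags:
    with open('vocab.tags.txt', 'a') as f:
        f.write('O\n')
```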

wentinghome commented 4 years ago

> To fix this, manually add a capital "O" tag on a new line of your vocab.tags.txt after running build_vocab.py.

Solved my problem, thanks.