Closed WenmuZhou closed 7 years ago
"All labels must be nonnegative integers, batch: 0 labels: 12,4,14,0,-1,-1,10" means that your lookup table isn't able to find a proper symbol to decode (the -1). The problem comes from this line, since the current implementation of string_split doesn't take utf8 format into account. In the Symbols list, the symbol ° has b'\xc2\xb0' utf8-encoding and this is treated as 2 symbols when split with string_split (and then your lookup table doesn't have any entry for °). I forgot to update it on Github but the solution is simply to remove ° from the list of symbols and use only characters that don't have fancy encodings in 'utf8' format.
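To make the failure mode concrete, here is a minimal pure-Python sketch that mimics the behavior described above: a byte-level split followed by a table lookup with a default of -1. It does not call TensorFlow; the function names and the sample alphabet are assumptions chosen for illustration only.

```python
def bytewise_codes(label, alphabet):
    """Mimic a byte-level split plus a lookup that returns -1 on a miss."""
    # Keys are the full UTF-8 encoding of each symbol (2 bytes for '°').
    table = {ch.encode("utf-8"): i for i, ch in enumerate(alphabet)}
    raw = label.encode("utf-8")
    # Split into single bytes, so multi-byte symbols are cut in pieces.
    pieces = [raw[i:i + 1] for i in range(len(raw))]
    return [table.get(piece, -1) for piece in pieces]


def charwise_codes(label, alphabet):
    """Same lookup, but splitting on Unicode characters instead of bytes."""
    table = {ch: i for i, ch in enumerate(alphabet)}
    return [table.get(ch, -1) for ch in label]


alphabet = "0123456789°"  # hypothetical symbol list
print(bytewise_codes("25°", alphabet))  # [2, 5, -1, -1]
print(charwise_codes("25°", alphabet))  # [2, 5, 10]
```

The byte-level version produces two -1 entries for the single ° character, which matches the pattern of -1 values in the error message.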
When I use Chinese characters to train the model, the error is "All labels must be nonnegative integers, batch: 0 labels: -1,-1,-1,-1,-1,-1,-1", so the cause is this same function. How can I fix this problem so that I can train with Chinese characters?
I have tried adding a $ after each character in the labels except the last one, e.g. ($=$/$-, and changing the string_split line to this:
    splited = tf.string_split(labels, delimiter='$')
but I still get an error. It looks like the string_split function does not work:
Invalid argument: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 0 num_classes: 16 labels: 13
2017-11-10 22:15:45.035575: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 0 num_classes: 16 labels: 13
[[Node: CTCLoss = CTCLoss[ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](deep_bidirectional_lstm/transpose_time_major/_511, str2code_conversion/StringSplit, str2code_conversion/hash_table_Lookup, Cast_3/_589)]]
Traceback (most recent call last):
File "train.py", line 99, in <module>
image_summaries=True))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 241, in train
loss = self._train_model(input_fn=input_fn, hooks=hooks)
│ │ └ []
│ └ <function data_loader.<locals>.input_fn at 0x7f6df54bdf28>
└ <tensorflow.python.estimator.estimator.Estimator object at 0x7f6df54be908>
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/estimator/estimator.py", line 686, in _train_model
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
│ │ │ └ EstimatorSpec(predictions={'words': <tf.Tensor 'code2str_conversion/chars_conversion/cond/Merge:0' shape=(?,) dtype=string>, 'ra...
│ │ └ EstimatorSpec(predictions={'words': <tf.Tensor 'code2str_conversion/chars_conversion/cond/Merge:0' shape=(?,) dtype=string>, 'ra...
│ └ <tensorflow.python.training.monitored_session.MonitoredSession object at 0x7f6df43042e8>
└ None
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 518, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 862, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 818, in run
return self._sess.run(*args, **kwargs)
│ │ └ {'run_metadata': None, 'options': None, 'feed_dict': None}
│ └ ([<tf.Operation 'group_deps' type=NoOp>, <tf.Tensor 'Print:0' shape=() dtype=float32>],)
└ <tensorflow.python.training.monitored_session._CoordinatedSession object at 0x7f6deffbc390>
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 972, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 818, in run
return self._sess.run(*args, **kwargs)
│ │ └ {'run_metadata': , 'options': , 'feed_dict': None, 'fetches': {'caller': [<tf.Operation 'group_deps' type=NoOp>, <tf.Tensor 'Pri...
│ └ ()
└ <tensorflow.python.training.monitored_session._HookedSession object at 0x7f6deffbc2e8>
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
│ │ └ 'Saw a non-null label (index >= num_classes - 1) following a null label, batch: 0 num_classes: 16 labels: 13\n\t [[Node: CTCLoss...
│ └ <tf.Operation 'CTCLoss' type=CTCLoss>
└ name: "CTCLoss"
op: "CTCLoss"
input: "deep_bidirectional_lstm/transpose_time_major"
input: "str2code_conversion/StringSplit"
inp...
tensorflow.python.framework.errors_impl.InvalidArgumentError: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 0 num_classes: 16 labels: 13
[[Node: CTCLoss = CTCLoss[ctc_merge_repeated=true, ignore_longer_outputs_than_inputs=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](deep_bidirectional_lstm/transpose_time_major/_511, str2code_conversion/StringSplit, str2code_conversion/hash_table_Lookup, Cast_3/_589)]]
[[Node: code2str_conversion/chars_conversion/Shape/_531 = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_1842_code2str_conversion/chars_conversion/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Caused by op 'CTCLoss', defined at:
File "train.py", line 99, in <module>
image_summaries=True))
I have solved this problem. When building the dataset, write each label as the indices of its Chinese characters in the alphabet, with the indices separated by a special symbol, for example '$'.
After that, I changed the code
    with tf.name_scope('str2code_conversion'):
        table_str2int = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(keys, values), -1)
        splited = tf.string_split(labels, delimiter='')  # TODO change string split to utf8 split in next tf version
        codes = table_str2int.lookup(splited.values)
        sparse_code_target = tf.SparseTensor(splited.indices, codes, splited.dense_shape)
to
    with tf.name_scope('str2code_conversion'):
        splited = tf.string_split(labels, delimiter='$')  # TODO change string split to utf8 split in next tf version
        sparse_code_target = tf.SparseTensor(splited.indices, tf.cast(tf.string_to_number(splited.values), tf.int32), splited.dense_shape)
Finally, the code works.
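The dataset side of this approach could look roughly like the sketch below, which converts a text label into '$'-separated alphabet indices and back. This is only an illustration under assumptions: the helper names `encode_label` and `decode_label` and the sample alphabet are hypothetical and are not taken from the repository.

```python
def encode_label(text, alphabet, sep="$"):
    """Turn a text label into separator-joined alphabet indices."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    missing = [ch for ch in text if ch not in index]
    if missing:
        # Fail early instead of silently producing a bad label.
        raise ValueError("characters not in alphabet: %r" % missing)
    return sep.join(str(index[ch]) for ch in text)


def decode_label(encoded, alphabet, sep="$"):
    """Inverse of encode_label: indices back to the original text."""
    if not encoded:
        return ""
    return "".join(alphabet[int(tok)] for tok in encoded.split(sep))


alphabet = "中文字符识别"  # hypothetical alphabet
encoded = encode_label("文字识别", alphabet)
print(encoded)                          # 1$2$4$5
print(decode_label(encoded, alphabet))  # 文字识别
```

Because the label file then contains only ASCII digits and the separator, the split no longer depends on how multi-byte characters are encoded.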
Thank you for the reply. Building the dataset according to WenmuZhou's method resolved the previous error, but now I am facing the following error: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 0 num_classes: 80 labels: 40,39,42,32,70
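The message itself says that an index >= num_classes - 1 is treated as a null label, so one thing worth ruling out is an out-of-range index somewhere in the label files. The sketch below is a hypothetical offline check, assuming labels are stored as '$'-separated indices as in the method above; it is not part of the repository and may not be the cause of this particular error.

```python
def check_label_range(encoded_labels, num_classes, sep="$"):
    """Return (row, index) pairs whose index is outside 0..num_classes-2."""
    bad = []
    for row, encoded in enumerate(encoded_labels):
        for tok in encoded.split(sep):
            idx = int(tok)
            # Indices >= num_classes - 1 are reported as null labels
            # by the error message; negative indices are also invalid.
            if idx < 0 or idx >= num_classes - 1:
                bad.append((row, idx))
    return bad


labels = ["40$39$42$32$70", "3$79"]  # second row is a made-up bad example
print(check_label_range(labels, num_classes=80))  # [(1, 79)]
```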
Hi @meaatef , @WenmuZhou , @solivr ,
I need help on this issue as well.
My symbol list includes
Symbols = " .,:;-_=()[]{}/%<>"
I get the same error. Do I need to use escaping characters in the labels file? A sample entry of this in the file looks like the following.
Image10260.jpg;\)nyenpc\>
Image27765.jpg;QZYN\<
I tried @WenmuZhou's fix but it still doesn't work for me. Any insights on this one?
Thanks,
@codecolony Map each of your labels to numbers as WenmuZhou did. It worked for me; now the only problem I'm facing is that the loss is not converging.
@codecolony you don't need escaping, but your labels must contain only letters from the alphabet, and each letter from the alphabet must map to a single number. WenmuZhou's trick is quite neat for that because it handles multi-byte characters.
Alternatively, you can use a different encoding.
Finally, I am using the following code to check the input labels:
    # put this in train.py, after parsing parameters
    # check that the input files only use characters from the alphabet
    params_alphabet = set(parameters.alphabet)
    input_alphabet = set()
    for filename in parameters.csv_files_train + parameters.csv_files_eval:
        with open(filename, encoding='latin1') as file:  # I use latin1 encoding in order to deal with éèç etc
            for line in file:
                input_alphabet.update(line.split(parameters.csv_delimiter, maxsplit=1)[1])
        for sep in '\n\r':
            input_alphabet.discard(sep)
        extra_chars = input_alphabet - params_alphabet
        assert len(extra_chars) == 0, 'Invalid char %s in file %s' % (extra_chars, filename)
    model_params = {
        'Params': parameters,
    }
The Symbols list is
When I run the code I get the error
and the image I used looks like this