Closed anxingle closed 8 years ago
I wrote the code just as your post describes, and it works well if I comment out the CTC functions. I really don't know what's wrong with it.
Thank you @anxingle, I'm very glad that you liked my post. Answering your questions: did you define the seq_len placeholder and the sparse placeholder for y? Could you do that?
Thank you very much. I will do what you told me as soon as I can.
I tried tf.int64.
Could you send me your dataset?
I have pushed the MNIST dataset into the data directory; you can just git clone the repository.
I am really grateful to you.
Why are you trying to use CTC as a cost function? CTC is used when you don't have an alignment between your input and output and/or the output length vary along the samples. So, for one to one relationship (like one image one digit), CTC probably isn't the best solution for you. But, if you intend to use this code in a continuous hand writing recognition, CTC will work better. I'm looking your code and making some changes. As soon as possible I'll give you a feedback, ok?
Thank you for your reply. But in this code I have 28 inputs, so it's a problem of many inputs (maybe later I'll add multiple labels) mapping to one label. My senior implemented a multi-label recognition framework with mxnet warpctc, and he told me it should be the best solution.
So nice!
Yes, but CTC works only for more than one label. I'll show you working code, but I don't think that for this example CTC will outperform the softmax layer.
Got it! I'll change to another dataset!
I made working code and put it on gist. Your major issue was using the sparse placeholder and the sequence length placeholder. The targets required by CTC must not be one-hot encoded; you must provide them as labels and feed the sparse placeholder as a tuple of (indices, values, shape)
(that is generated by sparse_tuple_from); in the case of MNIST, for a batch you will have a target like
y = (
[[0, 0], [1, 0], [2, 0], ..., [batch_size-1, 0]],
[label_1, label_2, label_3, ..., label_batch_size],
[batch_size, 1]
)
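A minimal sketch of such a helper in NumPy (the name sparse_tuple_from follows the comment above; the exact implementation in the gist may differ):

```python
import numpy as np

def sparse_tuple_from(sequences, dtype=np.int32):
    """Build the (indices, values, shape) tuple expected by a
    sparse placeholder from a list of label sequences."""
    indices, values = [], []
    for batch_idx, seq in enumerate(sequences):
        indices.extend((batch_idx, time_idx) for time_idx in range(len(seq)))
        values.extend(seq)
    indices = np.asarray(indices, dtype=np.int64)
    values = np.asarray(values, dtype=dtype)
    # Dense shape: (batch_size, length of the longest sequence).
    shape = np.asarray(
        [len(sequences), max(len(seq) for seq in sequences)], dtype=np.int64)
    return indices, values, shape

# For single-digit MNIST labels, every sequence has length 1:
indices, values, shape = sparse_tuple_from([[7], [2], [1]])
# indices → [[0, 0], [1, 0], [2, 0]], values → [7, 2, 1], shape → [3, 1]
```

For the single-digit case this reproduces exactly the (indices, values, shape) layout shown above.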
And the seq_len placeholder tells the run what the length of each sample in the batch is; for MNIST, the network was fed with 28 inputs of length 28, so:
seq_len = [28 for _ in xrange(batch_size)]
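In other words, each 28×28 image is treated as 28 timesteps of 28 features, so every sample in the batch has the same sequence length. A quick sketch of that, assuming the batch is shaped (batch_size, timesteps, features):

```python
import numpy as np

batch_size = 4
# A batch of MNIST images: each row of an image is one timestep.
batch = np.zeros((batch_size, 28, 28), dtype=np.float32)

# One entry per sample, all equal to the number of timesteps (28).
seq_len = [batch.shape[1] for _ in range(batch_size)]
# seq_len → [28, 28, 28, 28]
```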
I hope I could help you. If you have any questions, I'll be happy to answer them.
You can use this dataset, whose images have more than one digit and the number of digits differ from image to image. CTC may work better with this dataset.
I don't even know how to express my appreciation! Thanks a lot.
You're welcome. If you have any questions, please feel free to ask.
Hi, igormq. It is very helpful to see your blog post about CTC on TensorFlow. Thank you a million. But I have some confusion about the CTC module.