Closed (blackyang closed this issue 7 years ago)
Do you have a reference for what you are trying to implement? As well as your attempt so far.
Hi @fchollet, the original CTC paper by Alex Graves can be found here.
Basically, CTC is a loss function designed to handle alignment. For example, in speech recognition, if the input sequence has length t (so the RNN output also has length t), the target sequence usually has a length w smaller than t. CTC removes the need for pre-segmentation of the inputs and post-segmentation of the network outputs.
I was trying to add a new cost function in objectives.py, based on this ctc.py file. The model compiles, but I get several errors when I call model.fit(). I guess the reason lies in these lines, which imply that the two arguments to the cost function must have the same shape. Correct me if I have misunderstood anything.
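To make the alignment idea concrete, here is a small sketch of the many-to-one mapping CTC uses to turn a frame-level path of length t into a shorter label sequence: merge consecutive repeats, then drop the blank symbol. The function name `ctc_collapse` and the use of "-" as the blank are assumptions for illustration only, not part of any library.

```python
from itertools import groupby

BLANK = "-"  # assumed blank symbol, chosen only for this illustration


def ctc_collapse(path, blank=BLANK):
    """Collapse a frame-level path: merge consecutive repeats, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(symbol for symbol in merged if symbol != blank)


# A blank between the two "l" frames keeps the double letter.
print(ctc_collapse("hh-e-ll-lo-"))
# Without a separating blank, the repeated "l" is merged into one.
print(ctc_collapse("hhello"))
```

Many different paths collapse to the same label sequence, and the CTC loss sums the probability of all of them, which is why no pre-segmentation is needed.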
@amaas implemented the CTC loss in a very straightforward way that is strictly faithful to the original paper.
It should be relatively straightforward to port our CTC implementation into the Keras framework. Note that our fast version is written in Cython (which doesn't seem to be used elsewhere in Keras). Without Cython, the loops that compute the alignments required to evaluate the CTC loss were painfully slow.
@amaas: do you have a Theano implementation? Or can your fast version work with Theano?
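For readers wondering which loops are meant, below is a minimal NumPy sketch of the CTC forward recursion as described in the Graves paper. It is not @amaas's code: the function name `ctc_neg_log_likelihood` is made up for this example, it handles a single sequence, and it works with plain probabilities, whereas a practical implementation would use log space or rescaling to avoid underflow.

```python
import numpy as np


def ctc_neg_log_likelihood(probs, labels, blank=0):
    """Negative log-likelihood of `labels` under CTC for one sequence.

    probs:  (T, C) array of per-frame class probabilities (rows sum to 1).
    labels: list of integer class ids, without blanks.
    """
    T = probs.shape[0]
    # Extended label sequence: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S = len(ext)

    alpha = np.zeros((T, S))
    # A valid path starts at the first blank or the first label.
    alpha[0, 0] = probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]

    for t in range(1, T):
        for s in range(S):
            total = alpha[t - 1, s]              # stay on the same symbol
            if s >= 1:
                total += alpha[t - 1, s - 1]     # advance by one symbol
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[t - 1, s - 2]
            alpha[t, s] = total * probs[t, ext[s]]

    # A valid path ends at the last label or the trailing blank.
    likelihood = alpha[T - 1, S - 1]
    if S > 1:
        likelihood += alpha[T - 1, S - 2]
    return -np.log(likelihood)


# Two frames, two classes (blank=0, label=1), uniform outputs.
uniform = np.full((2, 2), 0.5)
print(ctc_neg_log_likelihood(uniform, [1]))
```

The nested Python loops over time steps and extended labels are what make a pure-Python version slow, which is consistent with the move to Cython mentioned above.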
@jedi00 No, we wrote our RNNs from scratch without Theano. If you want to replace the NN architecture though you could take just our CTC loss and make it a Theano function. It only needs to interact with the final layer so it should be mostly unchanged in a Theano implementation.
Hi @blackyang, did you integrate Lasagne's CTC into Keras? If you did, could you tell me how you did it?
Keras' loss objects are all functions defined in objectives.py, and they seem to be called from the compile() function in models.py. Each one is wrapped with the weighted_objective() function, which calls the loss function with only two parameters, y_true and y_pred. However, Lasagne's CTC is a class, and its apply() function seems to require 4 parameters. I'm stuck here.
Thank you.
Hi @jinserk, I was stuck at the same place, so I used Lasagne, which I think is more extensible. By the way, I recommend amaas's implementation instead of Lasagne's CTC, since the latter is somewhat problematic.
I tried it too; unfortunately, it failed...
The following paper trained a convolutional bidirectional LSTM network to recognize natural scene text without text line segmentation. The open source code implements CTC in C++ for the Torch7 framework in Lua. The C++ code can be modified for use from Python.
[1] B. Shi, X. Bai, C. Yao. An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition. CoRR abs/1507.05717, 2015.
Baidu just released their open source CPU and GPU implementation of CTC here: https://github.com/baidu-research/warp-ctc
It is released as a C-library and bindings for Torch. The C library should be easy to integrate into many different projects.
@ekelsen thanks for the pointer!
Here is an implementation of Theano bindings for Baidu's warp-ctc: https://github.com/sherjilozair/ctc
Is there any plan for Keras to bind this?
And from TensorFlow... https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/ctc
You guys have any luck with this implementation?
I maintain a repository of CTC with various implementations, including Cython, numba/Python and Theano versions; check here: https://github.com/daweileng/Precise-CTC. You can use either the CTC_precise or the CTC_for_train class; both are fine for RNN training.
The CTC objective is different from the current objective functions in Keras, and requires a different masking mechanism. I also maintain a repository of Keras MOD with CTC incorporated; check here: https://github.com/daweileng/keras_MOD. Currently, only train_on_batch() is modified to be compatible with CTC. This is enough for me, so there's no definite plan to modify other parts of Keras.
Oh this is perfect and exactly what I am interested in. Thank you!
Just to let you know, there is a discussion about versions that wrap Baidu's implementation, which could be faster:
https://github.com/Theano/Theano/issues/3871#issuecomment-207536539
There are currently two wrappers, at:
https://github.com/mcf06/theano_ctc
and
https://github.com/sherjilozair/ctc
@daweileng Do you have any instructions/examples as to how to use your Keras MOD?
In the repository https://github.com/daweileng/Precise-CTC there is a folder named 'Test', where you can find a demo script, 'mnist_ctc_v4.py'.
@daweileng In mnist_ctc_v4.py you import from NN_auxiliary and from mytheano_utils. Can you share these too? Maybe in Keras MOD?
For those interested: I updated my CTC-integrated Keras fork to base version 1.0.4; check here: https://github.com/daweileng/keras_MOD/tree/MOD_1.0.4. So far, the following train/test functions work well with the CTC cost:
@daweileng Sadly you did not fork the Keras repository. Instead you just copied the files over and added everything (including your patches) in one commit. Can you do that properly (e.g., press the fork button on github, clone, add your changes, commit separately, push) so your patches become actually visible? That'd be awesome.
@githubnemo As explained in the README, I didn't make a pull request because, to avoid a massive modification of Keras' masking mechanism, I currently override Keras' sample_weights and masks variables. In theory this should not cause problems for other networks, but I'm not 100% sure about this. Besides, the modification of the fit() function is not done yet. I'd like to collect enough feedback before making an official pull request to the Keras master branch.
If you just want to know what has changed, you can compare the contents of the two repositories.
Progress: Now FCN can work with LSTM + CTC!
See also #3436
@harikrishnavydana The ocr example runs fine for me on both Theano and Tensorflow.
If it is not working for you, please review the issue guidelines (update Keras) and, if the issue persists, open a new issue.
Thank you, I was using an older version of Keras @patyork
Hi, thanks @patyork. Just to understand: you are rendering text into images, and using some of these to train and others to validate? But you are using full words to train, not characters. Pycairo is a complicated library to install :)
I am trying to use Keras CTC with a bidirectional LSTM, i.e. https://github.com/lvapeab/ABiViRNet. The network is here: https://pastebin.com/9QXbJSwE
A Keras loss function takes 2 arguments, but ctc_batch_cost takes 4. Can somebody tell me how to handle this?
Apparently, there is a ctc_loss implementation in Keras. There's an open issue on Keras' ctc_batch_cost in the tensorflow_backend.
Hello... Do we already have a CTC sample in the Keras repository?
You mean this one? If so, yeah, I know there is already a sample. It's just that when I search for "Keras CTC" on Google, this issue comes up, and I thought it would be nice to let people know that such an implementation already exists.
Great
Is it OK to use ctc_batch_cost as a Keras loss function and pass it to model.compile? All the losses that are implemented in
https://github.com/keras-team/keras/blob/master/keras/losses.py
take only one sample. Is it efficient?
Is there any plan to integrate WarpCTC to Keras?
Could you please tell me what input_length and label_length specify? According to the documentation, label_length seems to contain the lengths of the ground truth strings (in the case of OCR). But I'm not sure what input_length means.
I think input_length refers to your sequence length and label_length refers to the ground truth label length.
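As a rough illustration of the two arguments, the sketch below builds them for a made-up batch of two samples. The padding value, the label ids and the per-sample time step counts are all hypothetical; the (samples, 1) shape follows what the ctc_batch_cost documentation describes for both tensors.

```python
import numpy as np

PAD = -1  # hypothetical padding value for labels in this example

# Two ground truth label sequences, padded to a common width of 5.
labels = np.array([[3, 1, 4, PAD, PAD],
                   [2, 7, 1, 8, 2]])

# Hypothetical number of valid (unpadded) time steps in the network
# output for each sample, i.e. the length of each row of y_pred.
valid_steps = [20, 32]

# label_length: how many real labels each ground truth row holds.
label_length = (labels != PAD).sum(axis=1, keepdims=True)  # shape (2, 1)
# input_length: how many time steps of the prediction to consider.
input_length = np.array(valid_steps).reshape(-1, 1)        # shape (2, 1)

# CTC cannot emit more labels than there are time steps.
assert np.all(input_length >= label_length)

print(label_length.ravel(), input_length.ravel())
```

So input_length is measured along the time axis of the network output, while label_length is measured along the ground truth sequence.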
Hi there,
Has anyone implemented a Connectionist Temporal Classification (CTC) loss with Keras?
I attempted to add such a cost function to the objectives.py file, based on rakeshvar's code. The model compiles, but I get several errors when I call model.fit(). I am new to Theano, so it's really tough for me to debug...
It shouldn't be hard in theory, so I guess I made some "naive" mistakes...