Bartzi / see

Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"
GNU General Public License v3.0

ctc loss #102

Closed: wdon021 closed this issue 3 years ago

wdon021 commented 3 years ago

Hi @Bartzi, I see the repository has a ctc_metrics.py, and in the Recognition map there is a comment:

go 2x num_labels plus 1 timesteps because of ctc loss

However, in train_fsns.py I don't see CTC loss being applied anywhere in the metrics. Could you please help me understand this?

thank you very much.

Bartzi commented 3 years ago

We experimented with CTC loss, but in the end I think we only used softmax loss. You could configure the code to use the CTC metrics as the loss, and everything will then be trained using CTC loss (this file).

wdon021 commented 3 years ago

thank you @Bartzi for your reply,

I tried to use that CTC loss file, but ran into this error:

x need to be in a Variable List problem

The current input seems to be just a single Variable. Is there a quick way to adapt this CTC loss, or will it require a fair amount of work?

thank you again
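
For readers hitting the same error: it suggests that the loss expects a list with one entry per timestep rather than a single batched Variable, which is how Chainer's `connectionist_temporal_classification` takes its input. The sketch below shows only the shape change, using plain nested lists and a hypothetical helper name (`split_timesteps`); it is not the code from this repository. On a real Chainer Variable, `chainer.functions.separate(x, axis=1)` should give an equivalent per-timestep split, assuming the time axis is axis 1.

```python
def split_timesteps(batch):
    """Turn a [batch][time][classes] nested list into a list with one
    [batch][classes] slice per timestep (hypothetical helper)."""
    num_steps = len(batch[0])
    return [[sample[t] for sample in batch] for t in range(num_steps)]


# Toy scores, made up for illustration: 2 samples, 3 timesteps, 2 classes.
scores = [
    [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]],
    [[0.3, 0.7], [0.6, 0.4], [0.2, 0.8]],
]

per_step = split_timesteps(scores)
print(len(per_step))  # one list entry per timestep
```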

Bartzi commented 3 years ago

My guess is that it should not be that difficult, but I have not used this code in 3 years. I think you will need to adapt the number of labels. In this line the code produces one prediction per character. If you want to use CTC, you'll have to do this 2 * self.num_labels + 1 times instead. You should not pay any attention to the comment in the file; I guess it is legacy :sweat_smile:.
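
One way to read the 2 * self.num_labels + 1 figure, assuming num_labels counts the characters in the target: CTC places a blank before, between, and after the labels, so a target of L characters becomes an extended sequence of 2L + 1 symbols. The minimal sketch below illustrates that in plain Python. The blank index of 0 and the helper names are assumptions made for illustration, not taken from the repository.

```python
BLANK = 0  # assumed blank index; the repository may use a different one


def extend_with_blanks(labels, blank=BLANK):
    """Interleave blanks around every label: length becomes 2 * len(labels) + 1."""
    extended = [blank]
    for label in labels:
        extended.extend([label, blank])
    return extended


def ctc_collapse(path, blank=BLANK):
    """Standard CTC decoding rule: merge repeated symbols, then drop blanks."""
    decoded = []
    previous = None
    for symbol in path:
        if symbol != previous and symbol != blank:
            decoded.append(symbol)
        previous = symbol
    return decoded


labels = [3, 3, 5]                 # a target with a repeated character
path = extend_with_blanks(labels)  # blank-separated path over 2L + 1 steps
print(len(path), ctc_collapse(path))
```

Without the blank between the two 3s, the collapse step would merge them into one character, which is why the network needs more timesteps than there are characters.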

wdon021 commented 3 years ago

@Bartzi Thank you, got it working again with some minor changes.