Closed LEECHOONGHO closed 2 years ago
Hi,
With 99% reliable data, STC may not improve the results greatly over a CTC baseline. You would have to experiment and see. Using low p_0 and p_max values as you suggested makes sense.
We have used Adam optimizer for Handwriting Recognition results reported in the paper. I do not see any issues with using Adam.
Note that STC only handles deletion errors in the labels, while noisy data can contain deletion, insertion, and substitution errors.
If you have a metric to track the confidence of each word in the pseudo label, one option is to keep only the high-confidence words and remove the rest. The result can be treated as a partial label, and an STC model should help here.
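A minimal sketch of that filtering step, assuming per-word confidence scores are available (the `<star>` token, the threshold, and the function name are all illustrative, not part of any STC API):

```python
# Hedged sketch: keep only high-confidence words in a pseudo label and
# mark each dropped span with a wildcard token so STC can treat the
# transcript as a partial label. Scores and threshold are illustrative.

STAR = "<star>"

def to_partial_label(words, confidences, threshold=0.9):
    """Replace low-confidence words with one wildcard per dropped run."""
    out = []
    for word, conf in zip(words, confidences):
        if conf >= threshold:
            out.append(word)
        elif not out or out[-1] != STAR:
            # collapse consecutive dropped words into a single wildcard
            out.append(STAR)
    return out

print(to_partial_label(
    ["the", "black", "cat", "sat"],
    [0.95, 0.40, 0.93, 0.55],
))  # -> ['the', '<star>', 'cat', '<star>']
```

Collapsing runs of dropped words into a single wildcard matches the partial-label view: STC only needs to know that an unknown stretch of tokens may occur there, not how many were removed.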
Another option for dealing with noisy labels is described in https://arxiv.org/abs/2010.15653, where multiple pseudo labels are used per sample. This can also be implemented easily in the GTN framework by modifying the label graph of CTC accordingly.
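The core idea there is that the label graph accepts the union of several candidate transcripts instead of a single one. A real implementation would build this union as a GTN graph and compose it with the emissions; the toy acceptor below (names and structure are my own, for illustration only) just shows the union construction:

```python
# Hedged sketch: a label "graph" that accepts any of several pseudo labels
# for the same utterance, standing in for a union-of-transcripts CTC label
# graph. A tuple-set acceptor replaces the actual automaton machinery.

def make_union_acceptor(pseudo_labels):
    """Return a function accepting any of the candidate token sequences."""
    candidates = {tuple(label) for label in pseudo_labels}
    return lambda seq: tuple(seq) in candidates

accepts = make_union_acceptor([
    ["the", "cat", "sat"],
    ["a", "cat", "sat"],  # alternative hypothesis from another decoding pass
])
print(accepts(["a", "cat", "sat"]))    # -> True
print(accepts(["the", "dog", "sat"]))  # -> False
```

In the GTN version, the union graph is differentiable like any other label graph, so the loss marginalizes over all candidate transcripts rather than committing to one.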
Hope this helps!
Thank you for your advice!
It helped me a lot!
Hello, I'm trying to apply STC for my ASR model training.
Before proceeding with the training, I have a question about the STC training described in the STC paper [1]. If anyone has experimented with the case I describe, please advise me.
Thank you.
[1] Star Temporal Classification: Sequence Classification with Partially Labeled Data.