k2-fsa / k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.
https://k2-fsa.github.io/k2
Apache License 2.0

Training with custom CTC topology (with no blanks) #1222

Open desh2608 opened 1 year ago

desh2608 commented 1 year ago

I am trying to train an icefall model with phones as output units, using a custom topology that resembles the "modified" CTC topology in k2 but without the blank symbol; let's call this no-blank CTC. The idea is that removing blank should avoid the "peaky" behavior that CTC shows and force the phones to be better aligned with the acoustic frames.
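For concreteness, here is a minimal sketch of what such a topology could look like. This is just my illustration, not a k2 API: phones are numbered 1..num_phones, and depending on the k2 version, k2.Fsa.from_str may take acceptor=False or num_aux_labels=1.

import k2

def no_blank_ctc_topo(num_phones: int) -> k2.Fsa:
    # Hypothetical helper (not part of k2): a CTC-style topology with no blank.
    # State p (1 <= p <= num_phones) means "currently inside phone p"; its
    # self-loop consumes repeated frames of p and outputs epsilon, so a phone
    # can span several acoustic frames without any blank state in between.
    final = num_phones + 1
    lines = []
    for p in range(1, num_phones + 1):
        lines.append(f"0 {p} {p} {p} 0")  # enter phone p from the start state
    for p in range(1, num_phones + 1):
        lines.append(f"{p} {p} {p} 0 0")  # stay in phone p, output epsilon
        for q in range(1, num_phones + 1):
            if q != p:
                lines.append(f"{p} {q} {q} {q} 0")  # switch directly to phone q
        lines.append(f"{p} {final} -1 -1 0")  # final arc
    lines.append(f"{final}")
    return k2.arc_sort(k2.Fsa.from_str("\n".join(lines), acceptor=False))

Note that, like the modified topology, this cannot distinguish a repeated phone in the transcript from one long phone, because there is no blank to separate the repeats.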

I created the no-blank CTC topology, converted my texts into phone IDs, and then obtained the graph as follows:

transcript_fsa = k2.linear_fsa(token_ids, self.device)  # one linear FSA per utterance
# Add epsilon self-loops so that composing with a topology whose output
# side contains epsilons works with treat_epsilons_specially=False.
transcript_fsa_with_self_loops = k2.arc_sort(
    k2.add_epsilon_self_loops(transcript_fsa)
)

res = k2.compose(
    self.ctc_topo,  # the no-blank CTC topology
    transcript_fsa_with_self_loops,
    treat_epsilons_specially=False,
)
res = k2.arc_sort(res)

Since I don't have a blank symbol, I created the nnet with only as many outputs as I have phone tokens. However, when I started training with k2.ctc_loss(), the loss was infinite. This is not a training issue, because it happens right at the start, i.e., when computing the validation loss before any training step. This suggests that the problem most likely lies in the arc scores of the composition. Looking around in k2, I found the following: https://github.com/k2-fsa/k2/blob/42e92fdd4097adcfe9937b4d2df7736d227b8e85/k2/python/k2/autograd.py#L796

Why is the first column of the dense FSA always negative infinity? Also, if I want to train with such a topology, are there other changes that may be needed?
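For debugging, here is a minimal sketch of how one could check, outside of training, whether the composed graph itself produces -inf total scores. The names nnet_output and supervision_segments are assumptions standing in for the usual icefall data pipeline:

import torch
import k2

# res is the arc-sorted composed graph from above; nnet_output is the
# (N, T, C) log-softmax output and supervision_segments the usual
# (num_segments, 3) int32 CPU tensor -- both hypothetical names here.
dense_fsa_vec = k2.DenseFsaVec(nnet_output, supervision_segments)
lattice = k2.intersect_dense(res, dense_fsa_vec, output_beam=10.0)
tot_scores = lattice.get_tot_scores(log_semiring=True, use_double_scores=True)
print(torch.isinf(tot_scores))  # True entries pinpoint the bad utterances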

pkufool commented 1 year ago

Why is the first column of the dense FSA always negative infinity?

The first column is reserved for the final arc (label = -1) in a k2 FSA; see

https://github.com/k2-fsa/k2/blob/42e92fdd4097adcfe9937b4d2df7736d227b8e85/k2/python/k2/dense_fsa_vec.py#L24-L41
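Roughly, the layout described there looks like this (an illustration, not k2's actual code): for one supervision segment with T frames and C symbols, the dense FSA stores a (T + 1) x (C + 1) score matrix, with an extra final frame and an extra first column for the final arc.

import torch

T, C = 4, 3
log_probs = torch.randn(T, C).log_softmax(dim=-1)  # nnet output for one segment
scores = torch.full((T + 1, C + 1), float("-inf"))
scores[:T, 1:] = log_probs  # column s + 1 holds the score of symbol s
scores[T, 0] = 0.0          # final arc (label -1) is free on the extra frame

Because column 0 is -inf on every real frame, a path can take the final arc only after the last real frame of the segment.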

GCQiu commented 3 weeks ago

Hi, I am working on this problem now. Do you know how to train a CTC model without blank?

danpovey commented 3 weeks ago

You could maybe fake it by setting the blank logprob to -inf before the log_softmax; it may have the same effect.
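A minimal sketch of that suggestion, assuming the nnet emits raw logits with blank at index 0 (adjust blank_id to your model):

import torch
import torch.nn.functional as F

def no_blank_log_probs(logits: torch.Tensor, blank_id: int = 0) -> torch.Tensor:
    # Force the blank logit to -inf before normalization; log_softmax then
    # assigns blank zero probability on every frame, so CTC paths can never
    # pass through a blank arc.
    mask = torch.zeros_like(logits, dtype=torch.bool)
    mask[..., blank_id] = True
    return F.log_softmax(logits.masked_fill(mask, float("-inf")), dim=-1)

The nice part is that the standard CTC topology and output dimension stay unchanged; only the blank score is masked.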
