Closed. ziyu123 closed this issue 1 year ago.
As a workaround, I force logits[:,2] += -999999.0 when decoding.
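A minimal sketch of that workaround in plain Python, for illustration only. It assumes the unknown token sits at index 2 of the vocabulary dimension (as in the comment above) and that decoding picks the arg-max per frame; the helper names `mask_unk` and `greedy_pick` are hypothetical and not part of icefall.

```python
UNK_ID = 2            # index taken from the comment above; depends on your tokenizer
PENALTY = -999999.0   # large negative value so <unk> can never win the arg-max


def mask_unk(logits, unk_id=UNK_ID, penalty=PENALTY):
    """Return a copy of logits (list of rows) with the penalty added to the unk column."""
    return [
        [v + penalty if j == unk_id else v for j, v in enumerate(row)]
        for row in logits
    ]


def greedy_pick(logits):
    """Pick the highest-scoring token id for each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


logits = [
    [0.1, 0.2, 5.0, 0.3],  # <unk> would win here without masking
    [2.0, 0.1, 1.9, 0.0],
]
picked_raw = greedy_pick(logits)
picked_masked = greedy_pick(mask_unk(logits))
```

Note that this only hides the symptom at decoding time; it does not explain why the model scores the unknown token so highly.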
Does it fix your issue?
If yes, you can modify the code to treat unk_id as blank_id.
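One way to read this suggestion, sketched in plain Python: when collecting the decoded token ids, skip the unknown token exactly as the blank token is skipped. The ids `blank_id=0` and `unk_id=2` and the function name `emit_tokens` are assumptions for illustration, not the actual icefall code.

```python
def emit_tokens(token_ids, blank_id=0, unk_id=2):
    """Collect emitted tokens, treating unk_id the same way as blank_id.

    token_ids: best token id per decoding step (hypothetical input).
    """
    hyp = []
    for y in token_ids:
        if y in (blank_id, unk_id):
            continue  # emit nothing for blank or <unk>
        hyp.append(y)
    return hyp


decoded = emit_tokens([0, 5, 2, 2, 7, 0])
```

Compared with adding a large negative value to the logits, this drops the unknown token from the output without forcing another token into its place.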
Thanks a lot. Setting logits[:,2] solved my problem. Treating unk_id as blank_id should also work, I think.
We are treating unk_id as blank_id:
https://github.com/k2-fsa/icefall/blob/00256a766921dd34a267012b0e2b8ff7d538f0e6/egs/librispeech/ASR/pruned_transducer_stateless2/beam_search.py#L631
https://github.com/k2-fsa/icefall/blob/00256a766921dd34a267012b0e2b8ff7d538f0e6/egs/librispeech/ASR/pruned_transducer_stateless2/beam_search.py#L746
Which decoding method and which file are you using?
Maybe some training utts were all unk due to some encoding issue or language mismatch?
I'm very sorry, it was an error in my training data. Some utts in my training text are lowercase, and the lowercase English letters cause the whole sentence to become all <unk>.
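A quick check like the following can catch this kind of data problem before training. It is a sketch that assumes the tokenizer vocabulary was built from uppercase English text, so the allowed character set is uppercase letters, space, and apostrophe; adjust `ALLOWED` to match your own vocabulary. The function names are hypothetical.

```python
import string

# Assumption: the vocabulary covers only uppercase letters, space and apostrophe.
ALLOWED = set(string.ascii_uppercase + " '")


def find_oov_lines(lines, allowed=ALLOWED):
    """Return (line_index, sorted out-of-vocabulary characters) for each bad line."""
    bad = []
    for i, line in enumerate(lines):
        oov = sorted({c for c in line if c not in allowed})
        if oov:
            bad.append((i, oov))
    return bad


def normalize(line):
    """Upper-case a transcript so it matches the assumed vocabulary."""
    return line.upper()


transcripts = ["HELLO WORLD", "hello world", "IT'S FINE"]
bad_before = find_oov_lines(transcripts)
bad_after = find_oov_lines([normalize(t) for t in transcripts])
```

Any line reported by `find_oov_lines` contains characters the tokenizer would likely map to the unknown token.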
I am training an English zipformer2 model, but some audio decodes to all <unk>, like this:
000086 ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ' ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ' ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇ ⁇
When I concatenate the above audio together, it decodes normally. When I trim the first few seconds of the audio, some decode normally and some do not.
What is the cause of this?