kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.
http://kaldi-asr.org

Assertion failed : You cannot call BestPathEnd if no frames were decoded #4816

Closed Tamaya31 closed 1 year ago

Tamaya31 commented 1 year ago

Hello everyone,

I am new to Kaldi, and I have been trying to get this existing script (https://github.com/lowerquality/gentle/blob/master/ext/k3.cc) from the Gentle repository to work. I get an assertion failure at line 223 when calling GetBestPath, as described in the title of this issue. I test this by first pushing chunks of data (line 185), then calling "get-final" (line 217).

AdvanceDecoding is called after pushing each chunk of audio data, but decoder.NumFramesDecoded() always returns 0.

I've built Kaldi with MKL 2022.2.1 and the latest OpenFst, on Windows 11.

Any idea what could be happening here?

Thanks in advance!

galv commented 1 year ago

I don't know the gentle codebase, but how large is the audio chunk that you are pushing?

Normally, each frame is 10 ms in length. At 16 kHz, that's 160 samples. Are you sending at least 160 samples of data? Depending on the model, if 3x frame downsampling is used, you might need 480 samples instead.

This is just my first hunch. I didn't look at the code.
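The arithmetic above can be sketched in a few lines of C++. This is only an illustration: `SamplesPerFrame` and `SamplesPerDecodedFrame` are hypothetical helper names, not Kaldi functions, and the 10 ms frame and 3x factor are the values quoted in this comment. Kaldi feature configs commonly use a 25 ms analysis window on top of the 10 ms shift, so the very first frame may need somewhat more audio than this; check that against your own config.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Kaldi): number of audio samples
// covered by one frame of `frame_ms` milliseconds at `sample_rate_hz`.
inline std::size_t SamplesPerFrame(std::size_t sample_rate_hz,
                                   std::size_t frame_ms) {
  assert(sample_rate_hz > 0 && frame_ms > 0);
  return sample_rate_hz * frame_ms / 1000;
}

// Hypothetical helper: samples consumed per decoded frame when the model
// only evaluates every `subsampling`-th feature frame (assumed here to be
// what "3x frame downsampling" refers to).
inline std::size_t SamplesPerDecodedFrame(std::size_t sample_rate_hz,
                                          std::size_t frame_ms,
                                          std::size_t subsampling) {
  assert(subsampling > 0);
  return SamplesPerFrame(sample_rate_hz, frame_ms) * subsampling;
}
```

With these definitions, `SamplesPerFrame(16000, 10)` gives 160 and `SamplesPerDecodedFrame(16000, 10, 3)` gives 480, matching the numbers above.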


Tamaya31 commented 1 year ago

I was actually pushing the entire file at once (because the framerate was in Hz, the audio chunk was 160000 samples). Using chunks of 160 samples now works (well, it doesn't output any words/phones, but the number of decoded frames increases, so that is a separate problem unrelated to this issue). Thank you for pointing me in the right direction!
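For anyone hitting the same thing, here is a minimal sketch of splitting a buffer into fixed-size chunks before pushing them. It is not the Gentle code: `ChunkRanges` is a hypothetical helper, and the chunk size of 160 is simply the value that worked here. Each returned pair is an (offset, length) range that could be passed to whatever push call the caller uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Kaldi or Gentle): split a buffer of
// `total_samples` samples into consecutive (offset, length) ranges of at
// most `chunk_samples` samples each. The last range may be shorter.
inline std::vector<std::pair<std::size_t, std::size_t>> ChunkRanges(
    std::size_t total_samples, std::size_t chunk_samples) {
  assert(chunk_samples > 0);
  std::vector<std::pair<std::size_t, std::size_t>> ranges;
  for (std::size_t offset = 0; offset < total_samples;
       offset += chunk_samples) {
    // Clamp the final chunk to the number of samples that remain.
    ranges.emplace_back(offset,
                        std::min(chunk_samples, total_samples - offset));
  }
  return ranges;
}
```

The caller would loop over the ranges, push each slice, and advance the decoder after each push, as the script already does.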