I tried with several -n values. In all cases but one, nothing is output to the console. Only in one try with -n = 10 did I get something transcribed. The first result was in the wrong language; the second time it was the right transcription. I was not able to replicate it, though, i.e. I normally do not get any transcription.
[2022-09-23 20:29:26,345] transcriber._deal_timestamp DEBUG -> Length of consecutive: 0
0.00->2.00 無限に
[2022-09-23 20:29:26,347] transcriber._deal_timestamp DEBUG -> Length of buffer: 0
[2022-09-23 20:29:26,347] transcriber.transcribe DEBUG -> Last rest_start=None
[2022-09-23 20:29:26,349] cli.transcribe_from_mic DEBUG -> Segment: 1
[2022-09-23 20:29:26,353] transcriber.transcribe DEBUG -> seek=0, timestamp=2.0, rest_start=None
[2022-09-23 20:29:32,840] transcriber.transcribe DEBUG -> Result: temperature=0.00, no_speech_prob=0.24, avg_logprob=-0.80
[2022-09-23 20:29:32,840] transcriber._deal_timestamp DEBUG -> Length of consecutive: 0
2.00->4.00 It is okay.
[2022-09-23 20:29:32,840] transcriber._deal_timestamp DEBUG -> Length of buffer: 0
[2022-09-23 20:29:32,840] transcriber.transcribe DEBUG -> Last rest_start=None
[2022-09-23 20:29:32,843] cli.transcribe_from_mic DEBUG -> Segment: 2
[2022-09-23 20:29:32,846] transcriber.transcribe DEBUG -> seek=0, timestamp=4.0, rest_start=None
@fantinuoli Did you set a proper value for --language?
If the language is English, you need to set --language en, like this:
poetry run whisper_streaming --language en --model base -n 20
I added an instruction about that to the README (e9e286d).
I also found a bug related to --language!
I fixed it in 9cd80ab.
pad_or_trim always returns torch.Size([1, 80, 3000]), so shorter input is padded out with silence. While speaking, padding is not expected. With -n 160, the buffered audio already fills torch.Size([1, 80, 3000]) without padding. So an -n of 160 or larger is expected.
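For reference, a minimal sketch using the plain openai-whisper API (not whisper_streaming itself) showing where torch.Size([1, 80, 3000]) comes from: pad_or_trim pads short audio up to 30 s, and the log-mel spectrogram of a 30 s chunk has 80 mel bins and 3000 frames.

```python
# Minimal sketch with the openai-whisper API; only illustrates the fixed input shape.
import numpy as np
import whisper

SAMPLE_RATE = 16_000  # Whisper always works on 16 kHz audio

# Pretend we captured only 5 seconds from the microphone.
short_audio = np.zeros(5 * SAMPLE_RATE, dtype=np.float32)

# pad_or_trim pads (or cuts) to exactly 30 s = 480000 samples,
# so short live input gets silence appended.
padded = whisper.pad_or_trim(short_audio)
print(padded.shape)   # (480000,)

# A 30 s chunk gives a log-mel spectrogram of 80 mel bins x 3000 frames;
# adding a batch dimension yields torch.Size([1, 80, 3000]).
mel = whisper.log_mel_spectrogram(padded).unsqueeze(0)
print(mel.shape)      # torch.Size([1, 80, 3000])
```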
A too-small -n produces no response, while a too-large value consumes memory. Set a proper value for -n, and raise a warning when the value is too small.
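A hypothetical sketch of such a check (the option name num_block and the threshold of 160 are assumptions for illustration, not whisper_streaming's actual API):

```python
# Hypothetical validation for the -n option; names and threshold are assumptions.
import warnings

MIN_NUM_BLOCKS = 160  # blocks needed to fill Whisper's 30 s window without padding


def check_num_blocks(num_block: int) -> None:
    """Warn when -n is too small to ever fill a full 30 s chunk."""
    if num_block < MIN_NUM_BLOCKS:
        warnings.warn(
            f"-n {num_block} is smaller than {MIN_NUM_BLOCKS}; "
            "transcription may never be emitted because the audio buffer "
            "stays shorter than Whisper's 30 s window.",
            stacklevel=2,
        )


check_num_blocks(20)  # emits a UserWarning
```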