Closed (keighrim closed 3 months ago)
According to https://github.com/openai/whisper/discussions/286, disfluencies are not suppressed by default.
The heuristics do not suppress disfluencies by default. It is likely the result of training on data that, presumably, has little to no disfluencies in its transcripts. Disfluencies are in the vocab, so you can try to give it a prompt with the disfluencies you want it to predict.
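A minimal sketch of the prompt suggestion above, assuming the `openai-whisper` Python package and its `initial_prompt` argument to `transcribe`. The filler list, the prompt wording, and the helper names `build_disfluency_prompt` and `transcribe_verbatim` are hypothetical, and whether the prompt actually makes the model keep fillers is exactly what would need testing.

```python
# Hypothetical filler list; adjust to the disfluencies you care about.
DEFAULT_FILLERS = ("um", "uh", "er", "hmm")


def build_disfluency_prompt(fillers=DEFAULT_FILLERS):
    """Join the fillers into a short text prompt (wording is an assumption)."""
    return ", ".join(fillers) + "..."


def transcribe_verbatim(audio_path, model_name="base", fillers=DEFAULT_FILLERS):
    """Transcribe audio while priming the decoder with filler words."""
    # Imported lazily so the prompt helper works without whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(
        audio_path,
        initial_prompt=build_disfluency_prompt(fillers),
    )
    return result["text"]
```

Running `transcribe_verbatim` once with and once without the prompt on the same audio would give a first impression of how much the prompt changes the output.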
Closing as wont-fix.
The model underlying Whisper is known to filter out disfluencies and filler words during decoding. We'd like to investigate how thorough this filtering is and how much it harms literal/verbatim transcription.
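One possible way to quantify the filtering, sketched under the assumption that a hand-made verbatim reference transcript is available for each audio file. The filler list and the function names are hypothetical, and the metric is a simple count-based ratio, not an established measure.

```python
import re
from collections import Counter

# Hypothetical filler list; extend as needed.
FILLERS = ("um", "uh", "er", "hmm")


def count_fillers(text, fillers=FILLERS):
    """Count occurrences of each filler word in a transcript."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t in fillers)


def filler_retention(reference, hypothesis, fillers=FILLERS):
    """Fraction of reference fillers that also appear in the hypothesis.

    Returns None when the reference contains no fillers at all.
    """
    ref = count_fillers(reference, fillers)
    hyp = count_fillers(hypothesis, fillers)
    total = sum(ref.values())
    if total == 0:
        return None
    # Per filler, credit at most as many as the reference contains.
    kept = sum(min(ref[f], hyp[f]) for f in ref)
    return kept / total
```

A retention near 0 would suggest that most fillers are dropped. The ratio ignores word order and position, so it is only a rough first check.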
Elements to look for (not limited to)