Investigate Whisper's behavior on filtering out filler non-words

clamsproject / app-whisper-wrapper

Apache License 2.0


Investigate Whisper's behavior on filtering out filler non-words #4

Closed by keighrim 3 months ago

keighrim commented 1 year ago

The model underlying Whisper is known to filter out disfluencies and filler words during decoding. We'd like to investigate how consistently it filters them and how much that harms literal/verbatim transcription.

Elements to look for

filler words (uh, ah, ...)
"re-starts" of words or sentences
voice over music/noise

(not limited to)
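
One way to start quantifying the first two elements is to compare a hand-made verbatim transcript against Whisper's output. The snippet below is only a rough sketch: the `FILLERS` set and all function names are hypothetical, the comparison is bag-of-words rather than alignment-based, and back-to-back repeated words serve only as a crude proxy for re-starts.

```python
import re
from collections import Counter

# Hypothetical filler inventory; extend it for the data at hand.
FILLERS = {"uh", "um", "ah", "er", "hmm"}


def tokenize(text):
    """Lowercase and split into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def filler_counts(text):
    """Count occurrences of each filler word in a transcript."""
    return Counter(tok for tok in tokenize(text) if tok in FILLERS)


def filler_retention(reference, hypothesis):
    """Fraction of reference fillers that also appear in the hypothesis.

    Counts are compared per filler type, ignoring position, so this is a
    rough estimate rather than an alignment-based score.
    Returns None when the reference contains no fillers.
    """
    ref = filler_counts(reference)
    hyp = filler_counts(hypothesis)
    total = sum(ref.values())
    if total == 0:
        return None
    kept = sum(min(n, hyp[word]) for word, n in ref.items())
    return kept / total


def immediate_repeats(text):
    """Return words repeated back to back, a crude proxy for re-starts."""
    toks = tokenize(text)
    return [a for a, b in zip(toks, toks[1:]) if a == b]
```

Here `reference` would be a manually produced verbatim transcript and `hypothesis` the text returned by Whisper for the same audio. A retention value near 0 would suggest that most fillers are being dropped.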

mrharpo commented 1 year ago

According to https://github.com/openai/whisper/discussions/286, disfluencies are not suppressed by default:

The heuristics do not suppress disfluencies by default. It is likely the result of training on data that, presumably, has little to no disfluencies in its transcripts. Disfluencies are in the vocab, so you can try to give it a prompt with the disfluencies you want it to predict.
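
The prompting idea from that discussion could be tried roughly as follows. This is a sketch, not a verified fix: it assumes the `openai-whisper` package is installed, relies on the `initial_prompt` argument of `transcribe`, and uses a made-up prompt and hypothetical helper names (`build_prompt`, `transcribe_verbatim`). Whether the prompt actually brings disfluencies back would need to be checked on real audio.

```python
# Assumed example prompt; any verbatim-style text containing the
# disfluencies of interest could be used instead.
DISFLUENT_PROMPT = "Umm, let me think, uh... Okay, so, ah, here is what I, I mean."


def build_prompt(fillers):
    """Join filler words into a short prompt string such as 'uh, um, ah...'."""
    return ", ".join(fillers) + "..."


def transcribe_verbatim(audio_path, model_name="base", prompt=DISFLUENT_PROMPT):
    """Transcribe audio while nudging the decoder toward disfluencies.

    The prompt is passed as `initial_prompt`, which conditions decoding on
    the given text; it biases the output but does not force fillers in.
    """
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, initial_prompt=prompt)
    return result["text"]
```

Running `transcribe_verbatim("sample.wav")` with and without a disfluent prompt on the same file (the file name is a placeholder) would give two transcripts to compare for retained fillers.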

keighrim commented 3 months ago

closing as wont-fix.