kaixxx / noScribe

Cutting-edge AI technology for automated audio transcription. A nice GUI for OpenAI's Whisper and pyannote (speaker identification)
GNU General Public License v3.0

Visualise transcription confidence with shaded color highlights to speed up transcript review #87

Open menelic opened 1 month ago

menelic commented 1 month ago

Whisper produces confidence estimates for chunks (i.e. words or phrases) of the transcribed text. Please visualise these in the editor as HTML text highlights, for example in a gradient from white through yellow to dark orange. Such a visual aid would greatly speed up transcript review.
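To make the request concrete, here is a minimal sketch of what such highlighting could look like. It assumes each word arrives as a `(text, confidence)` pair with the confidence scaled to 0..1; the function names, the RGB values of the gradient stops, and the input format are all illustrative assumptions, not part of noScribe or Whisper.

```python
from html import escape

# Gradient stops: dark orange (low confidence) through yellow to white (high).
# The exact RGB values are illustrative assumptions.
STOPS = [
    (0.0, (255, 140, 0)),
    (0.5, (255, 255, 0)),
    (1.0, (255, 255, 255)),
]


def confidence_color(p):
    """Return a hex colour for confidence p, linearly interpolated between stops."""
    p = min(max(p, 0.0), 1.0)  # clamp out-of-range values
    for (p0, c0), (p1, c1) in zip(STOPS, STOPS[1:]):
        if p <= p1:
            t = (p - p0) / (p1 - p0)
            rgb = [round(a + (b - a) * t) for a, b in zip(c0, c1)]
            return "#{:02x}{:02x}{:02x}".format(*rgb)
    return "#ffffff"


def highlight(words):
    """Wrap each (text, confidence) pair in a span with a background colour."""
    return " ".join(
        '<span style="background-color:{}">{}</span>'.format(
            confidence_color(p), escape(text)
        )
        for text, p in words
    )


if __name__ == "__main__":
    # Hypothetical input; real confidences would come from the transcription step.
    print(highlight([("Hello", 0.98), ("wurld", 0.31)]))
```

With these stops, a fully confident word gets a plain white background and so stays visually unmarked, while uncertain words shift towards orange.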

Thanks a lot for this great tool!

kaixxx commented 1 month ago

I did implement this in the very first versions of noScribe, which used MS Word as an editor. I have since moved away from Word and dropped support for these confidence markers in the process. Besides some technical difficulties, the main reason was that the markers weren't very helpful to begin with. Here is an example transcribed with noScribe 0.3. Everything with a confidence level of 3 or lower is marked in red:

[Image: example transcript from noScribe 0.3 with low-confidence passages marked in red]

I found that checking all these small red bits one by one is so much work that we might as well go through the whole file once while listening to the original audio. I've tried to make this as easy as possible in the noScribe editor. You can even speed up the audio if you want. This is also much safer, since sections with a high confidence level might still contain errors.