abb128 / LiveCaptions

Linux Desktop application that provides live captioning
GNU General Public License v3.0
1.15k stars 31 forks

Punctuation model accuracy #48

Open abb128 opened 1 year ago

abb128 commented 1 year ago

Post here any feedback about the new punctuation model, experiences using it, comparisons to the old v0 model, etc

Known issues:

ScottNAtlanta commented 1 year ago

I don't know why, but this seems to want to add the word "dude" at the end of some of my sentences. I am in the US Southeast and do have somewhat of an accent, but I don't say "dude" at the end of my sentences (although it is kinda funny).

zyansheep commented 1 year ago

Would it make sense to re-do byte-pair encoding for the model on new data to figure out what tokens to add?

whitequark commented 1 year ago

> Post here any feedback about the new punctuation model, experiences using it, comparisons to the old v0 model, etc

Note: I have not used the old model, only the new one.

Absolutely fantastic. I am now routinely using this model on calls and in meetings. It's not quite as good as Google's captioning model for YouTube (though I think my comparison is biased in that YouTubers, at least the ones I watch, use professional mics and speak clearly, while people in meetings use garbage-quality laptop mics and talk while vaping, etc.), but it's considerably better than Google Meet's autocaptioning.

The model and the application could obviously be improved (I'll file some issues for the latter), but this is a major lifesaver for someone who occasionally has severe sensory processing issues due to fibromyalgia.

Summertime commented 1 year ago

Repeated words usually end up with the wrong count, e.g. 4 blahs spoken appearing as 5 blahs in the transcript