Open saattrupdan opened 1 year ago
Thanks for the pointer @saattrupdan!
I've checked out punctfix a bit and it indeed works very well! However, it has some shortcomings that I hope we can address by also taking the audio into consideration.
For phrases that are not clearly stated as a question, such as "we are leaving in 5 minutes no", punctfix cannot tell whether a question is intended, simply because one needs to hear the audio to decide. All of the following are valid solutions:
- "We are leaving in 5 minutes! No!"
- "We are leaving in 5 minutes. No."
- "We are leaving in 5 minutes, no?"
For this example punctfix gives:
>>> from punctfix import PunctFixer
>>> model = PunctFixer(language="en")
>>> example_text = "we are leaving in 5 minutes no"
>>> print(model.punctuate(example_text))
We are leaving in 5 minutes No!
which cannot always be correct, since the right punctuation depends on the audio.
Also, I noticed some problems with the apostrophe: https://github.com/danspeech/punctfix/issues/13
I know this project is at an early stage, but I just want to flag an alternative approach to punctuation restoration. It's a package called punctfix, and can be found here (I'm not a contributor to that package). Rather than using Whisper models, they use a NER approach, and it works really well and is super fast.
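To make the "NER approach" a bit more concrete: one common way to frame punctuation restoration is as token classification, where each word gets a label for the punctuation mark that follows it, and the text is rebuilt from those labels. The sketch below only shows the rebuilding step. The label names and the `apply_labels` helper are made up for illustration and are not punctfix's actual API or internals; the labels are written by hand rather than predicted by a model.

```python
# Hypothetical label set: the punctuation mark (if any) that follows each word.
PUNCT = {"O": "", "COMMA": ",", "PERIOD": ".", "QUESTION": "?", "EXCLAMATION": "!"}
SENTENCE_END = {"PERIOD", "QUESTION", "EXCLAMATION"}


def apply_labels(words, labels):
    """Rebuild punctuated text from one punctuation label per word."""
    if len(words) != len(labels):
        raise ValueError("need exactly one label per word")
    out = []
    capitalize = True  # the first word of the text starts a sentence
    for word, label in zip(words, labels):
        if capitalize:
            word = word[:1].upper() + word[1:]
        out.append(word + PUNCT[label])
        # The next word starts a new sentence only after ., ? or !
        capitalize = label in SENTENCE_END
    return " ".join(out)


words = "we are leaving in 5 minutes no".split()
# Two hand-written labelings for the same words (not model output):
as_question = ["O"] * 5 + ["COMMA", "QUESTION"]
as_statement = ["O"] * 5 + ["PERIOD", "PERIOD"]
print(apply_labels(words, as_question))   # We are leaving in 5 minutes, no?
print(apply_labels(words, as_statement))  # We are leaving in 5 minutes. No.
```

Both labelings are valid for the same word sequence, so a classifier that only sees the text has to pick one without knowing how the phrase was spoken.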