ottokart / punctuator2

A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text
http://bark.phon.ioc.ee/punctuator
MIT License
657 stars 195 forks

How to add words to exclude them from punctuation? #55

Closed sangeet2020 closed 4 years ago

sangeet2020 commented 4 years ago

I have succeeded in implementing your segmenter, but how do I exclude some words from segmentation and punctuation? For example, if I say librispeech, I want it to be output as LibriSpeech and nothing else. Is there any way to add a hardcoded list of such words?

sangeet2020 commented 4 years ago

I was wondering if you had a chance to read my query above. @ottokart, any clues on this? Thank you.

ottokart commented 4 years ago

Hi!

librispeech -> LibriSpeech is a capitalization/truecasing problem and punctuator does not do that. To use a hardcoded list you can easily post-process the results with a custom script (e.g., split the output into words and check if they exist in a predefined dictionary and replace if that's the case).
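
A minimal sketch of such a post-processing script, not part of punctuator itself. The names `CANONICAL_FORMS` and `restore_casing` are hypothetical, and the sketch assumes the punctuated output is plain text with whitespace-separated tokens, where punctuation may be attached to a word:

```python
import re

# Hypothetical lookup table: lowercase word -> desired surface form.
CANONICAL_FORMS = {
    "librispeech": "LibriSpeech",
}

# Split a token into leading punctuation, the word itself, and trailing punctuation.
_TOKEN_RE = re.compile(r"^(\W*)(.*?)(\W*)$")


def restore_casing(text, canonical_forms=CANONICAL_FORMS):
    """Replace every word found in canonical_forms with its predefined form."""
    fixed = []
    for token in text.split():
        lead, core, trail = _TOKEN_RE.match(token).groups()
        # Look the word up case-insensitively; keep it unchanged if it is not listed.
        replacement = canonical_forms.get(core.lower(), core)
        fixed.append(lead + replacement + trail)
    return " ".join(fixed)


if __name__ == "__main__":
    print(restore_casing("we trained on librispeech, then evaluated."))
```

Note that this joins tokens with single spaces, so line breaks in the input are not preserved; process the text line by line if they matter.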

sangeet2020 commented 4 years ago

Thanks a lot for your answer. I am closing this issue.