lowerquality / gentle

gentle forced aligner
https://lowerquality.com/gentle/
MIT License

Retaining punctuation #100

Open Liontooth opened 8 years ago

Liontooth commented 8 years ago

Gentle works great! We have a feature request: to retain punctuation, treating each punctuation mark as a token. From this input:

john, are you hungry?

-- we'd like this sort of output, where each punctuation mark inherits the end time of the preceding word:

john,john,6.74,6.92
",",",",6.92,6.92
are,are,6.92,7.21
you,you,7.48,7.970000000000001
hungry,hungry,7.97,8.09
"?","?",8.09,8.09

Is this something anyone has looked at? Did you have a reason for omitting punctuation?

Cheers, David

strob commented 8 years ago

Punctuation is retained in the JSON output (through the character offsets of each word), though it's not a bad idea to do something like you're suggesting in the CSV. I would accept a pull request.
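For anyone picking this up, here is a rough sketch of how the offsets could be used to rebuild punctuation rows. It is not code from Gentle itself. It assumes the JSON has a top-level "transcript" string and a "words" list whose entries carry "word", "alignedWord", "start", "end", "startOffset" and "endOffset"; check those names against your own output. The helper names rows_with_punctuation and to_csv are made up for illustration, and "punctuation" here just means the ASCII marks in Python's string.punctuation.

```python
import csv
import io
import string


def rows_with_punctuation(alignment):
    """Return (word, aligned_word, start, end) rows, one extra row per punctuation mark.

    Assumes `alignment` looks like {"transcript": str, "words": [ {...}, ... ]}
    with character offsets into the transcript on every word (an assumption
    about the JSON shape, not a documented contract).
    """
    transcript = alignment["transcript"]
    rows = []
    prev_end_time = 0.0  # fallback if punctuation precedes the first timed word
    cursor = 0           # transcript position consumed so far

    def emit_punctuation(text):
        # Each mark becomes a zero-length token at the previous word's end time.
        for ch in text:
            if ch in string.punctuation:
                rows.append((ch, ch, prev_end_time, prev_end_time))

    for w in alignment["words"]:
        # Text between the previous word and this one may hold punctuation.
        emit_punctuation(transcript[cursor:w["startOffset"]])
        start, end = w.get("start"), w.get("end")  # may be missing if unaligned
        rows.append((w["word"], w.get("alignedWord", ""), start, end))
        if end is not None:
            prev_end_time = end
        cursor = w["endOffset"]

    # Anything after the last word, e.g. a closing question mark.
    emit_punctuation(transcript[cursor:])
    return rows


def to_csv(rows):
    """Serialize rows with the standard csv module (quotes only where needed)."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()
```

With the default csv settings the comma token is written quoted (","), while "?" is left unquoted because it does not clash with the delimiter. If every punctuation token should be quoted as in the request above, the quoting would need to be handled explicitly.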
