persephone-tools / persephone

A tool for automatic phoneme transcription
Apache License 2.0

adding word boundaries to the acoustic model for Na #210

Open alexis-michaud opened 6 years ago

alexis-michaud commented 6 years ago

Since 2018, the model for Na includes tone-group boundaries. But up till now (Oct. 2018), the model for Na still disregards word boundaries. A look at story-fold cross-validation materials suggests that longer words have somewhat different acoustic properties. So there could be value for phoneme & tone recognition in adding word boundaries to the training.

A first step (suggested by @oadams ) could be to produce separate error rates for short words versus longer words by using the word segmentation in the reference transcription as a guide.
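A rough sketch of what that first step could look like, in plain Python with no dependency on Persephone itself. Everything here is an assumption made for illustration: the function names, the choice to measure word length in tokens, the `long_threshold` cut-off, and the convention of charging an inserted token to the preceding reference word. The reference is given as a list of words (each a list of phoneme/tone tokens) and the hypothesis as a flat token list, since the model output has no word boundaries.

```python
from collections import defaultdict


def align(ref, hyp):
    """Levenshtein alignment of two token lists.

    Returns a list of (op, ref_index) pairs, with op in
    {"ok", "sub", "del", "ins"}. Insertions are attributed to the
    preceding reference token (illustrative convention only).
    """
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match / substitution
                             dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1)         # insertion
    # Trace back from the bottom-right corner to recover the operations.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            ops.append(("ok" if ref[i - 1] == hyp[j - 1] else "sub", i - 1))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("del", i - 1))
            i -= 1
        else:
            ops.append(("ins", max(i - 1, 0)))
            j -= 1
    ops.reverse()
    return ops


def error_rate_by_word_length(ref_words, hyp_tokens, long_threshold=2):
    """Token error rate for 'short' vs 'long' reference words.

    A word counts as 'long' if it has at least `long_threshold` tokens
    (hypothetical cut-off; syllable counts may be a better measure).
    """
    def bucket(word):
        return "long" if len(word) >= long_threshold else "short"

    flat = [tok for word in ref_words for tok in word]
    # word_of[k] = index of the reference word that flat[k] belongs to
    word_of = [w for w, word in enumerate(ref_words) for _ in word]
    if not flat:
        return {}

    totals = defaultdict(int)
    errors = defaultdict(int)
    for word in ref_words:
        totals[bucket(word)] += len(word)
    for op, ref_idx in align(flat, hyp_tokens):
        if op != "ok":
            errors[bucket(ref_words[word_of[ref_idx]])] += 1
    return {b: errors[b] / totals[b] for b in totals}


rates = error_rate_by_word_length([["a"], ["b", "c", "d"]], ["a", "b", "x", "d"])
```

Summing the per-bucket error and token counts over all utterances in a cross-validation fold, rather than averaging per-utterance rates, would probably give a more stable comparison.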

(Suggested label for this Issue: Yongning Na)

alexis-michaud commented 5 years ago

This relates to #214, in that the word boundary in the training corpus is a space.

"it's important that if users want to explicitly predict spaces (in character prediction), then that is accounted for. Probably best with a flag to segment_into_chars() or something similar, which would generate special tokens that represent spaces, such as underscores, for training and decoding. These then would get removed as a postprocessing step."
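To make the idea in the quote concrete, here is a minimal standalone sketch. It is not Persephone's actual code: `segment_chars`, its `keep_spaces` flag, `restore_spaces`, and `SPACE_TOKEN` are hypothetical names, and the sketch assumes the underscore never occurs in the transcriptions themselves.

```python
SPACE_TOKEN = "_"  # assumed not to occur in the transcriptions themselves


def segment_chars(text, keep_spaces=False):
    """Split a transcription into character tokens.

    With keep_spaces=True, each word boundary (run of whitespace)
    becomes one SPACE_TOKEN so the model can learn to predict it.
    """
    tokens = []
    for word in text.split():
        if tokens and keep_spaces:
            tokens.append(SPACE_TOKEN)
        tokens.extend(word)
    return tokens


def restore_spaces(tokens):
    """Postprocessing: turn predicted SPACE_TOKENs back into spaces."""
    return "".join(" " if tok == SPACE_TOKEN else tok for tok in tokens)


tokens = segment_chars("ab c", keep_spaces=True)
text = restore_spaces(tokens)
```

With the flag off, the output is the same boundary-free token sequence the Na model is trained on today, so existing setups would be unaffected.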