ottokart / punctuator2

A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text
http://bark.phon.ioc.ee/punctuator
MIT License

Relations between data in first and second stages #52

Open anavc94 opened 4 years ago

anavc94 commented 4 years ago

Hello,

I have a question about how to train the first and second stages. I have a lot of plain text to train the first stage, but only a small part of it is annotated with word pause durations.

Which approach would be best?

  1. Train the first stage on all plain texts, and then train the second stage on the subset that has pause annotations.

  2. Train the first stage on all plain texts except the pause-annotated ones, and then use those only in the second stage.
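
To make the two options concrete, here is a minimal sketch of the data splits, assuming each text can be identified by an ID. The function name, its arguments, and the example IDs are hypothetical and not part of punctuator2; the sketch only shows which texts go to which stage, not how to run the training.

```python
def split_for_stages(all_ids, pause_ids, reuse_in_stage1):
    """Return (stage1_ids, stage2_ids) for the two options above.

    Hypothetical helper for illustration only; not part of punctuator2.
    reuse_in_stage1=True  -> option 1 (pause-annotated texts in both stages)
    reuse_in_stage1=False -> option 2 (pause-annotated texts in stage 2 only)
    """
    all_ids = set(all_ids)
    # Only texts that are actually in the corpus count as pause-annotated.
    pause_ids = set(pause_ids) & all_ids
    stage1 = all_ids if reuse_in_stage1 else all_ids - pause_ids
    return sorted(stage1), sorted(pause_ids)


# Made-up example: five texts, two of which have pause annotations.
texts = ["t1", "t2", "t3", "t4", "t5"]
with_pauses = ["t4", "t5"]

option1 = split_for_stages(texts, with_pauses, reuse_in_stage1=True)
option2 = split_for_stages(texts, with_pauses, reuse_in_stage1=False)

print("Option 1:", option1)
print("Option 2:", option2)
```

In both options the second stage sees the same texts; the only difference is whether the first stage also sees them.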

Thank you so much!