Rudrabha / Lip2Wav

This is the repository containing the code for our CVPR 2020 paper, "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis".

Teacher forcing on TIMIT and GRID dataset #29

Open · hjzzju opened this issue 3 years ago

hjzzju commented 3 years ago

Hi, I want to know how to set teacher forcing for the GRID and TCDTIMIT datasets. Should it be the same as for the Lip2Wav dataset, with the teacher forcing decay starting from 29,000 steps?

Rudrabha commented 3 years ago

You can decay it earlier. Start from 1,000 steps or something similar. You may decay over 10,000 steps and then let it train without teacher forcing for some time.
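
For illustration, a schedule along these lines would be expressed through the teacher forcing hyperparameters in synthesizer/hparams.py. The field names below are assumptions based on the upstream Tacotron-2 conventions this repo follows; verify them against the actual hparams file:

```python
# Assumed Tacotron-2 style teacher-forcing hparams (check names in synthesizer/hparams.py):
tacotron_teacher_forcing_mode = "scheduled"    # decay the ratio instead of holding it constant
tacotron_teacher_forcing_init_ratio = 1.0      # begin fully teacher-forced
tacotron_teacher_forcing_final_ratio = 0.0     # end fully free-running
tacotron_teacher_forcing_start_decay = 1000    # start decaying around step 1,000, per the advice above
tacotron_teacher_forcing_decay_steps = 10000   # complete the decay within 10,000 steps
```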

Domhnall-Liopa commented 2 years ago

Hi,

With tacotron_teacher_forcing_mode="constant" during training, the teacher forcing ratio is never decayed and always stays at 1. Then, in synthesizer/models/helpers.py, the following code selects between the ground truth and the output of the previous time-step:

```python
next_inputs = tf.cond(
    # Draw u ~ Uniform(0, 1); teacher-force whenever u < self._ratio.
    tf.less(tf.random_uniform([], minval=0, maxval=1, dtype=tf.float32), self._ratio),
    lambda: self._targets[:, time, :],         # ground-truth frame (teacher forcing)
    lambda: outputs[:, -self._output_dim:])    # model's previous prediction (free running)
```

Since the ratio is always 1 and never decayed, the decoder is fed the ground truth of the previous time-step for the entire training run. Is this expected? Should there be a switch at some point so that the model's own outputs from the previous time-step are fed in during training?
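
For context, if the mode were switched to "scheduled", the helper's self._ratio would decay over training, so the tf.cond above would increasingly pick the model's own outputs. A minimal sketch of such a schedule, using a linear decay with hypothetical parameter names (the repo's actual scheduled mode may use a different curve, e.g. cosine decay):

```python
def scheduled_ratio(step, start_decay=1000, decay_steps=10000,
                    init_ratio=1.0, final_ratio=0.0):
    """Hypothetical teacher-forcing schedule: hold init_ratio until
    start_decay, decay linearly over decay_steps, then hold final_ratio."""
    if step < start_decay:
        return init_ratio
    progress = min((step - start_decay) / decay_steps, 1.0)
    return init_ratio + progress * (final_ratio - init_ratio)

# Example: ratio is 1.0 at step 500, 0.5 midway through the decay,
# and 0.0 once the decay window has passed.
for step in (500, 6000, 12000):
    print(step, scheduled_ratio(step))
```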