homink closed this issue 6 years ago
I guess you are hitting an out-of-range index in the speaker embedding table. Can you make sure you have 119 speakers in the dataset? The following command should print 118, but I'm guessing you will get a larger value than 118.
cat data/vctk/train.txt | cut -d "|" -f 5 | uniq | awk '{if(m<$1) m=$1} END{print m}'
The next command should print 119:
cat data/vctk/train.txt | cut -d "|" -f 5 | sort -u | wc -l
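The two shell checks above can also be combined into one test. This is a minimal sketch, assuming train.txt is pipe-delimited with an integer speaker ID in the fifth field (as the `cut` commands imply); the function names `speaker_ids` and `ids_are_contiguous` are hypothetical and not part of the repository.

```python
def speaker_ids(lines, field=4, sep="|"):
    """Collect the distinct integer speaker IDs found in `lines`.

    `field` is zero-based, so 4 corresponds to `cut -f 5`.
    """
    return {int(line.split(sep)[field]) for line in lines if line.strip()}


def ids_are_contiguous(lines, field=4, sep="|"):
    """Return True when the IDs are exactly 0, 1, ..., n_speakers - 1.

    An embedding table with n_speakers rows only accepts these indices,
    so any larger ID would index past the end of the table.
    """
    ids = speaker_ids(lines, field, sep)
    return ids == set(range(len(ids)))


if __name__ == "__main__":
    # Path taken from the commands above; adjust it for your dataset.
    with open("data/vctk/train.txt", encoding="utf-8") as f:
        rows = f.readlines()
    ids = speaker_ids(rows)
    print("max id:", max(ids), "unique speakers:", len(ids))
    print("contiguous from 0:", ids_are_contiguous(rows))
```

If the last line prints False, the maximum ID is larger than the number of speakers minus one, which matches the out-of-range guess.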
Your guess was right. The speaker indices in train.txt must be consecutive integers starting from 0 for training to work. Thanks!
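For anyone hitting the same problem, here is a rough sketch of one way to renumber the speaker column so that it runs from 0 upward. It assumes the same pipe-delimited layout with the speaker ID in the fifth field; `remap_speaker_ids` is a hypothetical helper, not something the repository provides.

```python
def remap_speaker_ids(lines, field=4, sep="|"):
    """Rewrite the speaker column as consecutive integers starting at 0.

    IDs are assigned in order of first appearance. Returns the rewritten
    lines (without trailing newlines) and the old-to-new mapping.
    """
    mapping = {}
    out = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        cols = line.rstrip("\n").split(sep)
        old = cols[field].strip()
        # A new speaker gets the next free index; a known one keeps its index.
        new = mapping.setdefault(old, len(mapping))
        cols[field] = str(new)
        out.append(sep.join(cols))
    return out, mapping


if __name__ == "__main__":
    # Paths are assumptions; the output goes to a new file to keep the original.
    with open("data/vctk/train.txt", encoding="utf-8") as f:
        new_lines, mapping = remap_speaker_ids(f.readlines())
    with open("data/vctk/train_remapped.txt", "w", encoding="utf-8") as f:
        f.write("\n".join(new_lines) + "\n")
    print(len(mapping), "speakers remapped")
```

The returned mapping is worth saving, since anything else keyed by the old speaker IDs would need the same renumbering.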
Hi again,
I trained a single Korean speaker successfully and am now moving to multiple Korean speakers. Again, I encountered an assertion error, as shown below. I tracked it down, and it looks like self.encoder in the AttentionSeq2Seq class raised the error. Could you let me know where the following self.encoder function is defined so that I can look into it further? Adjusting max_position doesn't help this time.
encoder_outputs = self.encoder(text_sequences, lengths=input_lengths, speaker_embed=speaker_embed)
Thanks in advance,