I am training a new Chinese model for Piper because the existing Piper Chinese model pronounces tones incorrectly. I have prepared a larger dataset. What puzzles me is that Piper's training documentation says the dataset's CSV file has only one text column, whereas my dataset contains the text corresponding to each audio clip as well as the corresponding pinyin and tones. How should I handle this CSV file to meet Piper's dataset requirements?
I hope someone can give me a hint. Thanks.
The training guide says the CSV format should be:
id|text
My dataset's CSV file has this format:
ID|text|prosody
If I drop the 'prosody' column, will that lead to non-standard pronunciation?
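
To make the question concrete, here is a minimal sketch of the conversion I have in mind. It assumes that simply dropping the third column is acceptable, which is exactly the part I am unsure about. The function name `to_piper_rows` and the sample row are made up for illustration and are not part of Piper.

```python
def to_piper_rows(lines):
    """Turn 'ID|text|prosody' lines into 'id|text' lines.

    Assumption: the prosody column can simply be discarded.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("|")
        if len(parts) < 2:
            raise ValueError(f"expected at least 'id|text', got: {line!r}")
        utt_id, text = parts[0].strip(), parts[1].strip()
        yield f"{utt_id}|{text}"


# Hypothetical sample row in my current ID|text|prosody layout.
sample = ["0001|你好|ni3 hao3\n"]
for row in to_piper_rows(sample):
    print(row)
```

This only reshapes the file. It does not tell me whether the model can learn correct tones from the text column alone.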