Given the lack of comments in run.sh and the absence of a README in the fine-tuning folder, it's unclear how to set up the Arctic data (or other voice data) for fine-tuning. The overall order seems clear: extract the embedding, train using that embedding, and then run inference on unseen source-voice recordings. Would it be possible to add a README for fine-tuning that describes the data setup and process in more detail?
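For reference, this is the workflow I'm assuming, written as a minimal shell sketch. All script names, paths, and flags below (extract_embedding.py, finetune.py, inference.py, etc.) are hypothetical placeholders, not the repo's actual interface; please correct me if the real steps in run.sh differ.

```bash
# Hypothetical sketch of the assumed fine-tuning workflow.
# Script names, paths, and flags are placeholders, not the repo's actual API.

# 1. Extract the speaker embedding from the target speaker's Arctic recordings.
python extract_embedding.py \
    --wav_dir data/arctic/target_speaker/wav \
    --out embeddings/target_speaker.npy

# 2. Fine-tune the pretrained model conditioned on that embedding.
python finetune.py \
    --embedding embeddings/target_speaker.npy \
    --data_dir data/arctic/target_speaker \
    --checkpoint checkpoints/pretrained.pt \
    --out_dir checkpoints/finetuned

# 3. Run inference on unseen source-voice recordings.
python inference.py \
    --checkpoint checkpoints/finetuned/latest.pt \
    --embedding embeddings/target_speaker.npy \
    --source_wavs data/unseen_source \
    --out_dir outputs
```

A README confirming (or correcting) a flow like this, along with the expected directory layout for the Arctic data, would make the fine-tuning setup much easier to follow.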