chore(readme): add datasets detail in readme

alfonks commented 1 year ago

For now, the scale isn't specified. I'm using Audacity to export the audio file; by default the scale it uses is Mel, and it turns out the tensor padding won't work. When I check the tensor shape, it looks like this:

Meanwhile, the shape of the data provided in the repository looks like this:

In this case, I used Audacity to export the .wav file. After turning on the multi-view setting for the audio track and changing its spectrogram scale setting to Linear, I was able to get a shape similar to that of the audio file provided in the repository.

SayaSS commented 1 year ago

Thank you for your help! But the scale of spectrogram views in Audacity is just a different observation scale and does not change the file itself. You just need to convert the audio to mono.
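
For anyone who would rather do the mono conversion in a script than in Audacity, here is a minimal sketch using only the Python standard library. It is not part of this repository: the function name `to_mono` is hypothetical, and it assumes the input is an uncompressed 16-bit PCM WAV file whose channels can simply be averaged.

```python
import array
import sys
import wave


def to_mono(src_path, dst_path):
    """Downmix a 16-bit PCM WAV file to mono by averaging its channels.

    Returns the number of channels found in the source file.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Assumption: 16-bit samples only; other widths need different handling.
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved (L, R, L, R, ...); average each frame.
    mono = array.array(
        "h",
        (sum(samples[i:i + n_channels]) // n_channels
         for i in range(0, len(samples), n_channels)),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
    return n_channels
```

Calling `to_mono("input.wav", "input_mono.wav")` (both file names are placeholders) writes a one-channel copy with the same sample rate. The sample rate is left untouched here, so any resampling the training data needs would be a separate step.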

alfonks commented 1 year ago

OK, noted. I guess this isn't needed then. Thank you!

SayaSS / vits-finetuning

chore(readme): add datasets detail in readme #26