uberduck-ai / uberduck-ml-dev

ML models for Uberduck
Apache License 2.0

Tacotron multispeaker training orders speakers unreliably #81

Closed Sobsz closed 2 years ago

Sobsz commented 2 years ago

No idea what could be causing it, but it seems to rely on speaker count: an 8-speaker model of mine had the inference IDs match the training ones, whereas a 20-speaker one is jumbled up. dhama the llama on Discord has been able to (eventually) find all the training voices in their 61-speaker model, so it's likely that the IDs are only being shuffled and not discarded.

johnpaulbin commented 2 years ago

bump

sjkoelle commented 2 years ago

Will take a look.

johnpaulbin commented 2 years ago

New information suggesting that speaker IDs are being treated as strings inside the trainer:

Here is a list LeifEricson put together on Discord:

image

This is obviously incorrect sorting. The order is also consistent with what you get when sorting the IDs as strings:

image

e.g. "10" is where "2" should be, etc.
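
A minimal sketch of the suspected behavior, in plain Python. This is not the trainer's actual code: `speaker_ids` is a made-up list, assuming the IDs are read in as text and then sorted.

```python
# Hypothetical speaker IDs, assumed to arrive as strings (e.g. parsed from a text filelist).
speaker_ids = [str(i) for i in range(12)]

# Sorting strings compares character by character, so "10" and "11" land before "2".
as_strings = sorted(speaker_ids)

# Sorting with key=int compares the numeric values instead.
as_numbers = sorted(speaker_ids, key=int)

print(as_strings)
print(as_numbers)
```

With only single-digit IDs (0 through 9) both orders are identical, which would be consistent with the 8-speaker model lining up while the 20-speaker one did not.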

sjkoelle commented 2 years ago

good catch. just merged a fix that i think should work.