Is it intentional that the utterances from the file_list are shuffled before loading? I'm talking about these lines. The shuffling makes it difficult to select a specific audio file for style transfer, so I was wondering if there is a reason for it.
OK, that was actually a stupid question: the same loader is also used for training, where shuffling makes sense. My bad. But maybe the shuffling could be made optional with a flag, so that it can be turned off during style transfer?
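To make the suggestion concrete, here is a rough sketch of what I have in mind. This is not the repo's actual loader: `load_file_list`, the `shuffle` and `seed` parameters, and the `path|text` entry format are all hypothetical names I made up for illustration.

```python
import random


def load_file_list(lines, shuffle=True, seed=1234):
    """Hypothetical loader helper: return the file_list entries, optionally shuffled."""
    # Drop blank lines and surrounding whitespace, keeping the original order.
    entries = [line.strip() for line in lines if line.strip()]
    if shuffle:
        # Seeded RNG so the training order stays reproducible.
        rng = random.Random(seed)
        rng.shuffle(entries)
    return entries


# Assumed example entries; the real file_list format may differ.
file_list = ["a.wav|hello", "b.wav|world", "c.wav|foo"]

# Training would keep the default (shuffled); style transfer would disable it,
# so index i always refers to line i of the file_list.
ordered = load_file_list(file_list, shuffle=False)
shuffled = load_file_list(file_list, shuffle=True)
```

With `shuffle=False`, picking a reference utterance by its position in the file_list becomes predictable, while training keeps the current behavior by default.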