uberduck-ai / uberduck-ml-dev

ML models for Uberduck
Apache License 2.0
377 stars, 61 forks

Speaker encoder and other cleanups #121

Closed sjkoelle closed 1 year ago

sjkoelle commented 1 year ago

This PR enables use of the SpeechBrain VoxCeleb-based speaker encoder as an initializer for multispeaker models, which will hopefully enable training on larger datasets. The changes leave some ambiguity between "audio_encoder" and "speaker_encoder" (i.e. mean audio encoder or randomly initialized speaker encoder) that can be resolved in the future. The PR also removes a fair amount of unused model and trainer code and adds exception catching for the speaker_id.
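
For readers unfamiliar with the idea, here is a minimal sketch of how a mean audio embedding could seed a speaker embedding table, with a guarded speaker_id parse. It is not the code in this PR: `mean_speaker_embeddings`, `init_speaker_table`, `parse_speaker_id`, and the `encode` callable are hypothetical names, and `encode` stands in for whatever pretrained encoder (e.g. the SpeechBrain model) produces a fixed-size vector per utterance. Falling back to a default id on a parse failure is also an assumption about what "exception catching" might look like.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Vector = List[float]


def mean_speaker_embeddings(
    utterances: Iterable[Tuple[int, Sequence[float]]],
    encode: Callable[[Sequence[float]], Vector],
) -> Dict[int, Vector]:
    """Average the per-utterance embeddings of each speaker.

    `encode` is a hypothetical stand-in for a pretrained speaker encoder.
    """
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = defaultdict(int)
    for speaker_id, audio in utterances:
        emb = encode(audio)
        if speaker_id not in sums:
            sums[speaker_id] = list(emb)
        else:
            sums[speaker_id] = [s + e for s, e in zip(sums[speaker_id], emb)]
        counts[speaker_id] += 1
    return {sid: [s / counts[sid] for s in total] for sid, total in sums.items()}


def init_speaker_table(
    n_speakers: int, dim: int, means: Dict[int, Vector], seed: int = 0
) -> List[Vector]:
    """Start from random rows, then overwrite rows that have a mean embedding."""
    rng = random.Random(seed)
    table = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_speakers)]
    for sid, vec in means.items():
        table[sid] = list(vec)
    return table


def parse_speaker_id(raw: object, default: int = 0) -> int:
    """Return an integer speaker id, or `default` if `raw` cannot be parsed (assumed policy)."""
    try:
        return int(raw)  # type: ignore[arg-type]
    except (TypeError, ValueError):
        return default
```

Speakers without any encoded utterances keep their random row, which mirrors the "mean audio encoder or randomly initialized speaker encoder" distinction mentioned above.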