nikvaessen / w2v2-speaker-few-samples

Research code for the paper "Training speaker recognition systems with limited data" at https://arxiv.org/abs/2203.14688
MIT License

How do I pass a new .wav sample for prediction? #3

Open kprateek-iitbh opened 1 week ago

kprateek-iitbh commented 1 week ago

From the code, it seems the datasets have to be in .tar.gz format for train/validation/test to work, and the data has to be passed as a datamodule, as in main.py line 378: "result = trainer.test(network, datamodule=dm)". Is it possible to just load the .ckpt model from memory and run a direct prediction on a new .wav file? Or is the only way to test to record the files and prepare a .tar.gz evaluation shard?

nikvaessen commented 4 days ago

It should be possible, but you would need to write a script that manually imports the network class and then loads the network from the checkpoint.
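
A minimal sketch of such a script might look like the following. It rests on several assumptions that are not confirmed in this thread: that the network is a PyTorch Lightning module (suggested by the `trainer.test(network, datamodule=dm)` call), that its forward pass accepts a float tensor of shape `[1, num_samples]`, and that it expects 16 kHz mono audio. `NetworkClass`, `some_module`, and the file paths are placeholders, not names from the repository.

```python
import struct
import wave


def read_wav_mono(path):
    """Read a 16-bit PCM .wav file.

    Returns (samples, sample_rate), where samples is a list of floats
    in [-1, 1) taken from the first channel only.
    """
    with wave.open(path, "rb") as f:
        if f.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        n_channels = f.getnchannels()
        sample_rate = f.getframerate()
        raw = f.readframes(f.getnframes())
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Frames are interleaved per channel, so stepping by n_channels
    # keeps only the first channel.
    samples = [v / 32768.0 for v in ints[::n_channels]]
    return samples, sample_rate


def embed_wav(network, path, expected_rate=16000):
    """Run `network` on a single .wav file and return its raw output.

    Assumption: `network` is a torch.nn.Module whose forward pass takes a
    float tensor of shape [1, num_samples]. Check the actual network class
    for the real input shape and for what it returns.
    """
    import torch  # imported here so the .wav helper works without torch

    samples, rate = read_wav_mono(path)
    if rate != expected_rate:
        raise ValueError(f"expected {expected_rate} Hz audio, got {rate} Hz")
    wav = torch.tensor(samples, dtype=torch.float32).unsqueeze(0)
    network.eval()  # disable dropout and similar training-only behaviour
    with torch.no_grad():
        return network(wav)


# Example usage (placeholder names, adapt to the repository):
#
#   from some_module import NetworkClass  # the LightningModule used in main.py
#   network = NetworkClass.load_from_checkpoint("path/to/model.ckpt")
#   output = embed_wav(network, "new_sample.wav")
```

`load_from_checkpoint` is the standard PyTorch Lightning classmethod for restoring a module from a .ckpt file. It may need extra constructor arguments if the hyperparameters were not saved in the checkpoint.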