Hi,
Can you please let me know the train and test speakers?
Thank you for your interest in our work.
AMUSE is trained on specific takes from a designated set of actors. The actor IDs are listed in the data module:
["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "12", "13", "16", "18", "21", "26", "27", "30"]
Please check the code for the specific takes used for each actor during training. The total duration of the training data is ~5 hours, as indicated in the shared data MDB file.
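For reference, here is a minimal sketch of how such a split could be expressed in Python. The names (`TRAIN_ACTORS`, `is_train_sample`, `train_takes`) are illustrative placeholders, not the identifiers used in the repository's data module:

```python
# Illustrative sketch only -- the names below are hypothetical, not AMUSE's actual data module.
TRAIN_ACTORS = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
                "12", "13", "16", "18", "21", "26", "27", "30"]

def is_train_sample(actor_id, take_id, train_takes):
    """Return True if this (actor, take) pair belongs to the training split.

    `train_takes` maps actor IDs to the set of take IDs used for training;
    the actual per-actor take lists live in the repository's code.
    """
    return actor_id in TRAIN_ACTORS and take_id in train_takes.get(actor_id, set())
```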
Regarding test time: the perceptual study was conducted specifically on takes from the test actors. The editing application uses takes that may come from the test actors and potentially from other sources, including the held-out takes of the train actors.
Thank you.
Is it possible to run with multiple GPUs?
Yes, multi-GPU support is implemented for the audio model; however, it was not used in the final version. For more details, please refer to the updated README. A generic sketch of a multi-GPU setup is shown below.
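The sketch below is a generic PyTorch `DistributedDataParallel` setup, not the repository's actual multi-GPU code path; the script name and the placeholder model are assumptions for illustration only:

```python
# Generic DDP sketch -- assumes launch via: torchrun --nproc_per_node=<N> train_ddp_sketch.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets LOCAL_RANK and the rendezvous environment variables.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model standing in for the audio model.
    model = torch.nn.Linear(128, 128).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # ... training loop with a DistributedSampler-backed DataLoader ...

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```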
On which speakers was training done? What are the test speakers? Are the 22 speakers from the BEAT dataset used for both training and testing?