Could you explain how --duration_mult should be used for audio-visual source separation? --duration_mult 4 works well, but --duration_mult 10 seems to give a worse result.
The program reports an error if I use --duration_mult 12.
If I only use --duration 20, the separated audio is almost the same as the source.
My goal is to run audio-visual separation on a 28-second video.
Thanks!