Thanks for the great paper. I am trying to use the pre-trained model, but my results are not great. Could you please suggest the prerequisites for the input (e.g., video quality, audio quality, sampling rate)? I am working with recorded videos that contain only two speakers.