Closed: 83344rushikesh closed this issue 4 years ago
The current model is frame-based (it uses no motion information), so it cannot handle instance-level separation such as speech separation. Incorporating motion to separate the speech of multiple speakers would be interesting future work.
@rhgao Will it work on videos that contain multiple speakers, isolating each speaker into a separate file so that the number of output files equals the number of speakers in the video?