About duet and mixture videos

I evaluated the trained model using the weights you provided. I found that it uses the Mix-and-Separate process: it takes two solo videos as input, mixes their audio, and then reconstructs the two audio tracks. This is the validation part. What about the test part on duet videos?
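To make clear what I mean by Mix-and-Separate, here is a minimal NumPy sketch of my understanding. It is not the repository's code: the function names (`frame_spectrum`, `mix_and_separate`) are hypothetical, the spectrum uses plain non-overlapping frames instead of a real STFT, and ground-truth binary masks stand in for the masks the network would predict from the video frames.

```python
import numpy as np


def frame_spectrum(signal, frame_len=256):
    """Split a 1-D signal into non-overlapping frames and take the FFT of each."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.fft.rfft(frames, axis=1)


def mix_and_separate(solo_a, solo_b, frame_len=256):
    """Mix two solo tracks, then recover each one by masking the mixture spectrum.

    Assumption: ground-truth binary masks replace the masks a trained
    model would predict from the mixture and the video.
    """
    mixture = solo_a + solo_b  # synthetic mixture of the two solos

    spec_a = frame_spectrum(solo_a, frame_len)
    spec_b = frame_spectrum(solo_b, frame_len)
    spec_mix = frame_spectrum(mixture, frame_len)

    # Binary mask: each time-frequency bin is assigned to the louder source.
    mask_a = (np.abs(spec_a) >= np.abs(spec_b)).astype(float)
    mask_b = 1.0 - mask_a

    # Apply each mask to the mixture spectrum and go back to the time domain.
    est_a = np.fft.irfft(mask_a * spec_mix, n=frame_len, axis=1).ravel()
    est_b = np.fft.irfft(mask_b * spec_mix, n=frame_len, axis=1).ravel()
    return mixture, est_a, est_b


# Two toy "solo" tracks: pure tones at different frequencies.
sr = 8000
t = np.arange(sr) / sr
solo_a = np.sin(2 * np.pi * 250 * t)
solo_b = 0.5 * np.sin(2 * np.pi * 1000 * t)

mixture, est_a, est_b = mix_and_separate(solo_a, solo_b)
print("max reconstruction error for source A:",
      np.max(np.abs(est_a - solo_a[: len(est_a)])))
```

With a real duet video there are no separate solo tracks, so neither the ground-truth masks nor this kind of reconstruction error can be computed, which is why I am asking about the test part.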
I am interested in research on sound source localization and separation in natural duet videos. Should I train the model from scratch, or can I still use the trained model you provided? Could you give me some suggestions, please? Thank you. I look forward to your reply.

hangzhaomit / Sound-of-Pixels