Open crankyz opened 5 years ago
I tried the audio-only model first and the audio-visual model later, so the functions were not written together. I think combining them is feasible: crop the video segment first, then use separate functions to handle the audio and image parts.
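For what it's worth, here is a minimal sketch of that "crop first, then split" idea. It is not the code from this repo. It assumes the `ffmpeg` CLI is installed and on `PATH`. The function names (`crop_cmd`, `audio_cmd`, `frames_cmd`, `plan_segment`, `run_all`), the output file names, and the 16 kHz mono / 25 fps settings are all hypothetical choices for illustration, so adjust them to whatever the model actually expects.

```python
import subprocess
from pathlib import Path
from typing import List


def crop_cmd(src: str, dst: str, start: float, duration: float) -> List[str]:
    """Build an ffmpeg command that cuts `duration` seconds from `src`, starting at `start`."""
    # -ss/-t placed before -i act as input options: seek, then limit how much is read.
    return ["ffmpeg", "-y", "-ss", str(start), "-t", str(duration), "-i", src, dst]


def audio_cmd(clip: str, wav: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that extracts a mono WAV track from the cropped clip."""
    # -vn drops the video stream; 16 kHz mono is an assumed target format.
    return ["ffmpeg", "-y", "-i", clip, "-vn", "-ac", "1", "-ar", str(sample_rate), wav]


def frames_cmd(clip: str, pattern: str, fps: int = 25) -> List[str]:
    """Build an ffmpeg command that dumps frames from the cropped clip as numbered images."""
    # 25 fps is an assumed frame rate, not a value taken from the repo.
    return ["ffmpeg", "-y", "-i", clip, "-vf", f"fps={fps}", pattern]


def plan_segment(src: str, out_dir: str, start: float, duration: float) -> List[List[str]]:
    """Return the three commands in order: crop, then audio, then frames."""
    out = Path(out_dir)
    clip = (out / "clip.mp4").as_posix()
    wav = (out / "audio.wav").as_posix()
    pattern = (out / "frames" / "%04d.jpg").as_posix()
    return [
        crop_cmd(src, clip, start, duration),
        audio_cmd(clip, wav),
        frames_cmd(clip, pattern),
    ]


def run_all(src: str, out_dir: str, start: float, duration: float) -> None:
    """Create the output folders and execute the planned commands (requires ffmpeg)."""
    (Path(out_dir) / "frames").mkdir(parents=True, exist_ok=True)
    for cmd in plan_segment(src, out_dir, start, duration):
        subprocess.run(cmd, check=True)
```

Calling `run_all("video.mp4", "out", 3.0, 2.5)` would crop 2.5 seconds starting at 3.0 s, then write the audio and the frames from that same clip, so both parts cover exactly the same time span.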
subj