Hi guys!
I'm working on a music inpainting problem: my input is a vocal track, and my task is to generate the instrumental segment that matches that vocal. So my dataset has two parts: (1) vocals and (2) instrumentals. What should the dataset for each of the Semantic, Coarse, and Fine phases look like, given these two parts (vocal and instrumental)?