I am trying to train a vocal separation model using this dataset, but when I output the mixture and vocal waveforms, I found that for some songs the two arrays have different shapes.
I use the following code to generate the wave pairs:
for track in db:
    if 'vocals' in track.stems:
        output_stems = {
            "mixture": all_stems,
            "vocals": ["vocals"],
        }
        waves = track.mix_stems(output_stems)
        if waves['mixture'].shape != waves['vocals'].shape:
            print(f"{track.artist} - {track.name}")
            print(track.id)
            print(waves['mixture'].shape, waves['vocals'].shape)
And the output is as follows:
Holy Magick - Wake Up So
8ce11544-9a6f-4f1e-ac2f-fc10343f15c8
(2, 8202600) (2, 8226655)
Fake Eyes - I'm A Mess
bacbb01f-b877-4d62-8050-992f1d85543a
(2, 6594347) (2, 6619136)
Frank Sermon - 4 Sets
6cd44645-ed19-4ecc-a57c-58d400005b29
(2, 5644800) (2, 5701632)
Firefly - Golden Times
1fc37390-1769-452d-9bea-19025be4c467
(2, 9283050) (2, 9306112)
ProRata - Broken
35983cdd-3903-46cb-b184-f96274ced57b
(2, 9408000) (2, 9417800)
Firefly - Aarons House
174a115f-3688-45dc-8c39-9d05f21758e1
(2, 10584000) (2, 10616832)
Holy Magick - Lifeboat
6ceda40a-88bc-4e98-87c3-dd5c91725d41
(2, 12783487) (2, 12789000)
Battlestar - This Town
d8f0e410-5761-4d4a-9000-effe11089bbd
(2, 13047518) (2, 13141800)
Is this the intended outcome? Since trim_and_mix is called inside mix_stems, I expected the output pairs to have the same shape.
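For now I am considering a workaround along these lines. This is only a minimal sketch: it assumes each value returned by mix_stems is a NumPy array shaped (channels, samples), and that cropping the tail of the longer array keeps the two signals aligned, which I have not verified. The helper name trim_to_common_length is my own and is not part of the dataset library.

```python
import numpy as np


def trim_to_common_length(waves):
    """Crop every array in `waves` to the shortest length along the last axis.

    Hypothetical helper: assumes each value is shaped (channels, samples)
    and that any extra samples sit at the end of the longer arrays.
    """
    min_len = min(w.shape[-1] for w in waves.values())
    return {name: w[..., :min_len] for name, w in waves.items()}


# Small synthetic arrays standing in for the real mix_stems output.
example = {
    "mixture": np.zeros((2, 100), dtype=np.float32),
    "vocals": np.zeros((2, 124), dtype=np.float32),
}
trimmed = trim_to_common_length(example)
print({name: w.shape for name, w in trimmed.items()})
```

In the loop above this would be applied as waves = trim_to_common_length(waves) right after the mix_stems call, but it only hides the mismatch rather than explaining it.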