moises-ai / moises-db

Moises Source Separation Public Dataset

How to construct a 4-stem version #4

Open groadabike opened 12 months ago

groadabike commented 12 months ago

Hi,

I am trying to construct a 4-stem version of the dataset. How can I do that with the provided tools? The functions have no documentation, so it is hard to work out the correct usage.

Thank you

igorgad commented 11 months ago

Hey @groadabike, you can create a 4-stem version of the dataset with the following code. I just pushed a fix for the mix_stems method, so you will probably need to pull again.

from moisesdb.dataset import MoisesDB
from moisesdb.defaults import mix_4_stems

db = MoisesDB(
    data_path='./moisesdb',
    sample_rate=44100
)

for track in db:
    # mix_stems returns a dict mapping each stem name to its samples
    stems = track.mix_stems(mix_4_stems)
    for stem_name, samples in stems.items():
        print(stem_name, samples.shape)

Let me know if it works for you. Thanks

groadabike commented 11 months ago

Dear @igorgad

Thank you so much for your help. The fix you pushed solved the problem.

Now I am facing another issue. To build the reference mixture, I add the resulting stems together:

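import numpy as np
from clarity.utils.file_io import write_signal

# Start from zeros shaped like the first stem, then accumulate every stem.
mixtures = np.zeros(next(iter(stems.values())).shape)
for stem_name, samples in stems.items():
    write_signal(
        out_track_path / f"{stem_name}.wav",
        samples.T,
        44100,
        floating_point=False,
    )
    # Fails with a broadcast error when stems differ in length
    mixtures += samples
write_signal(
    out_track_path / "mixture.wav", mixtures.T, 44100, floating_point=False
)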

I was able to save several tracks this way, but some tracks raised errors. Below is a list of those tracks with their error messages; I hope it helps. By the way, I am saving the signals as 16-bit, so the warning messages mark tracks that were clipped before saving.

WARNING:root:Writing /media/gerardo/Extended_old/moisesdb_4_stems/rock/78ef22ce-472f-4f82-8656-16df73b9465f/mixture.wav. Signal out of range [-1.0, 1.0) - clipping.
Creating mixture 53808b95-cfe9-461d-a113-ffadf32817a1: operands could not be broadcast together with shapes (2,9084600) (2,9109504) (2,9084600) 
Creating mixture 6ceda40a-88bc-4e98-87c3-dd5c91725d41: operands could not be broadcast together with shapes (2,12789000) (2,12783487) (2,12789000) 
Creating mixture dfb0e076-cb6b-4dcc-9934-c60070ff04d7: operands could not be broadcast together with shapes (2,10220473) (2,10223677) (2,10220473) 
Creating mixture 8ba20549-c038-47c0-a808-e38741135911: operands could not be broadcast together with shapes (2,7497000) (2,7536640) (2,7497000) 
Creating mixture 35983cdd-3903-46cb-b184-f96274ced57b: operands could not be broadcast together with shapes (2,9417800) (2,9408000) (2,9417800) 
WARNING:root:Writing /media/gerardo/Extended_old/moisesdb_4_stems/singer_songwriter/5f04798d-c7be-4b8a-90bd-1fcd9946e875/vocals.wav. Signal out of range [-1.0, 1.0) - clipping.
WARNING:root:Writing /media/gerardo/Extended_old/moisesdb_4_stems/singer_songwriter/5f04798d-c7be-4b8a-90bd-1fcd9946e875/drums.wav. Signal out of range [-1.0, 1.0) - clipping.
WARNING:root:Writing /media/gerardo/Extended_old/moisesdb_4_stems/singer_songwriter/5f04798d-c7be-4b8a-90bd-1fcd9946e875/mixture.wav. Signal out of range [-1.0, 1.0) - clipping.
Creating mixture 88b545e5-4d06-4d55-a306-1bd3a2915ee5: operands could not be broadcast together with shapes (2,16140600) (2,16187392) (2,16140600) 
Creating mixture 8ce11544-9a6f-4f1e-ac2f-fc10343f15c8: operands could not be broadcast together with shapes (2,8226655) (2,8202600) (2,8226655) 
Running mix_stems ee082817-dbda-4fbf-b5aa-8dce2320ae35: min() arg is an empty sequence
Creating mixture 02ee37da-eea3-42b4-83bf-ab7f243afa13: operands could not be broadcast together with shapes (2,10213066) (2,10223616) (2,10213066) 
Creating mixture 3e41f238-7c48-4a42-ba70-5ee39824a844: operands could not be broadcast together with shapes (2,9172800) (2,9175040) (2,9172800) 
Creating mixture 89c515c9-5e93-4cb4-9806-20432d2d074d: operands could not be broadcast together with shapes (2,11466000) (2,11468800) (2,11466000) 
Creating mixture 174a115f-3688-45dc-8c39-9d05f21758e1: operands could not be broadcast together with shapes (2,10616832) (2,10584000) (2,10616832) 
Running mix_stems 46bc5393-7753-44ae-913b-bd5fa8f33e98: min() arg is an empty sequence
Creating mixture f4b735de-14b1-4091-a9ba-c8b30c0740a7: operands could not be broadcast together with shapes (2,12744900) (2,12779520) (2,12744900) 
Running mix_stems b92cb1ca-baa9-4c74-b6dc-36389671ed76: min() arg is an empty sequence
Creating mixture d8f0e410-5761-4d4a-9000-effe11089bbd: operands could not be broadcast together with shapes (2,13141800) (2,13047518) (2,13141800) 
Creating mixture 4857878a-e44b-4143-90e9-b65d0b704306: operands could not be broadcast together with shapes (2,12921300) (2,12976128) (2,12921300) 
Creating mixture f40ffd10-4e8b-41e6-bd8a-971929ca9138: operands could not be broadcast together with shapes (2,9525600) (2,9568256) (2,9525600) 
Creating mixture 6cd44645-ed19-4ecc-a57c-58d400005b29: operands could not be broadcast together with shapes (2,5701632) (2,5644800) (2,5701632) 
Creating mixture 1fc37390-1769-452d-9bea-19025be4c467: operands could not be broadcast together with shapes (2,9306112) (2,9283050) (2,9306112) 
Creating mixture 4b9f86f4-23e4-458b-839e-8a63b584bea3: operands could not be broadcast together with shapes (2,8731800) (2,8781824) (2,8731800) 
Creating mixture 3e7985e5-408f-4cf8-92b9-b9f62f738dd3: operands could not be broadcast together with shapes (2,10053005) (2,10092544) (2,10053005) 
Creating mixture bacbb01f-b877-4d62-8050-992f1d85543a: operands could not be broadcast together with shapes (2,6619136) (2,6594347) (2,6619136) 
WARNING:root:Writing /media/gerardo/Extended_old/moisesdb_4_stems/singer_songwriter/0f5fb60c-51d4-4618-871d-650c9e927b79/drums.wav. Signal out of range [-1.0, 1.0) - clipping.
WARNING:root:Writing /media/gerardo/Extended_old/moisesdb_4_stems/singer_songwriter/0f5fb60c-51d4-4618-871d-650c9e927b79/mixture.wav. Signal out of range [-1.0, 1.0) - clipping.
Creating mixture 491c1ff5-1e7b-4046-8029-a82d4a8aefb4: operands could not be broadcast together with shapes (2,4542300) (2,11201400) (2,4542300) 
Creating mixture 763641c7-488f-4959-a554-fdbce9582644: operands could not be broadcast together with shapes (2,7585200) (2,7602176) (2,7585200) 
Creating mixture 35a19148-49bf-451d-9a0e-5ab8e914c367: operands could not be broadcast together with shapes (2,8114400) (2,8126464) (2,8114400) 
Running mix_stems 0358fd1e-244a-4422-9a42-29b5d68f6e4b: min() arg is an empty sequence
Creating mixture dbb07bdc-2706-4e67-8b59-43f98cf1608a: operands could not be broadcast together with shapes (2,9084600) (2,9108347) (2,9084600) 
Creating mixture 11845abc-8ca3-4fb2-bd84-521aeeff56f4: operands could not be broadcast together with shapes (2,9449164) (2,10111795) (2,9449164) 
Running mix_stems 7dd515b0-e218-425d-b8bf-a75056237d6a: min() arg is an empty sequence

In some cases the stem lengths differ substantially, as in track 491c1ff5-1e7b-4046-8029-a82d4a8aefb4.
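
A quick way to see which stems disagree in length is to list the sample count per stem before summing. The sketch below is only illustrative: the helper name report_stem_lengths is hypothetical, and the synthetic arrays stand in for the dict returned by track.mix_stems, assumed to map stem names to arrays shaped (channels, samples).

```python
import numpy as np


def report_stem_lengths(stems):
    """Return (lengths, spread) for a dict of stem name -> (channels, samples) array.

    `spread` is the difference between the longest and shortest stem, in samples.
    """
    lengths = {name: samples.shape[-1] for name, samples in stems.items()}
    if not lengths:
        # Nothing to compare when the dict is empty
        return lengths, 0
    spread = max(lengths.values()) - min(lengths.values())
    return lengths, spread


# Synthetic stand-in for the output of track.mix_stems(...) (assumption)
stems = {
    "vocals": np.zeros((2, 1000)),
    "drums": np.zeros((2, 1024)),
    "bass": np.zeros((2, 1000)),
}

lengths, spread = report_stem_lengths(stems)
print(lengths, "max difference in samples:", spread)
```

A non-zero spread means a plain `+=` over the stems will fail to broadcast, as in the messages listed above.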

Just for completeness, the write_signal function I am using is: https://github.com/claritychallenge/clarity/blob/dd866c1e959b454802fc29314c1428a314187e20/clarity/utils/file_io.py#L48C1-L93C6

igorgad commented 11 months ago

Hey @groadabike, sorry for the late reply. Some differences in stem length are expected. We handle this by trimming the stems to a common length before summing, as in the following code snippet.

import numpy as np

def trim_and_mix(sources):
    # Trim every source to the shortest length along the last (time) axis, then sum
    min_len = min(s.shape[-1] for s in sources)
    return np.stack([s[..., :min_len] for s in sources]).sum(0)
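
Applied to a 4-stem dict like the one built earlier in the thread, this might look roughly like the sketch below. It is a minimal example under assumptions: the stems are synthetic arrays shaped (channels, samples), the explicit check for an empty input is an addition not in the snippet above, and each stem is trimmed to the same length so that the saved stems still sum to the saved mixture.

```python
import numpy as np


def trim_and_mix(sources):
    """Trim all sources to the shortest length on the last axis, then sum them."""
    sources = list(sources)
    if not sources:
        # Added guard (assumption): fail with a clear message on empty input
        raise ValueError("no sources to mix")
    min_len = min(s.shape[-1] for s in sources)
    return np.stack([s[..., :min_len] for s in sources]).sum(0)


# Synthetic stand-in for the output of track.mix_stems(...) (assumption)
stems = {
    "vocals": np.ones((2, 1000)),
    "drums": np.ones((2, 1024)),
    "bass": np.ones((2, 1000)),
}

# Trim each stem to the common length so stems and mixture stay aligned
min_len = min(s.shape[-1] for s in stems.values())
trimmed = {name: s[..., :min_len] for name, s in stems.items()}

mixture = trim_and_mix(stems.values())
print(mixture.shape)
```

The trimmed stems and the mixture can then be written out with the same length.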

Best,