aim-qmul / sdx23-aimless

Source Separation training codebase for the Sound Demixing Challenge 2023.

[WIP] Integrate audio effects #29

Closed · csteinmetz1 closed this 1 year ago

csteinmetz1 commented 1 year ago

This is still a bit rough but operational. I need some advice on how best to integrate it with the existing transforms. There are many parameters that could be exposed, but my thinking was to fine-tune some of the values and set them as defaults in each effect class, then use a single `RandomAudioEffectChannel` class as the main interface.
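For illustration, a minimal sketch of what such a chained interface could look like, assuming each effect is a callable that randomizes its own parameters internally (the class body and argument names here are hypothetical, not the actual PR code):

```python
import random
import torch


class RandomAudioEffectChannel:
    """Hypothetical sketch: apply a fixed chain of effects, each with
    tuned default parameter ranges, to a single stem."""

    def __init__(self, effects, p: float = 0.5):
        self.effects = effects  # e.g. [RandomEQ(), RandomCompressor(), ...]
        self.p = p              # per-effect probability of being applied

    def __call__(self, stem: torch.Tensor) -> torch.Tensor:
        # stem: (channels, samples) audio for one source
        for effect in self.effects:
            if random.random() < self.p:
                stem = effect(stem)  # each effect draws its own random params
        return stem
```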

yoyolicoris commented 1 year ago

I think merging multiple effects into RandomAudioEffectChannel is OK, but you lose some flexibility, such as altering the order of the effects. In addition, the functionality of effects overlaps with transforms. I would suggest reusing the CPUBase interface and putting them into transforms, similar to LimitAug.
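As a rough sketch of that suggestion, an effect written as a CPUBase-style transform might look like the following. The `CPUBase` stub here is only a stand-in so the example runs on its own; the real base class in `augment/cpu.py` and the way `LimitAug` hooks into it may differ.

```python
import torch


class CPUBase:
    """Stand-in for the repo's CPUBase, assumed to dispatch to _transform."""

    def __call__(self, stems):
        return self._transform(stems)

    def _transform(self, stems):
        raise NotImplementedError


class RandomGainEffect(CPUBase):
    """Hypothetical effect-as-transform, analogous in spirit to LimitAug."""

    def __init__(self, min_db: float = -6.0, max_db: float = 6.0):
        self.min_db = min_db
        self.max_db = max_db

    def _transform(self, stems: torch.Tensor) -> torch.Tensor:
        # stems: (sources, channels, samples); one gain draw per call
        gain_db = torch.empty(1).uniform_(self.min_db, self.max_db)
        return stems * 10 ** (gain_db.item() / 20)
```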

csteinmetz1 commented 1 year ago

> but you lose some flexibility, such as altering the order of the effects.

Agreed. But I don't think we will need it. We really just need to fine-tune the pipeline once and leave it.

> I would suggest reusing the CPUBase interface and putting them into transforms, similar to LimitAug.

The reason I didn't do this is that I don't think it makes as much sense to assume the input to the transforms will be both x and y. When creating the mix, ideally we want to call each transform/effect on each stem individually so the parameters are randomized per stem, and I wanted to avoid putting that loop in every transform/effect. I thought it made more sense to perform the loop in `__getitem__()` and then create the mixture after applying all effects to all stems.

If we want to go that way, maybe the effects need their own base class that implements this looping, so it's not re-implemented in each effect.
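A minimal sketch of that idea, where a shared base class owns the per-stem loop and each concrete effect only implements a single-stem method (names like `AudioEffect` and `_apply` are illustrative, not taken from the PR):

```python
import torch


class AudioEffect:
    """Hypothetical base class that loops over stems so each stem draws
    its own random parameters."""

    def __call__(self, stems: torch.Tensor) -> torch.Tensor:
        # stems: (sources, channels, samples)
        return torch.stack([self._apply(stem) for stem in stems])

    def _apply(self, stem: torch.Tensor) -> torch.Tensor:
        raise NotImplementedError


class RandomPolarityFlip(AudioEffect):
    def _apply(self, stem: torch.Tensor) -> torch.Tensor:
        # 50% chance of flipping the polarity of this stem
        return -stem if torch.rand(()) < 0.5 else stem
```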

yoyolicoris commented 1 year ago

> Agreed. But I don't think we will need it. We really just need to fine-tune the pipeline once and leave it.

That's fair.

> The reason I didn't do this is that I don't think it makes as much sense to assume the input to the transforms will be both x and y. When creating the mix, ideally we want to call each transform/effect on each stem individually so the parameters are randomized per stem, and I wanted to avoid putting that loop in every transform/effect. I thought it made more sense to perform the loop in `__getitem__()` and then create the mixture after applying all effects to all stems.

I got your point. Actually, RandomSwapLR and RandomFlipPhase also have a for-loop inside, so it's worth extracting this logic. That's trivial: we can add a for-loop on top of this line: https://github.com/yoyololicon/mdx23-aim-playground/blob/8bfbd556d7f122f8fe6616abf2fe7a39f608f8e4/augment/cpu.py#L26 so that `_transform` can treat each track separately. Regarding whether the loop should live inside each transform or in `__getitem__`, if there are no performance-related reasons I prefer the former, which is simpler and looks more functional (considering that other dataset classes would otherwise have to implement their own loop).
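A hedged illustration of that suggestion, assuming `CPUBase.__call__` currently hands the whole stem stack to `_transform`; moving the loop into `__call__` lets every subclass's `_transform` see one track at a time. The real signatures in `augment/cpu.py` may differ.

```python
import torch


class CPUBase:
    """Sketch: the base class owns the per-track loop."""

    def __call__(self, stems: torch.Tensor) -> torch.Tensor:
        # stems: (sources, channels, samples); each track gets its own
        # independently randomized transform
        return torch.stack([self._transform(track) for track in stems])

    def _transform(self, track: torch.Tensor) -> torch.Tensor:
        # subclasses randomize and apply their effect to a single track
        raise NotImplementedError
```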

csteinmetz1 commented 1 year ago

Can you take a look, @yoyololicon?

csteinmetz1 commented 1 year ago

Are you happy to merge this now?

yoyolicoris commented 1 year ago

Nice. Thank you for applying these suggestions. It's a great PR.