lucidrains / BS-RoFormer

Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs
MIT License

Multi-stem tensor fix, input length=output length option, MLP depth adjustment #23

Closed: crlandsc closed this 9 months ago

crlandsc commented 9 months ago

Changes:

- Fix for the multi-stem output tensor
- Option to force output length to match input length
- Adjustable MLP depth

lucidrains commented 9 months ago

@crlandsc hey Christopher! this looks great! thank you for addressing the multiple-stem situation for mel-band roformer; i had overlooked it

crlandsc commented 9 months ago

No problem, thanks for reviewing so quickly and for this paper implementation! It's fantastic!

Are all of the default args for the mel-roformer based on the paper? I was having trouble replicating the exact number of parameters they report.
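For anyone trying the same comparison, a back-of-the-envelope parameter count for the transformer layers is a quick sanity check against the paper's reported sizes. The formulas below are generic transformer accounting (not the repo's actual modules), and they ignore small terms like norms, biases, and the band split/estimate heads, so expect only a ballpark match:

```python
# Rough parameter-count estimate for pre-norm transformer blocks.
# Generic accounting only -- names and formulas are illustrative,
# not taken from the bs_roformer source.

def block_params(dim, ff_mult=4):
    attn = 4 * dim * dim          # q, k, v and output projections
    ff = 2 * ff_mult * dim * dim  # two feedforward projections
    return attn + ff

def axial_params(dim, depth, time_depth=1, freq_depth=1):
    # each of `depth` layers runs time_depth time-attention blocks
    # and freq_depth frequency-attention blocks
    return depth * (time_depth + freq_depth) * block_params(dim)

print(axial_params(dim=384, depth=12))  # -> 42467328, i.e. ~42M
```

If your count is off by a large factor rather than a few percent, the mismatch is more likely in `dim`, `depth`, or the per-layer time/freq transformer depths than in the small terms.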

lucidrains commented 9 months ago

thanks for the kind words!

there could be tiny differences, but i think all the main points of the paper are already in there (axial attention, rotary relative positions, and overlapping frequency bands for the mel-band roformer variant)
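The axial attention pattern mentioned above can be sketched in a few lines: a toy single-head attention is applied along the time axis (independently per band), then along the band axis (independently per time step). The shapes and function names here are illustrative only, not the repo's actual modules:

```python
import numpy as np

def attend(x):
    # x: (..., seq, dim) -> plain softmax self-attention along `seq`
    scores = x @ x.swapaxes(-1, -2) / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def axial_attention(x):
    # x: (batch, time, bands, dim)
    x = attend(x.swapaxes(1, 2)).swapaxes(1, 2)  # attend over time, per band
    x = attend(x)                                # attend over bands, per time step
    return x

x = np.random.randn(2, 16, 8, 4)
print(axial_attention(x).shape)  # (2, 16, 8, 4)
```

The payoff of the axial factorization is cost: attention over time and bands separately scales with `time**2 + bands**2` per position rather than `(time * bands)**2` for full attention over the flattened grid.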

lucidrains commented 9 months ago

by default i am using the librosa mel bank as they detailed in the paper
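As a sketch of why mel-spaced bands overlap: triangular mel filters place each filter's peak at the neighboring filter's edge, so adjacent bands share frequency range. The snippet below uses the HTK mel formula for self-containment; librosa's default (Slaney) scale differs slightly, so the exact edge values are illustrative:

```python
import math

def hz_to_mel(f):
    # HTK mel formula; librosa defaults to the Slaney variant
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(n_bands, fmin=0.0, fmax=22050.0):
    # n_bands + 2 equally spaced points on the mel scale
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    hz = [mel_to_hz(lo + i * (hi - lo) / (n_bands + 1))
          for i in range(n_bands + 2)]
    # band i spans (hz[i], hz[i + 2]) with its triangular peak at hz[i + 1]
    return [(hz[i], hz[i + 2]) for i in range(n_bands)]

edges = mel_band_edges(60)
# each band's upper edge passes the next band's lower edge -> overlap
print(all(edges[i][1] > edges[i + 1][0] for i in range(len(edges) - 1)))  # True
```

In librosa itself this corresponds to `librosa.filters.mel(sr=..., n_fft=..., n_mels=...)`, which returns the (n_mels, 1 + n_fft // 2) filterbank matrix whose nonzero columns define each band's bin membership.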

crlandsc commented 9 months ago

Ok, that is pretty much what I thought. I was also making sure that I wasn't messing something up on my end as I started to train with it. Thanks again!

lucidrains commented 9 months ago

thanks for the PR!