lucidrains / BS-RoFormer

Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs
MIT License

small updates to match the paper's settings #2

Closed. shenberg closed 11 months ago

shenberg commented 11 months ago

Section 4.4 specifies "We apply a Hann window size of 2048 and a hop size of 10 ms for STFT to compute the complex spectrogram..." and Section 4.1 says "All recordings are stereo with a sampling rate of 44.1k Hz", so I updated the constants accordingly.
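
For reference, the paper's hop size is given in milliseconds, while STFT routines usually take it in samples. A minimal sketch of the conversion follows. The constant names are hypothetical and not taken from the repository, and the window size of 2048 is assumed to be in samples.

```python
# Hypothetical constant names, for illustration only.
SAMPLE_RATE = 44_100   # Hz, from Section 4.1 of the paper
HOP_SECONDS = 0.010    # 10 ms, from Section 4.4 of the paper
N_FFT = 2048           # Hann window size, assumed to be in samples

# Hop size in samples: sampling rate times hop duration.
hop_length = round(SAMPLE_RATE * HOP_SECONDS)

# Fraction of each window shared with the next one.
overlap = 1 - hop_length / N_FFT

# Leftover samples when tiling the window with whole hops.
remainder = N_FFT % hop_length

print(hop_length, remainder, f"{overlap:.1%}")
```

Under these assumptions the hop comes out to 441 samples, which does not divide the 2048-sample window evenly.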

lucidrains commented 11 months ago

@shenberg hey Roee! thank you! i don't know the audio hyperparameters well enough

hello from SF as well :)

shenberg commented 11 months ago

Wow that was peak fast-response! Thanks! Also, I'm just moving to Paris, my profile ain't up to date :)

lucidrains commented 11 months ago

from one great city to another!

faroit commented 11 months ago

@shenberg @lucidrains I wouldn't recommend a hop size that isn't the window size // 2 or // 4, though, as this hurts the perfect-reconstruction abilities of the STFT window

lucidrains commented 11 months ago

@faroit oh, so you would recommend 512?

faroit commented 11 months ago

Yeah. That's much safer

lucidrains commented 11 months ago

ok, let's roll with that!
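
The concern about hop sizes can be illustrated with a constant-overlap-add (COLA) check: shifted copies of the window are summed, and the sum is tested for flatness. The sketch below uses only NumPy. The helper names `periodic_hann` and `overlap_add_ripple` are hypothetical, and a periodic Hann window is assumed. COLA on the window itself is a textbook sufficient condition. Some inverse-STFT implementations instead divide by the summed squared window and can still invert other hop sizes, so treat this as an illustration of the concern rather than proof that a hop of 441 cannot work.

```python
import numpy as np


def periodic_hann(n_fft: int) -> np.ndarray:
    """Periodic (DFT-even) Hann window of length n_fft."""
    n = np.arange(n_fft)
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * n / n_fft)


def overlap_add_ripple(window: np.ndarray, hop: int, n_frames: int = 32) -> float:
    """Relative peak-to-peak variation of the overlap-added window.

    A value near zero means the shifted windows sum to a constant (COLA holds).
    """
    n_fft = len(window)
    total = np.zeros(hop * (n_frames - 1) + n_fft)
    for i in range(n_frames):
        total[i * hop : i * hop + n_fft] += window
    # Drop the edges, where fewer frames overlap than in steady state.
    interior = total[n_fft:-n_fft]
    return float((interior.max() - interior.min()) / interior.mean())


window = periodic_hann(2048)
ripples = {hop: overlap_add_ripple(window, hop) for hop in (441, 512, 1024)}
for hop, ripple in ripples.items():
    print(f"hop={hop:4d}  relative ripple={ripple:.2e}")
```

With these assumptions, hops of 512 and 1024 give a ripple at floating-point noise level, while a hop of 441 leaves a visible residual modulation.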