The Issue
When the Overlap option is set to a value above 8, BS-Roformer skips audio periodically (I only tested with model_bs_roformer_ep_317_sdr_12.9755.ckpt). Whether this is a problem with UVR (with BS-Roformer sharing model settings with MDX-NET), with the model itself, or with both, I don't know.
The Test
Using the April 14, 2024 patch, I separated all of the tracks with the segment size set to 1024.
![Screenshot 2024-05-24 191725](https://github.com/Anjok07/ultimatevocalremovergui/assets/105120272/ea9b5629-1700-4cd8-ab3d-40c18a01b5f6)
The white track is a baseline separation using MDX23C, with the overlap set to 8.
The red track used BS-Roformer (model_bs_roformer_ep_317_sdr_12.9755.ckpt) with the overlap set to 8.
The green track used BS-Roformer (model_bs_roformer_ep_317_sdr_12.9755.ckpt) with the overlap set to 9.
The blue track used BS-Roformer (model_bs_roformer_ep_317_sdr_12.9755.ckpt) with the overlap set to 10.
It seems that BS-Roformer can only process 8 seconds of audio per chunk and discards the rest of the audio fed into it. The evidence is that, in each cycle, the length of the part with audio plus the length of the part without audio equals the Overlap setting in seconds.
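To make the hypothesis concrete, here is a minimal Python sketch of where it predicts the silent gaps should fall. This is only a model of my guess, not UVR's actual code: the function name `predicted_gaps` and the `processed_seconds` parameter are made up for illustration, and the 8-second limit is the assumption being tested.

```python
def predicted_gaps(total_seconds, overlap, processed_seconds=8):
    """Return the (start, end) spans of silence the hypothesis predicts.

    Assumption (not confirmed against UVR's source): each cycle lasts
    `overlap` seconds, but only the first `processed_seconds` of it
    contain audio; the remainder of the cycle is silent.
    """
    if overlap <= 0:
        raise ValueError("overlap must be positive")
    gaps = []
    cycle_start = 0
    while cycle_start < total_seconds:
        # End of the audible part of this cycle.
        audio_end = cycle_start + min(processed_seconds, overlap)
        # End of the cycle, clipped to the length of the track.
        cycle_end = min(cycle_start + overlap, total_seconds)
        if audio_end < cycle_end:
            gaps.append((audio_end, cycle_end))
        cycle_start += overlap
    return gaps


if __name__ == "__main__":
    for overlap in (8, 9, 10):
        print(overlap, predicted_gaps(30, overlap))
```

If the hypothesis is right, a 30-second clip should show no gaps at overlap 8, a 1-second gap every 9 seconds at overlap 9, and a 2-second gap every 10 seconds at overlap 10, which can be compared against the red, green, and blue tracks.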
Solutions?
To prevent this, cap the overlap to 8 when the chosen model is a BS-Roformer model, so that users can't set it to something above 8. I don't know if this also affects other Roformer models, as I haven't tested it deeply.
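As a rough illustration of the proposed cap, the check could look something like the Python sketch below. I haven't looked at how UVR stores these settings, so everything here is hypothetical: the function `effective_overlap`, the constant `BS_ROFORMER_MAX_OVERLAP`, and detecting the model by looking for "bs_roformer" in the file name are all assumptions.

```python
# Hypothetical limit, based only on the behaviour observed in the test above.
BS_ROFORMER_MAX_OVERLAP = 8


def effective_overlap(requested_overlap, model_name):
    """Clamp the overlap to 8 for BS-Roformer models; leave others unchanged.

    Detecting the model type from its file name is an assumption made
    for this sketch, not how UVR is known to identify models.
    """
    if "bs_roformer" in model_name.lower():
        return min(requested_overlap, BS_ROFORMER_MAX_OVERLAP)
    return requested_overlap


if __name__ == "__main__":
    name = "model_bs_roformer_ep_317_sdr_12.9755.ckpt"
    print(effective_overlap(10, name))
    print(effective_overlap(10, "MDX23C"))
```

Clamping the value rather than rejecting it keeps existing saved settings usable, though limiting the choices in the Overlap dropdown itself would make the restriction more obvious to users.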