Open lucellent opened 2 years ago
Thank you for reporting this issue. It is possible that the quieter signal is due to some part of the signal being dropped. With the previous iteration of Demucs I remember the sum of all stems being very close to the full signal; however, it is possible this is no longer the case, and this should be investigated.
Sadly, in the short term, I am unlikely to provide an automated fix, and you would have to adjust the volume yourself.
Higher overlap is usually better (but also slower to evaluate). Zero overlap will result in some discontinuities every 44 seconds.
@lucellent : Could you share the process whereby you adjust the levels to match the original? Matching the original loudness when all of the stems are layered back together matters a lot for some of my use cases, and any advice on how to do so would be greatly appreciated. I am also curious about any findings you have on how much overlap might help minimize this issue; these cases are a minority of my use cases and I'll happily trade GPU compute time for better accuracy in this regard.
@awesomer it's not really complicated: you load all the stems + the original song in software like Adobe Audition or Audacity, and invert the original track (invert the phase). When you play back everything now (all of the stems on separate tracks + the inverted original song), you'll hear if there's any loudness mismatch.
For example, if the drums are supposed to be louder, you'll hear drums. Then it's just a matter of manually increasing the volume slowly until you hear silence (that means you've matched the volume).
Hope that explains it?
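For anyone who would rather script this than do it by ear, here is a minimal sketch of the same null test in NumPy. It assumes the stems and the original are already loaded as sample-aligned float arrays of equal length (loading is left out), and the function names `residual_db` and `fit_stem_gains` are made up for this example; they are not part of Demucs. Instead of nudging faders, it fits one constant gain per stem by least squares. The demo uses synthetic noise, not real Demucs output.

```python
import numpy as np

def residual_db(original, stems):
    """RMS level in dB of (original - sum of stems): the scripted 'invert and listen'."""
    residual = original - np.sum(stems, axis=0)
    rms = np.sqrt(np.mean(residual ** 2))
    return 20 * np.log10(max(rms, 1e-12))  # floor avoids log10(0) on a perfect null

def fit_stem_gains(original, stems):
    """One constant gain per stem, minimizing ||original - sum_i g_i * stem_i||."""
    A = np.stack([np.ravel(s) for s in stems], axis=1)  # shape: (samples, n_stems)
    gains, *_ = np.linalg.lstsq(A, np.ravel(original), rcond=None)
    return gains

# Synthetic demo (assumption: random noise stands in for real audio).
rng = np.random.default_rng(0)
true_stems = rng.standard_normal((4, 44100))
original = true_stems.sum(axis=0)

quiet = true_stems.copy()
quiet[0] *= 10 ** (-3 / 20)  # pretend the first stem came out 3 dB too quiet

gains = fit_stem_gains(original, quiet)
gains_db = 20 * np.log10(gains)
print("suggested correction per stem (dB):", np.round(gains_db, 2))
print("residual before:", round(residual_db(original, quiet), 1), "dB")
print("residual after: ", round(residual_db(original, quiet * gains[:, None]), 1), "dB")
```

On real separations the residual will not drop to silence even with the best gains, since the stems are not exact scaled copies of the true sources; the fit only tells you which constant gains give the deepest null.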
FTR, I tried the above technique, but it seems that not only do tracks have different volume as a whole, but they also have different volumes throughout their duration, so that if one part of the track is corrected via the above means, other parts are de-corrected.
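One way to check whether the mismatch really drifts over time is to repeat the gain fit in short windows and compare the results. The sketch below assumes mono, sample-aligned float arrays; `windowed_gain_db` and the 5-second window are choices made for this example only. The demo data is synthetic: one stem is attenuated by 3 dB in the second half only.

```python
import numpy as np

def windowed_gain_db(original, stems, sr, window_s=5.0):
    """Fit least-squares stem gains independently in each window.

    Returns an array of shape (n_windows, n_stems) with the suggested
    correction in dB for each stem in each window.
    """
    hop = int(sr * window_s)
    rows = []
    for start in range(0, original.shape[-1] - hop + 1, hop):
        A = np.stack([s[start:start + hop] for s in stems], axis=1)
        g, *_ = np.linalg.lstsq(A, original[start:start + hop], rcond=None)
        rows.append(20 * np.log10(np.maximum(np.abs(g), 1e-12)))
    return np.array(rows)

# Synthetic demo (assumption: noise stands in for real stems).
rng = np.random.default_rng(1)
sr = 8000
stems = rng.standard_normal((4, sr * 20))  # 4 stems, 20 seconds each
original = stems.sum(axis=0)

drifting = stems.copy()
drifting[0, sr * 10:] *= 10 ** (-3 / 20)  # first stem is 3 dB quiet only in the second half

table = windowed_gain_db(original, drifting, sr)
print(np.round(table, 2))  # rows are 5 s windows, columns are stems
```

If the rows of the table differ a lot for one stem, a single fader move cannot null it everywhere, which would match the behaviour described above. Windows where a stem is silent or nearly silent give meaningless values for that stem and should be ignored.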
Are you sure? I've been correcting the volume of the stems for every track I've done and I don't think I've encountered this issue. You will actually hear some noise if all the tracks are at their correct levels and you've inverted them with the original song, it's not supposed to be completely silent. But I'd still love if this issue wasn't there in the first place (doesn't happen all the time, but most of the time)
It certainly seemed that way when I tried to invert "Topdown" by Channel Tres. Specifically the vocal channel I was unable to adjust, it would cancel out for one part but then be significantly audible for other parts. I would be curious if you had the same result with that song as input.
Could you upload some of your files? That would make it easier for us to find the bug.
@CarlGao4 : My example file here, where the vocal channel seems to become different volumes throughout, is - https://www.dropbox.com/s/9zcgvbtn7kyyy0x/4.%20Channel%20Tres%20-%20Topdown.flac?dl=0
The new demucs v3 is awesome. It really is a big improvement over the last one and it's definitely the best open-source tool right now.
There's one thing that I noticed however, some stems have different loudness levels than others, so in the end, when you combine all 4 stems, the song is not the same as the original output.
The difference varies, but I found that the drums are generally quieter than they should be, by around 2-3 dB. The vocals too, but not by much.
I figured this out by phase inverting the original song against all 4 stems and adjusting the volume levels until I heard nothing. I don't know if this is a bug or that's just how demucs works, but is there a possible fix? I'm running songs with 0.15 overlap, and when I tried 0.00 overlap it seemed to make the drums specifically louder, but I think still not enough (I don't know if a lower overlap value is better or worse for the stems).
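For reference, a level difference in dB maps to a linear amplitude factor of 10^(dB/20), so "2-3 dB quieter" means the stem would need to be multiplied by roughly 1.26 to 1.41. A tiny sketch (the helper name `db_to_gain` is just for illustration):

```python
def db_to_gain(db: float) -> float:
    """Convert a level change in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)

for db in (2.0, 3.0):
    print(f"+{db:.0f} dB -> multiply samples by {db_to_gain(db):.3f}")
# +2 dB -> multiply samples by 1.259
# +3 dB -> multiply samples by 1.413
```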