Closed empz closed 14 hours ago
I agree! Please raise a PR 😊
I agree! Please raise a PR 😊
I've looked into the code and couldn't find any bitrate settings anywhere. Could you explain how the mp3 encoding works? I assume the UVR-based separation models do not output in mp3 natively, is that right? If so, I assumed there was some ffmpeg-based post processing to encode the outputs into mp3, but again couldn't find anything :/
Yeah there isn't currently anything specifically implemented for this, I don't use lossy output from audio-separator
personally (I use FLAC for both input and output format so there's no lossy compression as part of the separation process) so I haven't really spent any time on it.
You're right, it is essentially ffmpeg
doing the output encoding, but it uses the PyDub Python wrapper (https://github.com/jiaaro/pydub) for interacting with ffmpeg
. Here's the line you'll probably want to add functionality to:
From their README it looks like there's a simple bitrate
parameter for the export
method, but you'll need to add a tiny bit of logic to only add that bitrate
parameter for certain output formats (e.g. MP3 or other lossy formats) and make sure you only pass it in if it's a valid value for that format (e.g. only set it to 320k if it's MP3, some other lossy formats probably won't support that bitrate)
Hope that makes sense, and good luck! 🙇
Thanks, I was able to find it and I'm already working on a PR. My plan is to add an optional argument output_mp3_bitrate that only gets applied if the output_format="MP3". Does that sound good enoufh to you?
EDIT: Also making 320k the default for mp3, unless you want to keep the current 128k as default.
Great to hear!
Hmm, ideally I'd probably pass through output_bitrate
as an additional class parameter and CLI parameter, alongside the existing output_format
one, and only set the bitrate
parameter on export()
if that output_bitrate
value is set.
That way you're leaving it up to the user to decide what they want ffmpeg
to do (and that way if some user is using some less common lossy format like m4a
, they can still try to specify bitrate themselves without you needing to test compatibility)
I do think defaulting to 320k for MP3 is a good idea though!
(I hate lossy audio compression generally so would be happy for it to always default to the maximum possible bitrate for any given lossy format by the way, but I'm just thinking of what's the easiest to implement & most flexible for users which is why I'm not wild about an MP3-specific PR just to add output_mp3_bitrate
)
I feel like a generic output_bitrate
would be more versatile but I don't have super strong opinions here - if you raise a PR doing what you think is best and it adds functionality without breaking folks' existing workflows, I'm cool with it!
(I hate lossy audio compression generally so would be happy for it to always default to the maximum possible bitrate for any given lossy format by the way, but I'm just thinking of what's the easiest to implement & most flexible for users which is why I'm not wild about an MP3-specific PR just to add
output_mp3_bitrate
)I feel like a generic
output_bitrate
would be more versatile but I don't have super strong opinions here - if you raise a PR doing what you think is best and it adds functionality without breaking folks' existing workflows, I'm cool with it!
https://github.com/nomadkaraoke/python-audio-separator/pull/104
When using
output_format="MP3"
while instantiating aSeparator
class, the output resuls in files with a bitrate of 128kbps. I belive this should default to 320kbps, but if you don't agree, an extra option to define this would be ideal.