slhck / ffmpeg-normalize

Audio Normalization for Python/ffmpeg
MIT License

Different audio_bitrate for different surround/stereo sound streams #232

Closed joshinils closed 1 year ago

joshinils commented 1 year ago

I have a script with which I normalize movie audio: all of the streams at once, as if I were to invoke ffmpeg-normalize manually, just without forgetting all the details. I have recently switched from re-encoding the audio in HandBrake to copying the streams over. Of course this has resulted in larger file sizes. Previously the normalized version was at most a little bigger than the handbroken file. Now the handbroken file is significantly smaller than the normalized version (and the handbroken file is bigger to begin with, of course).

Example:

original: 24.1GB = 18GB video + 6.1GB audio
handbroken: 11.8GB = 5.7GB video/subs + 6.1GB copied audio
normalized: 19.7GB = 5.7GB video/subs + 14GB audio

I would like to know if when I add the -c:a option to my script will it use the same bitrate for all streams? I would want to use different ones if it is stereo / 5.1 / 7.2, of course, is that possible, and if so how?

The original sample_rate is 48 kHz. I see that the sample rate is the same on all input streams, and thus I chose to use the one from the first audio stream for -ar.

The bitrate is 448 kb/s for 5.1 and 192 kb/s for stereo.

The HandBrake default for audio re-encoding is 160 kb/s, which is lower than both, meaning of course that it produced more heavily compressed audio.

Even a plain `ffmpeg -i audio.mkv -map 0 audio_2.mkv -y` shrinks the 6.1GB to 1.8GB. But I cannot get ffprobe to tell me which bitrate the resulting streams have; it only reports N/A.

ffprobe audio.mkv
```
  Stream #0:0(ger): Audio: truehd, 48000 Hz, 5.1(side), s32 (24 bit) (default)
    Metadata:
      title           : Surround 5.1
      DURATION        : 01:37:58.873000000
  Stream #0:1(ger): Audio: truehd, 48000 Hz, 5.1(side), s32 (24 bit)
    Metadata:
      title           : Surround 5.1
      DURATION        : 01:37:58.873000000
  Stream #0:2(ger): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      title           : Surround 5.1
      DURATION        : 01:37:58.880000000
  Stream #0:3(eng): Audio: truehd, 48000 Hz, 5.1(side), s32 (24 bit)
    Metadata:
      title           : Surround 5.1
      DURATION        : 01:37:58.873000000
  Stream #0:4(eng): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      title           : Surround 5.1
      DURATION        : 01:37:58.880000000
  Stream #0:5(eng): Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
    Metadata:
      title           : Stereo
      DURATION        : 01:37:58.880000000
  Stream #0:6(eng): Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
    Metadata:
      title           : Stereo
      DURATION        : 01:37:58.880000000
  Stream #0:7(cze): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      title           : Surround 5.1
      DURATION        : 01:37:58.880000000
  Stream #0:8(hun): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      title           : Surround 5.1
      DURATION        : 01:37:58.880000000
  Stream #0:9(pol): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      title           : Surround 5.1
      DURATION        : 01:37:58.880000000
  Stream #0:10(rus): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      title           : Surround 5.1
      DURATION        : 01:37:58.880000000
```

ffprobe audio_2.mkv
```
  Stream #0:0(ger): Audio: vorbis, 48000 Hz, 5.1, fltp (default)
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.876000000
  Stream #0:1(ger): Audio: vorbis, 48000 Hz, 5.1, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.876000000
  Stream #0:2(ger): Audio: vorbis, 48000 Hz, 5.1, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.883000000
  Stream #0:3(eng): Audio: vorbis, 48000 Hz, 5.1, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.877000000
  Stream #0:4(eng): Audio: vorbis, 48000 Hz, 5.1, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.883000000
  Stream #0:5(eng): Audio: vorbis, 48000 Hz, stereo, fltp
    Metadata:
      title           : Stereo
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.883000000
  Stream #0:6(eng): Audio: vorbis, 48000 Hz, stereo, fltp
    Metadata:
      title           : Stereo
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.883000000
  Stream #0:7(cze): Audio: vorbis, 48000 Hz, 5.1, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.883000000
  Stream #0:8(hun): Audio: vorbis, 48000 Hz, 5.1, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.883000000
  Stream #0:9(pol): Audio: vorbis, 48000 Hz, 5.1, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.883000000
  Stream #0:10(rus): Audio: vorbis, 48000 Hz, 5.1, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 libvorbis
      DURATION        : 01:37:58.883000000
```

Excerpt from my small script:

codec=$(ffprobe -v error -select_streams a:0 -show_entries stream=codec_name -of default=noprint_wrappers=1:nokey=1 "./${1}")
sample_rate=$(ffprobe -v error -select_streams a -of default=noprint_wrappers=1:nokey=1 -show_entries stream=sample_rate "./${1}"|head -n1)
ffmpeg-normalize -pr -f -ar "${sample_rate}" --extension "${extension}" -c:a "${codec}" "./${1}" -e "-strict -2"

slhck commented 1 year ago

I would like to know if when I add the -c:a option to my script will it use the same bitrate for all streams?

This is the overall bitrate that applies to all streams; there is no way to set it per-stream and let ffmpeg figure out the rest. (Ticket here)

So you need to first figure out the number of channels, e.g. like this:

numChannels=$(ffprobe "$input" -show_entries stream=channels -of compact=p=0:nk=1 -v 0)

Then calculate the required bitrate beforehand.

Does this answer your question?

joshinils commented 1 year ago

hm, I'm afraid I'm still confused. when I do that it prints the number of channels per stream, which is interesting, but does that help?

6
6
6
6
6
2
2
6
6
6
6

of course, a 5.1 stream with 6 channels has a different bitrate to a stereo stream, that's why I'm asking. I don't want to give the 5.1 too little bitrate and a stereo stream too much. I sort of only want to specify the bitrate per channel, not per stream if possible.

Otherwise, would I need to extract every audio stream and normalize it by itself? would that then produce the same level as when normalizing a file with multiple streams?

slhck commented 1 year ago

I don't want to give the 5.1 too little bitrate and a stereo stream too much. I sort of only want to specify the bitrate per channel, not per stream if possible.

Yes, so you need to define the bitrate you want per channel (e.g. 96 kBit/s) and multiply the number of channels with it. This is the bitrate you will pass to ffmpeg-normalize. You would simply have to adapt your script to do the multiplication.
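
A minimal sketch of that multiplication, assuming a per-channel budget of 96 kbit/s as in the example figure above. The function name `bitrate_for_channels` is hypothetical and only serves this illustration:

```shell
#!/usr/bin/env bash
# Sketch: derive a total bitrate from a per-channel budget.
# The function name and the 96 kbit/s default are illustrative assumptions.

# bitrate_for_channels CHANNELS [KBIT_PER_CHANNEL]
# Prints the total bitrate in ffmpeg's "<n>k" notation.
bitrate_for_channels() {
    local channels=$1
    local per_channel_kbit=${2:-96}
    echo "$(( channels * per_channel_kbit ))k"
}

bitrate_for_channels 2   # stereo
bitrate_for_channels 6   # 5.1
```

The printed value is what would then be passed as the bitrate option (-b:a) of ffmpeg-normalize.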

joshinils commented 1 year ago

I understand the multiplication part, but does ffmpeg-normalize apply the same bitrate for a stereo stream and a 6-channel or 8-channel stream?

slhck commented 1 year ago

It simply uses ffmpeg to pass the bitrate, so it's the total bitrate for all streams.

joshinils commented 1 year ago

ok, so ffmpeg can't do what I want. :crying_cat_face: I extracted a 1 min clip for testing and normalized it with:

ffmpeg-normalize -pr -f -ar "${sample_rate}" --extension $extension -c:a ac3 -b:a 160000 "./${1}" -e "-strict -2" >> "${1}".log 2>&1

Then ffprobe shows:

  Stream #0:6(eng): Audio: ac3, 48000 Hz, 5.1(side), fltp, 160 kb/s
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:7(eng): Audio: ac3, 48000 Hz, stereo, fltp, 160 kb/s
    Metadata:
      title           : Stereo
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000

Which indeed means that, per channel, the stereo stream got three times as much bitrate as the 6-channel 5.1 stream, which is not what I want.

So, going forward: will normalizing each channel by itself work? Will they be at the same level when combined afterward, as if normalized together in one file?

slhck commented 1 year ago

I don't think this will work. The normalization filter takes into account all channels while calculating the statistics. If loudness should be changed, channels should not be treated in isolation, as that might make individual channels too loud/soft.

Is there anything that prevents you from setting the bitrate as needed depending on the number of channels? Maybe I didn't fully grasp your use case.

joshinils commented 1 year ago

Maybe my English is the problem: channel != stream. One stream has multiple channels: a stereo stream has two channels, a 5.1 stream has 6 channels. I want to set the bitrate per stream, not per channel, since one movie has multiple streams with different channel counts.

I am experimenting with -b:a:0, -b:a:1, etc., though I don't know which specific values I can set the bitrate to (and my google-fu is too weak in this case). It seems to be some multiple of 32k, but sometimes ffmpeg chooses a different bitrate than the one I set here:

#!/usr/bin/env bash

# 32k seems to be simply too little (ffmpeg complained about 32k), so the first stream uses 64k.
# Comments must not follow a trailing backslash, or the line continuation breaks.
ffmpeg -i 1_minute.mkv -map 0 -codec copy -c:a ac3 \
-b:a:0  64k \
-b:a:1  64k \
-b:a:2  96k \
-b:a:3  128k \
-b:a:4  160k \
-b:a:5  192k \
-b:a:6  224k \
-b:a:7  256k \
-b:a:8  288k \
-b:a:9  320k \
-b:a:10 352k \
-b:a:11 384k \
-b:a:12 416k \
out.mkv -y
  Stream #0:1(ger): Audio: ac3, 48000 Hz, 5.1(side), fltp, 64 kb/s (default)
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:2(ger): Audio: ac3, 48000 Hz, 5.1(side), fltp, 64 kb/s
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:3(ger): Audio: ac3, 48000 Hz, 5.1(side), fltp, 96 kb/s
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:4(ger): Audio: ac3, 48000 Hz, 5.1(side), fltp, 128 kb/s
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:5(eng): Audio: ac3, 48000 Hz, 5.1(side), fltp, 160 kb/s
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:6(eng): Audio: ac3, 48000 Hz, 5.1(side), fltp, 192 kb/s
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:7(eng): Audio: ac3, 48000 Hz, stereo, fltp, 224 kb/s
    Metadata:
      title           : Stereo
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:8(eng): Audio: ac3, 48000 Hz, stereo, fltp, 256 kb/s
    Metadata:
      title           : Stereo
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:9(cze): Audio: ac3, 48000 Hz, 5.1(side), fltp, 256 kb/s        #     <--------- I set it to 288k
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:10(hun): Audio: ac3, 48000 Hz, 5.1(side), fltp, 320 kb/s
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:11(pol): Audio: ac3, 48000 Hz, 5.1(side), fltp, 320 kb/s        #     <--------- I set it to 352k
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000
  Stream #0:12(rus): Audio: ac3, 48000 Hz, 5.1(side), fltp, 384 kb/s
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 ac3
      DURATION        : 00:01:00.000000000

I also don't know yet whether I want to use ac3 or aac. Just for testing I have to use ac3 for now, since aac does not show the bitrate in ffprobe.
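
For context on the two mismatches marked in the output above: AC-3 only defines a fixed set of 19 nominal bitrates, and neither 288k nor 352k is among them. The sketch below snaps a requested value to the nearest entry of that table. The tie-breaking rule (lower entry wins) is an assumption chosen because it matches the output above (288k became 256k, 352k became 320k); it is not a confirmed description of the encoder's logic. The function name `nearest_ac3_kbit` is hypothetical.

```shell
#!/usr/bin/env bash
# Nominal AC-3 bitrates in kbit/s.
AC3_KBITS=(32 40 48 56 64 80 96 112 128 160 192 224 256 320 384 448 512 576 640)

# nearest_ac3_kbit REQUESTED_KBIT
# Prints the table entry closest to the request. On a tie the lower entry
# wins (assumption: chosen to match the ffprobe output in this thread).
nearest_ac3_kbit() {
    local want=$1 best=${AC3_KBITS[0]} best_diff=-1 rate diff
    for rate in "${AC3_KBITS[@]}"; do
        diff=$(( rate > want ? rate - want : want - rate ))
        # Strict "<" keeps the first (lower) entry when two are equally close
        if (( best_diff < 0 || diff < best_diff )); then
            best=$rate
            best_diff=$diff
        fi
    done
    echo "$best"
}

for k in 288 352; do
    echo "${k}k -> $(nearest_ac3_kbit "$k")k"
done
```

Picking values straight from the table avoids the silent adjustment altogether.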

slhck commented 1 year ago

Oh, sorry for the confusion. Now I get the problem. I also didn't catch the difference between streams and channels in your description while it should have been obvious to me.

Indeed you have multiple streams and need to set a different bitrate for each. This can't be done with ffmpeg-normalize, and in this case you could theoretically normalize the individual streams first, then merge them back. Ignore my previous comment there, as it only talks about channels.

If you work with ffmpeg alone, specifying the index like -b:a:0 is the right way to go to set the bitrate for one stream.
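
One way to generate those indexed options from the channel counts could look like the sketch below. The function name `build_bitrate_args` and the 64 kbit/s per-channel budget are assumptions for illustration; the channel counts would come from an ffprobe call like the one earlier in this thread.

```shell
#!/usr/bin/env bash
# build_bitrate_args KBIT_PER_CHANNEL CHANNELS...
# Prints one "-b:a:<index> <rate>k" pair per audio stream, in stream order.
build_bitrate_args() {
    local per_channel_kbit=$1; shift
    local -a args=()
    local index=0 channels
    for channels in "$@"; do
        args+=("-b:a:${index}" "$(( channels * per_channel_kbit ))k")
        index=$(( index + 1 ))
    done
    printf '%s\n' "${args[*]}"
}

# Example: two 5.1 streams followed by one stereo stream
build_bitrate_args 64 6 6 2

# Possible use with real files (not executed here):
#   mapfile -t channels < <(ffprobe -v 0 -select_streams a \
#       -show_entries stream=channels -of compact=p=0:nk=1 "$input")
#   ffmpeg -i "$input" -map 0 -c copy -c:a ac3 \
#       $(build_bitrate_args 64 "${channels[@]}") out.mkv
```

The command substitution in the commented ffmpeg call is deliberately left unquoted so that the shell splits the generated string into separate arguments.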

I would probably just pick AAC, since it is less restrictive about the encoding bitrate. You should, however, see the resulting bitrate in ffprobe when using AAC. I have to check this.

joshinils commented 1 year ago

same script as above with aac, ffprobe prints:

  Stream #0:1(ger): Audio: aac (LC), 48000 Hz, 6 channels, fltp (default)
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:2(ger): Audio: aac (LC), 48000 Hz, 6 channels, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:3(ger): Audio: aac (LC), 48000 Hz, 6 channels, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:4(ger): Audio: aac (LC), 48000 Hz, 6 channels, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:5(eng): Audio: aac (LC), 48000 Hz, 6 channels, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:6(eng): Audio: aac (LC), 48000 Hz, 6 channels, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:7(eng): Audio: aac (LC), 48000 Hz, stereo, fltp
    Metadata:
      title           : Stereo
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:8(eng): Audio: aac (LC), 48000 Hz, stereo, fltp
    Metadata:
      title           : Stereo
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:9(cze): Audio: aac (LC), 48000 Hz, 6 channels, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:10(hun): Audio: aac (LC), 48000 Hz, 6 channels, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:11(pol): Audio: aac (LC), 48000 Hz, 6 channels, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
  Stream #0:12(rus): Audio: aac (LC), 48000 Hz, 6 channels, fltp
    Metadata:
      title           : Surround 5.1
      ENCODER         : Lavc58.134.100 aac
      DURATION        : 00:01:00.021000000
slhck commented 1 year ago

Can you attempt to get the bitrate like shown here? https://superuser.com/a/1541252/

joshinils commented 1 year ago

Can you attempt to get the bitrate like shown here? https://superuser.com/a/1541252/

Even when selecting all audio streams (a) instead of just a:0, it only prints N/A:

ffprobe -v 0 -select_streams a -show_entries stream=bit_rate -of compact=p=0:nk=1 out.mkv 
N/A
N/A
N/A
N/A
N/A
N/A
N/A
N/A
N/A
N/A
N/A
N/A

:shrug:

slhck commented 1 year ago

I can reproduce this. It seems that the bitrate information is not written to the file header when using .mkv. More information here:

https://github.com/HandBrake/HandBrake/issues/1609

If you instead mux to .mp4, you will get the bitrate information.
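
If staying with .mkv is preferred, the average bitrate can also be estimated from the packet sizes. This is an alternative that was not suggested in this thread, shown only as a rough sketch. The function name `average_kbit` is hypothetical, and the numbers in the example are made up.

```shell
#!/usr/bin/env bash
# average_kbit DURATION_SECONDS
# Reads one packet size in bytes per line on stdin and prints the
# average bitrate in kbit/s, rounded to the nearest integer.
average_kbit() {
    awk -v duration="$1" '
        { bytes += $1 }
        END { printf "%.0f\n", bytes * 8 / duration / 1000 }
    '
}

# Example with made-up numbers: 1,500,000 bytes over 60 seconds
printf '750000\n750000\n' | average_kbit 60

# Possible use with a real file (not executed here):
#   duration=$(ffprobe -v error -show_entries format=duration -of csv=p=0 out.mkv)
#   ffprobe -v error -select_streams a:0 -show_entries packet=size \
#       -of csv=p=0 out.mkv | average_kbit "$duration"
```

The result is an average over the whole stream, so it will not exactly equal the nominal bitrate requested from the encoder.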