Chocobozzz / PeerTube

ActivityPub-federated video streaming platform using P2P directly in your web browser
https://joinpeertube.org/
GNU Affero General Public License v3.0

Poor quality of ffmpeg aac encoder #5652

Open artenax opened 1 year ago

artenax commented 1 year ago

Describe the current behavior

Hello. I understand you cannot drop the ffmpeg aac encoder in favor of fdk (-vbr 5), or even faac, for licensing reasons, but please at least change the CBR mode to VBR. According to my tests this greatly improves quality. The bitrate will be about 200 kbps:

-c:a aac -b:a 256k

to

-c:a aac -q:a 2.3 -cutoff 18000

One more thing: right now the PeerTube encoder leaves a user-provided aac stream untouched (no transcoding), but only when it is in an mp4 container. So the user can supply their own high-quality aac (encoded by fdk or the commercial qaac). But if the aac is in an mkv container, PeerTube always re-encodes the audio (because it can't detect the aac bitrate from the mkv container?). You could also skip re-encoding aac audio in more cases.
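To see what an upload actually contains before PeerTube processes it, the first audio stream can be inspected with ffprobe. This is a minimal sketch, assuming ffprobe is on PATH; the helper name probe_audio and the file name in the usage line are placeholders, not anything from PeerTube. Matroska files often report the stream bit_rate as N/A, which might be related to the re-encoding described above, though that is only a guess.

```shell
# Print the codec name and bitrate of the first audio stream of a file.
# probe_audio is a hypothetical helper name, not part of PeerTube.
probe_audio() {
    ffprobe -v error -select_streams a:0 \
        -show_entries stream=codec_name,bit_rate \
        -of default=noprint_wrappers=1 "$1"
}
```

Usage: probe_audio upload.mkv prints one key=value line per requested field (codec_name and bit_rate).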

Something has to change. The ffmpeg aac 256k CBR used now is terrible. Just listen with headphones to material that has a lot of high frequencies. Actually, it is not only VBR mode that improves quality (it reduces encoder bugs), but also the -cutoff frequency filter. YouTube has always cut off AAC at 16000 Hz.

Steps to reproduce

-

Describe the expected behavior

-

Additional information

ElonVampire commented 1 year ago

Received!

artenax commented 1 year ago

It is also useful to add -af volume=-3dB to prevent clipping and artifacts.

Chocobozzz commented 1 year ago

Hello,

We should use fdk aac encoder if the ffmpeg binary supports it :thinking:

Regarding the variable bitrate for the default aac encoder, I see

Effective range for -q:a is around 0.1-2. This VBR is experimental and likely to get even worse results than the CBR.

in https://trac.ffmpeg.org/wiki/Encode/AAC page

artenax commented 1 year ago

Since ffmpeg 3.0+ (in 2016) the native aac encoder has been significantly revised and improved. The developers call the CBR mode stable. However, my testing shows that ffmpeg 3.0+ aac now has other bugs, which show up especially in CBR mode and much less in VBR. The -cutoff 18000 and -af volume=-2dB options are also needed to reduce these artifacts. I can provide you audio samples...

ROBERT-MCDOWELL commented 1 year ago

fdk-aac is much better than aac I switched to it years ago indeed.

Chocobozzz commented 1 year ago

Thanks @artenax, yes please send your audio sample. Can you also confirm that PeerTube uses libfdk if your ffmpeg binary supports it?
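Whether a given ffmpeg binary was built with libfdk_aac can be checked from its encoder list. This is a minimal sketch, assuming ffmpeg is on PATH; has_encoder is a hypothetical helper name. It only tells you what the binary supports, not which encoder PeerTube ends up selecting.

```shell
# Succeed (exit 0) if the local ffmpeg lists the given encoder name.
# has_encoder is a hypothetical helper name.
# In the output of "ffmpeg -encoders" the flags are the first column
# and the encoder name is the second, so compare against field 2.
has_encoder() {
    ffmpeg -hide_banner -encoders 2>/dev/null |
        awk -v enc="$1" '$2 == enc { found = 1 } END { exit !found }'
}
```

Usage: has_encoder libfdk_aac && echo "libfdk_aac is available"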

vid-bin commented 1 year ago

I just use opus. The latest version of peertube with git ffmpeg seems to support both firefox and chrome now. Previously opus would only work on one browser and not the other with peertube/hls.

artenax commented 1 year ago

Effective range for -q:a is around 0.1-2

However, ffmpeg accepts values higher than 2.0 outside the documentation and the bitrate is higher.

Can you also confirm that PeerTube uses libfdk if your ffmpeg binary supports it?

I don't know about the server side of the PeerTube I use, but it seems that fdk is not used there. I'm just a user/uploader.

fdk-aac is much better than aac I switched to it years ago indeed

Me too. Although, fdk works in 16 bit and we need to be careful with the levels.

Here are examples: (cbr256.m4a) ffmpeg -c:a aac -b:a 256k and (vbr2.3.m4a, 210 kbps) ffmpeg -c:a aac -q:a 2.3. I don't know if you can hear the difference from the original, or between CBR and VBR, in your headphones. Pay attention to the high frequencies.

Try also adding the options -cutoff 18000 -af volume=-2dB (I didn't add them here).
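For anyone who wants to repeat the comparison on their own material, here is a minimal sketch of the two encodes, assuming ffmpeg is on PATH. The function names and file names are placeholders. Note that the VBR function already includes the -cutoff and -af volume options, which the attached samples did not use.

```shell
# CBR settings discussed in this thread: native aac encoder at 256k.
encode_cbr() {
    ffmpeg -hide_banner -y -i "$1" -vn -c:a aac -b:a 256k "$2"
}

# Proposed VBR settings: native aac encoder with -q:a, plus the low-pass
# filter and the attenuation suggested above. -vn drops any video stream.
encode_vbr() {
    ffmpeg -hide_banner -y -i "$1" -vn -c:a aac -q:a 2.3 \
        -cutoff 18000 -af volume=-2dB "$2"
}
```

Usage: encode_cbr original.flac cbr256.m4a, then encode_vbr original.flac vbr2.3.m4a, and compare both outputs against the source by ear.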

original.flac actually went through several steps before it reached me and it's not exactly lossless (high-bitrate aac at 317 kbps when exporting from video editing software > youtube opus 137k > flac). But this is a real case and a good stress test.

original.zip aac.zip

ROBERT-MCDOWELL commented 1 year ago

It's really rare that frequencies go up to 16 kHz in music; they are often under -10/20 dB and not really perceptible in digital audio. About clipping, the most effective setting I have found is -c:a libfdk_aac -b:a 256k -af 'loudnorm=I=-13:LRA=20:TP=-2'. With this setting, whether the content is voice, noise or music, the presence/dynamics are great and there is no clipping at all. It's really close to YT, even better sometimes.

artenax commented 1 year ago

Guys, it's funny, but I checked some other encoders and they gave good results: ac3 (sonic and ffmpeg) 192k, dts 320k, wma9 std VBR V90 190k and wma9 pro VBR V90 160k (microsoft, from winxp), vorbis vbr (ffmpeg) 128k. And even lame mp3 --cbr -b 128 -q 0 -m j --lowpass 16 had artifacts, but less. ffmpeg aac 128k is even worse. faac is not very good.

hsn10 commented 1 year ago

ffmpeg aac encoder is garbage. You need to have fdk version.

FiskFan1999 commented 1 year ago

For now, this could be fixed with a plugin which changes which encoders are used (such as the vp9 or opus plugins).

Kinuseka commented 4 months ago

Any temporary solutions so far? Audio quality is really bad.

@FiskFan1999 I believe we can use peertube-plugin-transcoding-profile-debug temporarily but I am not sure how to do it yet.

artenax commented 4 months ago

Can you use -c:a aac -q:a 2.4 -cutoff 18000 -af volume=-2dB ? It's not that bad. It's inaccurate, but not terrible. Try experimenting with advanced settings: ffmpeg -h encoder=aac

AAC encoder AVOptions:
  -aac_coder         <int>        E...A...... Coding algorithm (from 0 to 2) (default twoloop)
     anmr            0            E...A...... ANMR method
     twoloop         1            E...A...... Two loop searching method
     fast            2            E...A...... Default fast search
  -aac_ms            <boolean>    E...A...... Force M/S stereo coding (default auto)
  -aac_is            <boolean>    E...A...... Intensity stereo coding (default true)
  -aac_pns           <boolean>    E...A...... Perceptual noise substitution (default true)
  -aac_tns           <boolean>    E...A...... Temporal noise shaping (default true)
  -aac_ltp           <boolean>    E...A...... Long term prediction (default false)
  -aac_pred          <boolean>    E...A...... AAC-Main prediction (default false)
  -aac_pce           <boolean>    E...A...... Forces the use of PCEs (default false)

Unfortunately, I don't know if PeerTube uses ffmpeg CLI or is controlled through libraries and if you can change additional settings.

Chocobozzz commented 4 months ago

Any temporary solutions so far? Audio quality is really bad.

Can you give us example and also tell us if your ffmpeg version has libfdk_aac encoder?

Kinuseka commented 4 months ago

@Chocobozzz said:

Any temporary solutions so far? Audio quality is really bad.

Can you give us example and also tell us if your ffmpeg version has libfdk_aac encoder?

Hi yes, transcoded: https://vid.kinuseka.us/w/2Sr3d9csWFcVoSSm4S5pFb compared to the original soundcloud: https://soundcloud.com/idnull/midnight-murder-club

It is noticeable that the high frequencies on the music sound compressed and not fully defined.

There is in fact no libfdk_aac encoder yet, let me see if installing this makes a difference

Chocobozzz commented 4 months ago

@Kinuseka Can you tell me if this version is better? https://asso.framasoft.org/drop/r/ULju3jwfkF#NMvYQWsIrqPaKhK2RpdpLkFAUppWrJ+YocYKXqb9ohw=

And can you give your ffmpeg version?

Kinuseka commented 4 months ago

@Chocobozzz said: @Kinuseka Can you tell me if this version is better? https://asso.framasoft.org/drop/r/ULju3jwfkF#NMvYQWsIrqPaKhK2RpdpLkFAUppWrJ+YocYKXqb9ohw=

And can you give your ffmpeg version?

Oh yeah, that sounds really close to the original.

FFmpeg version: ffmpeg version 4.4.2-0ubuntu0.22.04.1

Chocobozzz commented 4 months ago

Oh yeah, that sounds really close to the original.

Ok thanks, because I just used the same upload process as you on my local PeerTube that uses ffmpeg 6.1.1. Can you upgrade your ffmpeg version to 6.1 and retry the upload process to see if it fixes the issue?

Kinuseka commented 4 months ago

@Chocobozzz said:

Oh yeah, that sounds really close to the original.

Ok thanks, because I just used the same upload process as you on my local PeerTube that uses ffmpeg 6.1.1. Can you upgrade your ffmpeg version to 6.1 and retry the upload process to see if it fixes the issue?

ffmpeg 6.1 fixes the audio quality issue. Had to use this repo since there is no official release on ffmpeg v6 for ubuntu 22.04.
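Since the fix here came from the ffmpeg version rather than from PeerTube itself, it can help to check which major version the server's binary reports. This is a minimal sketch, assuming ffmpeg is on PATH; ffmpeg_major is a hypothetical helper name. Builds whose version string does not start with a digit (for example git snapshots) produce no output.

```shell
# Print the major version number of the ffmpeg found on PATH.
# Prints nothing if the version string does not start with a number.
# ffmpeg_major is a hypothetical helper name.
ffmpeg_major() {
    ffmpeg -version 2>/dev/null |
        sed -n '1s/^ffmpeg version \([0-9][0-9]*\).*/\1/p'
}
```

Usage: major=$(ffmpeg_major); [ "${major:-0}" -ge 6 ] && echo "ffmpeg 6 or newer"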

artenax commented 4 months ago

ffmpeg 6.1 fixes the audio quality issue

Yeah, spectra got better in ffmpeg >= 6.0, but the sound still sucks (in CBR mode). VBR (-q:a) mode is cool. It's most noticeable on the human voice. (when people howl? something like that) VBR mode in ffmpeg 6.1.1 seems even better.
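To look at the spectra rather than rely only on listening, ffmpeg can render a spectrogram image of each file. This is a minimal sketch, assuming ffmpeg is on PATH and built with the showspectrumpic filter; the function name, image size and file names are placeholders. A spectrogram shows where the high-frequency cutoff sits but says little about audible artifacts, so it complements a listening test rather than replacing it.

```shell
# Render a spectrogram PNG of an audio file for visual comparison.
# spectrum_png is a hypothetical helper name; 1024x512 is an arbitrary size.
spectrum_png() {
    ffmpeg -hide_banner -y -i "$1" -lavfi showspectrumpic=s=1024x512 "$2"
}
```

Usage: spectrum_png cbr256.m4a cbr256.png and spectrum_png vbr2.3.m4a vbr2.3.png, then view the two images side by side.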

artenax commented 4 months ago

If you need ffmpeg 6 with fdk-aac use this static build (put it in /usr/local/bin). It is legal because it contains a Fedora patch that removes patented components (HE profiles). Or install Fedora on the server... Although, no, Fedora doesn't have libx264, unlike my build.

However, there is no zlib there. If you need zlib use this:

ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 5.5.0 (Linaro GCC 5.5-2017.10) 20171010 (ROSA)
configuration: --prefix=/opt/ffmpeg --enable-pic --enable-gpl --enable-version3 --enable-static --disable-shared --disable-debug --as=nasm --enable-small --disable-doc --enable-gray --enable-libfdk-aac --enable-libx264 --disable-cuda-nvcc --disable-cuda-llvm --disable-lzma --disable-vaapi --disable-vdpau --disable-xlib --disable-libxcb --disable-vulkan --pkg-config-flags=--static --enable-libmp3lame --enable-libvorbis --enable-libopus --enable-libvpx --enable-libtls

Unfortunately, without SSE optimizations. And there's no AV1 decoder. Anyway, you can compile ffmpeg yourself, it's easy.

Kinuseka commented 3 months ago

@artenax said: Can you use -c:a aac -q:a 2.4 -cutoff 18000 -af volume=-2dB ? It's not that bad. It's inaccurate, but not terrible. Try experimenting with advanced settings: ffmpeg -h encoder=aac

AAC encoder AVOptions:
  -aac_coder         <int>        E...A...... Coding algorithm (from 0 to 2) (default twoloop)
     anmr            0            E...A...... ANMR method
     twoloop         1            E...A...... Two loop searching method
     fast            2            E...A...... Default fast search
  -aac_ms            <boolean>    E...A...... Force M/S stereo coding (default auto)
  -aac_is            <boolean>    E...A...... Intensity stereo coding (default true)
  -aac_pns           <boolean>    E...A...... Perceptual noise substitution (default true)
  -aac_tns           <boolean>    E...A...... Temporal noise shaping (default true)
  -aac_ltp           <boolean>    E...A...... Long term prediction (default false)
  -aac_pred          <boolean>    E...A...... AAC-Main prediction (default false)
  -aac_pce           <boolean>    E...A...... Forces the use of PCEs (default false)

Unfortunately, I don't know if PeerTube uses ffmpeg CLI or is controlled through libraries and if you can change additional settings.

Trying this setting using peertube-plugin-transcoding-profile-debug

my configuration is: Transcoding profile:

{
    "vod": [
        {
            "encoderName": "aac",
            "profileName": "AudioBetter",
            "outputOptions": [
                "-c:a aac",
                "-q:a 2.3",
                "-cutoff 18000"
            ]
        }
    ],
    "live": []
}

Encoders Priorities:

{
  "vod": [
    {
      "encoderName": "aac",
      "streamType": "audio",
      "priority": 1000
    }
  ],

  "live": [ ]
}


I can confirm the profile is being applied, since the settings show up in the ffmpeg command line in htop during transcoding.

Kinuseka commented 3 months ago

With ffmpeg 6.1 the aac audio has improved significantly compared to ffmpeg 4.4.

I am not entirely sure VBR made any difference, since I can't hear one. It could just be placebo, but I guess some parts of the frequency range are less crunchy and more detailed now (mid-highs, maybe).