svt / encore

Transcode media files in an epic manner
European Union Public License 1.2

Question: How to deal with different track allocation in input with 8 audio streams? #29

Open grusell opened 5 months ago

grusell commented 5 months ago

I want to transcode stereo audio from existing stereo mix in inputs. I have two different types of inputs:

a) Audio is in 8 mono streams, with StereoLeft allocated to stream 0 and StereoRight allocated to stream 1
b) Audio is in 8 mono streams, with StereoLeft allocated to stream 6 and StereoRight allocated to stream 7

I cannot seem to find a non-hacky way to handle this. The best way I have found is to define a pan-mapping from 7.1(wide) to stereo that selects streams 6 and 7 for stereo. If I then set channelLayout on the job input to 7.1(wide), I get what I want in case b. For case a, I set a default-pan for stereo that just uses streams 0 and 1. This solution feels a bit hacky, though, since the channel layout in case b is not really 7.1(wide). Is there currently a better way to handle this?
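
For illustration, selecting a stereo pair out of eight mono streams corresponds roughly to an FFmpeg filter chain that merges the streams with amerge and then picks two channels with pan. The following is a minimal Python sketch that builds such a filter string. The helper name and the [aout] output label are hypothetical, and this is not necessarily how Encore constructs its filters internally.

```python
def stereo_from_mono_streams(total_streams: int, left: int, right: int) -> str:
    """Build an FFmpeg filter_complex string (illustrative helper, not part of Encore).

    Merges `total_streams` mono audio streams from input 0 into one
    multichannel stream, then maps channel `left` to the stereo left
    output and channel `right` to the stereo right output.
    """
    for index in (left, right):
        if not 0 <= index < total_streams:
            raise ValueError(f"stream index {index} out of range")
    # One input pad per mono stream: [0:a:0][0:a:1]...
    pads = "".join(f"[0:a:{i}]" for i in range(total_streams))
    return (
        f"{pads}amerge=inputs={total_streams},"
        f"pan=stereo|c0=c{left}|c1=c{right}[aout]"
    )


# Case a: stereo mix on streams 0 and 1
print(stereo_from_mono_streams(8, 0, 1))
# Case b: stereo mix on streams 6 and 7
print(stereo_from_mono_streams(8, 6, 7))
```

The two cases differ only in the channel indices passed to pan, which is why a per-job setting that picks the indices is enough to cover both input types.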

fhermansson commented 3 months ago

No, there is currently no better way to handle this, but we plan to improve the handling of audio layouts. Maybe adding the possibility to define custom channel layouts would help? For example DL+DR|7.1(wide), perhaps combined with labels to separate the audio, so that the downmix can be handled by one audio encode and the 7.1 by another?

grusell commented 3 months ago

Yes, I think the possibility to define custom channel layouts would be a good solution in this case.

grusell commented 1 month ago

Revisiting this, I found what I believe is a cleaner solution: parameterizing the audioMixPreset value in the profile. That way I can get the downmix I want by specifying the correct audioMixPreset in profileParams when I create a job. Example config below.

Example profile

name: audio-test
description: Testing audio
scaling: bicubic
encodes:
  - type: X264Encode
    suffix: _x264_720
    twoPass: false
    height: 720
    params:
      b:v: 5400k
      maxrate: 8100k
      bufsize: 8100k
      r: 25
      fps_mode: cfr
      pix_fmt: yuv420p
      force_key_frames: expr:not(mod(n,96))
      profile:v: high
      level: 5.1
      preset: ultrafast
    x264-params:
      keyint: 192
      keyint_min: 96
    audioEncode:
      optional: true
      type: AudioEncode
      codec: aac
      bitrate: 128k
      suffix: STEREO
      audioMixPreset: #{profileParams['audioMixPreset']?:"default"}

application.yml

encore-settings:
  encoding:
    audio-mix-presets:
      default:
        default-pan:
          stereo: FL=FL|FR=FR
          "[5.1]": c0=c2|c1=c3|c2=c4|c3=c5|c4=c6|c5=c7
        pan-mapping:
          mono:
            stereo: FL=0.707*FC|FR=0.707*FC
      stereo-on-78:
        default-pan:
          stereo: FL=c6|FR=c7
          "[5.1]": c0=c0|c1=c1|c2=c2|c3=c3|c4=c4|c5=c5
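
With the profile and presets above, the preset would be chosen per job through profileParams. A minimal Python sketch of what the job payloads might look like follows. The field names reflect my understanding of the Encore job format and should be checked against the API, and the paths, base name, and helper name are hypothetical.

```python
import json


def build_job(input_uri: str, stereo_on_78: bool) -> dict:
    """Build an illustrative job payload (hypothetical helper)."""
    # When the parameter is omitted, the expression in the profile
    # (profileParams['audioMixPreset']?:"default") falls back to "default".
    profile_params = {"audioMixPreset": "stereo-on-78"} if stereo_on_78 else {}
    return {
        "profile": "audio-test",
        "outputFolder": "/tmp/output",  # hypothetical path
        "baseName": "audio-test-job",   # hypothetical name
        "profileParams": profile_params,
        "inputs": [{"type": "AudioVideo", "uri": input_uri}],
    }


# Case a: stereo on streams 0 and 1, relies on the "default" preset
print(json.dumps(build_job("/tmp/input_a.mxf", stereo_on_78=False), indent=2))
# Case b: stereo on streams 6 and 7, selects the "stereo-on-78" preset
print(json.dumps(build_job("/tmp/input_b.mxf", stereo_on_78=True), indent=2))
```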

fhermansson commented 1 week ago

That's a clever solution! But I still believe we'll have to improve the handling of channel layouts. We should support layouts that aren't defined as standard in FFmpeg, at least as input.