ressu / kube-plex

Scalable Plex Media Server on Kubernetes -- dispatch transcode jobs as pods on your cluster!
Apache License 2.0

HW Transcoding doesn't work #42

Closed kulisau closed 5 months ago

kulisau commented 1 year ago

I'm running k3s on three Intel J4105-based PCs.

If I define resources under kubePlex, the deployment does not seem to honor these settings and Plex does not transcode using the iGPU. If I define the same resources in the bottom resources section of the values.yml file, the transcoding pods just crash. The sad part is that those pods do not generate any output, so it's really difficult to understand the reason. The only way to get HW transcoding working is to disable kubePlex and leave the resources configured, but that just doesn't make sense.

Am I doing something wrong, or is there another approach to enable Intel iGPU transcoding with kubePlex? Thanks!

ressu commented 1 year ago

Could you include the relevant parts of your configuration here? That makes it easier to understand what could be wrong.

Your point about limited logs is valid though. A lot of the logs are collected in the Kubernetes event log. It might be possible to pull some of this information into the Plex log so that it's easier to see there too. I'll mark it down as a feature request so that the brainstorming on how that could work has a separate discussion point.

kulisau commented 1 year ago

Sure!

Btw, with the munnerz/kube-plex version, the transcoder pods do not vanish after a failure, which makes it possible to read their logs.

Kubernetes version: v1.23.16+k3s1

values.yml:

# Default values for kube-plex.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# Default image configuration for ease of deployment
image:
#  repository: lscr.io/linuxserver/plex
  repository: plexinc/pms-docker
  tag: latest
  pullPolicy: Always

# More recommended image configuration is to use a specific tag to make upgrades
# predictable. For example:
# image:
#   repository: plexinc/pms-docker
#   tag: 1.24.5.5173-8dcc73a59
#   pullPolicy: IfNotPresent

kubePlex:
  enabled: true
  loglevel: ""
  image:
    repository: ghcr.io/ressu/kube-plex
    tag: latest
    pullPolicy: Always
  resources:
    limits:
      gpu.intel.com/i915: "1"

  # Mounts which should be carried over to kube-plex transcoder. By default only
  # /data and /transcode are cloned to the transcoding pod. A comma separated list.
  #
  ## mounts: /data,/transcode

# Override this with the plex claim token from plex.tv/claim
claimToken: ""

# Set the timezone of the plex server
timezone: Europe/Vilnius

# Set extra environment variables
extraEnv: {}
  # puid: 1002
  # pgid: 1002
  # NVIDIA_VISIBLE_DEVICES: all
  # NVIDIA_DRIVER_CAPABILITIES: video,compute,utility

service:
  type: ClusterIP
  port: 32400
  ## Specify the nodePort value for the LoadBalancer and NodePort service types.
  ## ref: https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport
  ##
  # nodePort:
  ## Provide any additional annotations which may be required. This can be used to
  ## set the LoadBalancer service type to internal only.
  ## ref: https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer
  ##
  annotations: {}
  labels: {}
  ## Use loadBalancerIP to request a specific static IP,
  ## otherwise leave blank
  ##
  loadBalancerIP:
  # loadBalancerSourceRanges: []
  ## Set the externalTrafficPolicy in the Service to either Cluster or Local
  # externalTrafficPolicy: Cluster

# Probe configuration
# ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes
probes:
  # Liveness probe configuration
  liveness:
    # Enable the liveness probe
    enabled: true
    # Set custom to `true` if you wish to specify your own liveness probe
    custom: false
    # The spec field contains the values for the default livenessProbe.
    # If you selected `custom: true`, the spec field holds the definition of the livenessProbe.
    spec:
      initialDelaySeconds: 0
      periodSeconds: 10
      timeoutSeconds: 1
      failureThreshold: 3

  # Readiness probe configuration
  readiness:
    # Enable the readiness probe
    enabled: true
    # Set custom to `true` if you wish to specify your own readiness probe
    custom: false
    # The spec field contains the values for the default readinessProbe.
    # If you selected `custom: true`, this field holds the definition of the readinessProbe.
    spec:
      initialDelaySeconds: 0
      periodSeconds: 10
      timeoutSeconds: 1
      failureThreshold: 3

  # Startup probe configuration
  startup:
    # Enable the startup probe
    enabled: true
    # Set custom to `true` if you wish to specify your own startup probe
    custom: false
    # The spec field contains the values for the default startupProbe.
    # If you selected `custom: true`, this field holds the definition of the startupProbe.
    spec:
      initialDelaySeconds: 0
      timeoutSeconds: 1
      periodSeconds: 5
      failureThreshold: 30

ingress:
  enabled: false
  # Used to create an Ingress record.
  hosts:
    - chart-example.local
  annotations:
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  tls:
    # Secrets must be manually created in the namespace.
    # - secretName: chart-example-tls
    #   hosts:
    #     - chart-example.local

rbac:
  create: true
  # Specify create: false and serviceAccountName to manually manage the service
  # account for this deployment
  ## serviceAccountName: ""

nodeSelector:
  kubernetes.io/arch: amd64

persistence:
  transcode:
    # Optionally specify claimName to manually override the PVC to be used for
    # the transcode directory. If claimName is specified, storageClass and size
    # are ignored.
    ## claimName: "plex-transcode-pvc"
    # Optionally specify a storage class to be used for the transcode directory.
    # If not specified and claimName is not specified, the default storage
    # class will be used.
    storageClass: ""
    # subPath: some-subpath
    # The requested size of the volume to be used when creating a
    # PersistentVolumeClaim.
    size: 20Gi
    # Access mode for this volume
    accessMode: ReadWriteMany
  data:
    # Optionally specify claimName to manually override the PVC to be used for
    # the data directory. If claimName is specified, storageClass and size are
    # ignored.
    ## claimName: "plex-data-pvc"
    # Optionally specify a storage class to be used for the data directory.
    # If not specified and claimName is not specified, the default storage
    # class will be used.
    storageClass: ""
    # subPath: some-subpath
    # The requested size of the volume to be used when creating a
    # PersistentVolumeClaim.
    size: 40Gi
    # Access mode for this volume
    accessMode: ReadWriteMany
  extraData: []
    # Optionally specify additional Data mounts.  These will be mounted as
    # /data-${name}.  This should be in the same format as the above 'data',
    # with the additional field 'name'
    # - claimName: "special-tv"
    #   name: 'foo'

  config:
    # Optionally specify claimName to manually override the PVC to be used for
    # the config directory. If claimName is specified, storageClass and size
    # are ignored.
    ## claimName: "plex-config-pvc"
    # Optionally specify a storage class to be used for the config directory.
    # If not specified and claimName is not specified, the default storage
    # class will be used.
    # subPath: some-subpath
    storageClass: ""
    # The requested size of the volume to be used when creating a
    # PersistentVolumeClaim.
    size: 20Gi
    # Access mode for this volume
    accessMode: ReadWriteMany

resources: {}
#  limits:
#    gpu.intel.com/i915: 1
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  # limits:
  #  cpu: 100m
  #  memory: 128Mi
  # requests:
  #  cpu: 100m
  #  memory: 128Mi

podAnnotations: {}

deploymentAnnotations: {}

proxy:
  # This allows to set a proxy environment variable, which PMS uses to fetch the token and assets like movie cover
  enable: false
  # http: "http://proxy:8080"
  # https: "https://proxy:8080"
  # noproxy: "localhost,127.0.0.1,10.96.0.0/12,10.244.0.0/12"

# allows setting which taints kubeplex tolerates
tolerations: []

# allows specifying node affinity
affinity: 

ressu commented 1 year ago

The config looks the way it's supposed to. You mentioned that the job crashes. Does that mean that it never starts, or that it exits with an error? You should be able to see which is the case by following the Kubernetes events in the given namespace: kubectl -n plex get events -w (assuming your namespace is plex) and then trying to transcode something.

I see 3 possible scenarios here:

  1. The pod fails to schedule for some reason
  2. The pod schedules but crashes immediately (no kube-plex output in plex logs)
  3. The pod schedules but the transcode itself crashes

If we can figure out which case we are hitting here then we should be able to resolve the problem too.

kulisau commented 1 year ago

I copied 3 movies to the NFS PV. One of them, the biggest one, didn't even trigger the transcoding pod, but it was playing, albeit without HW acceleration. I'll paste the ffprobe output of that file at the end of the post.

So I have two scenarios:

  1. resources configured (iGPU added) as in the values.yaml above, the proper and suggested approach
  2. resources (not sure what to call them, but probably deployment resources) configured at the bottom of the values.yaml file above.

With the setup in the 1st scenario, all files play back just fine. HW acceleration does not work, except for the big file, but that file does not even trigger the transcoder pod.

In the 2nd scenario, playback is only possible with direct play, except for the big file, which is transcoded WITH HW acceleration. The two smaller files (movies) trigger a transcoding pod; it tries to start, crashes, and eventually disappears from the list of pods in the namespace.

Output of the kubectl -n plex get events -w command:


0s          Normal    ScalingReplicaSet   deployment/plex-kube-plex                Scaled up replica set plex-kube-plex-5d44d7fd48 to 1
0s          Normal    Scheduled           pod/plex-kube-plex-5d44d7fd48-nqk9t      Successfully assigned plex/plex-kube-plex-5d44d7fd48-nqk9t to kube-02
0s          Normal    SuccessfulCreate    replicaset/plex-kube-plex-5d44d7fd48     Created pod: plex-kube-plex-5d44d7fd48-nqk9t
0s          Normal    Pulling             pod/plex-kube-plex-5d44d7fd48-nqk9t      Pulling image "ghcr.io/ressu/kube-plex:latest"
0s          Normal    Pulled              pod/plex-kube-plex-5d44d7fd48-nqk9t      Successfully pulled image "ghcr.io/ressu/kube-plex:latest" in 570.663241ms (570.695995ms including waiting)
0s          Normal    Created             pod/plex-kube-plex-5d44d7fd48-nqk9t      Created container kube-plex-init
0s          Normal    Started             pod/plex-kube-plex-5d44d7fd48-nqk9t      Started container kube-plex-init
0s          Normal    Pulling             pod/plex-kube-plex-5d44d7fd48-nqk9t      Pulling image "plexinc/pms-docker:latest"
0s          Normal    Pulled              pod/plex-kube-plex-5d44d7fd48-nqk9t      Successfully pulled image "plexinc/pms-docker:latest" in 869.069343ms (869.093984ms including waiting)
0s          Normal    Created             pod/plex-kube-plex-5d44d7fd48-nqk9t      Created container plex
0s          Normal    Started             pod/plex-kube-plex-5d44d7fd48-nqk9t      Started container plex
0s          Normal    SuccessfulCreate    job/pms-elastic-transcoder-7ddtv         Created pod: pms-elastic-transcoder-7ddtv-9fjb6
0s          Normal    Scheduled           pod/pms-elastic-transcoder-7ddtv-9fjb6   Successfully assigned plex/pms-elastic-transcoder-7ddtv-9fjb6 to kube-02
0s          Normal    Pulled              pod/pms-elastic-transcoder-7ddtv-9fjb6   Container image "ghcr.io/ressu/kube-plex@sha256:0566d16c9e0859c2c14cc8d825079c92fb099593296f0fcb447ad410840f5ea4" already present on machine
0s          Normal    Created             pod/pms-elastic-transcoder-7ddtv-9fjb6   Created container kube-plex-init
0s          Normal    Started             pod/pms-elastic-transcoder-7ddtv-9fjb6   Started container kube-plex-init
0s          Normal    Pulled              pod/pms-elastic-transcoder-7ddtv-9fjb6   Container image "docker.io/plexinc/pms-docker@sha256:77d87bfcf890647f358da12b81417d364cf65a24577e1aa280ea6c27c67dc90c" already present on machine
0s          Normal    Created             pod/pms-elastic-transcoder-7ddtv-9fjb6   Created container plex
0s          Normal    Started             pod/pms-elastic-transcoder-7ddtv-9fjb6   Started container plex
0s          Normal    SuccessfulCreate    job/pms-elastic-transcoder-fcrwt         Created pod: pms-elastic-transcoder-fcrwt-6qpbq
0s          Normal    Scheduled           pod/pms-elastic-transcoder-fcrwt-6qpbq   Successfully assigned plex/pms-elastic-transcoder-fcrwt-6qpbq to kube-03
0s          Normal    Killing             pod/pms-elastic-transcoder-7ddtv-9fjb6   Stopping container plex
0s          Normal    Pulled              pod/pms-elastic-transcoder-fcrwt-6qpbq   Container image "ghcr.io/ressu/kube-plex@sha256:0566d16c9e0859c2c14cc8d825079c92fb099593296f0fcb447ad410840f5ea4" already present on machine
0s          Normal    Created             pod/pms-elastic-transcoder-fcrwt-6qpbq   Created container kube-plex-init
0s          Normal    Started             pod/pms-elastic-transcoder-fcrwt-6qpbq   Started container kube-plex-init
0s          Normal    Pulled              pod/pms-elastic-transcoder-fcrwt-6qpbq   Container image "docker.io/plexinc/pms-docker@sha256:77d87bfcf890647f358da12b81417d364cf65a24577e1aa280ea6c27c67dc90c" already present on machine
0s          Normal    Created             pod/pms-elastic-transcoder-fcrwt-6qpbq   Created container plex
0s          Normal    Started             pod/pms-elastic-transcoder-fcrwt-6qpbq   Started container plex

ffprobe output of the big file:

ffprobe version 4.1.11-0+deb10u1 Copyright (c) 2007-2023 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --prefix=/usr --extra-version=0+deb10u1 --toolchain=hardened --libdir=/usr/lib/aarch64-linux-gnu --incdir=/usr/include/aarch64-linux-gnu --arch=arm64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Input #0, matroska,webm, from 'Big file Dolby Atmos.mkv':
  Metadata:
    encoder         : libebml v1.3.3 + libmatroska v1.4.4
    creation_time   : 2016-05-24T18:40:37.000000Z
  Duration: 01:54:27.17, start: 0.000000, bitrate: 33190 kb/s
    Stream #0:0: Video: hevc (Main), yuv420p(tv, bt709), 3840x1598, SAR 1:1 DAR 1920:799, 23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)
    Metadata:
      BPS             : 24813429
      BPS-eng         : 24813429
      DURATION        : 01:54:25.317000000
      DURATION-eng    : 01:54:25.317000000
      NUMBER_OF_FRAMES: 164603
      NUMBER_OF_FRAMES-eng: 164603
      NUMBER_OF_BYTES : 21294007492
      NUMBER_OF_BYTES-eng: 21294007492
      _STATISTICS_WRITING_APP: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2016-05-24 18:40:37
      _STATISTICS_WRITING_DATE_UTC-eng: 2016-05-24 18:40:37
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:1(eng): Audio: truehd, 48000 Hz, 7.1, s32 (24 bit) (default)
    Metadata:
      BPS             : 5240223
      BPS-eng         : 5240223
      DURATION        : 01:54:27.138000000
      DURATION-eng    : 01:54:27.138000000
      NUMBER_OF_FRAMES: 8240565
      NUMBER_OF_FRAMES-eng: 8240565
      NUMBER_OF_BYTES : 4498166862
      NUMBER_OF_BYTES-eng: 4498166862
      _STATISTICS_WRITING_APP: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2016-05-24 18:40:37
      _STATISTICS_WRITING_DATE_UTC-eng: 2016-05-24 18:40:37
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:2(eng): Audio: ac3, 48000 Hz, 5.1(side), fltp, 384 kb/s
    Metadata:
      BPS             : 384000
      BPS-eng         : 384000
      DURATION        : 01:54:27.136000000
      DURATION-eng    : 01:54:27.136000000
      NUMBER_OF_FRAMES: 214598
      NUMBER_OF_FRAMES-eng: 214598
      NUMBER_OF_BYTES : 329622528
      NUMBER_OF_BYTES-eng: 329622528
      _STATISTICS_WRITING_APP: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2016-05-24 18:40:37
      _STATISTICS_WRITING_DATE_UTC-eng: 2016-05-24 18:40:37
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:3(cze): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      BPS             : 448000
      BPS-eng         : 448000
      DURATION        : 01:54:27.136000000
      DURATION-eng    : 01:54:27.136000000
      NUMBER_OF_FRAMES: 214598
      NUMBER_OF_FRAMES-eng: 214598
      NUMBER_OF_BYTES : 384559616
      NUMBER_OF_BYTES-eng: 384559616
      _STATISTICS_WRITING_APP: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2016-05-24 18:40:37
      _STATISTICS_WRITING_DATE_UTC-eng: 2016-05-24 18:40:37
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:4(hin): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      BPS             : 448000
      BPS-eng         : 448000
      DURATION        : 01:54:27.136000000
      DURATION-eng    : 01:54:27.136000000
      NUMBER_OF_FRAMES: 214598
      NUMBER_OF_FRAMES-eng: 214598
      NUMBER_OF_BYTES : 384559616
      NUMBER_OF_BYTES-eng: 384559616
      _STATISTICS_WRITING_APP: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2016-05-24 18:40:37
      _STATISTICS_WRITING_DATE_UTC-eng: 2016-05-24 18:40:37
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:5(hun): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      BPS             : 448000
      BPS-eng         : 448000
      DURATION        : 01:54:27.136000000
      DURATION-eng    : 01:54:27.136000000
      NUMBER_OF_FRAMES: 214598
      NUMBER_OF_FRAMES-eng: 214598
      NUMBER_OF_BYTES : 384559616
      NUMBER_OF_BYTES-eng: 384559616
      _STATISTICS_WRITING_APP: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2016-05-24 18:40:37
      _STATISTICS_WRITING_DATE_UTC-eng: 2016-05-24 18:40:37
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:6(pol): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      BPS             : 448000
      BPS-eng         : 448000
      DURATION        : 01:54:27.136000000
      DURATION-eng    : 01:54:27.136000000
      NUMBER_OF_FRAMES: 214598
      NUMBER_OF_FRAMES-eng: 214598
      NUMBER_OF_BYTES : 384559616
      NUMBER_OF_BYTES-eng: 384559616
      _STATISTICS_WRITING_APP: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2016-05-24 18:40:37
      _STATISTICS_WRITING_DATE_UTC-eng: 2016-05-24 18:40:37
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:7(rus): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      BPS             : 448000
      BPS-eng         : 448000
      DURATION        : 01:54:27.136000000
      DURATION-eng    : 01:54:27.136000000
      NUMBER_OF_FRAMES: 214598
      NUMBER_OF_FRAMES-eng: 214598
      NUMBER_OF_BYTES : 384559616
      NUMBER_OF_BYTES-eng: 384559616
      _STATISTICS_WRITING_APP: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2016-05-24 18:40:37
      _STATISTICS_WRITING_DATE_UTC-eng: 2016-05-24 18:40:37
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:8(tur): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s
    Metadata:
      BPS             : 448000
      BPS-eng         : 448000
      DURATION        : 01:54:27.136000000
      DURATION-eng    : 01:54:27.136000000
      NUMBER_OF_FRAMES: 214598
      NUMBER_OF_FRAMES-eng: 214598
      NUMBER_OF_BYTES : 384559616
      NUMBER_OF_BYTES-eng: 384559616
      _STATISTICS_WRITING_APP: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.1.0 ('Little Earthquakes') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2016-05-24 18:40:37
      _STATISTICS_WRITING_DATE_UTC-eng: 2016-05-24 18:40:37
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES

kulisau commented 1 year ago

In case it's helpful, here is the ffprobe info for the smaller file (movie) I was testing:


ffprobe version 4.1.11-0+deb10u1 Copyright (c) 2007-2023 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --prefix=/usr --extra-version=0+deb10u1 --toolchain=hardened --libdir=/usr/lib/aarch64-linux-gnu --incdir=/usr/include/aarch64-linux-gnu --arch=arm64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Input #0, matroska,webm, from 'Smaller file Bluray-1080p.mkv':
  Metadata:
    encoder         : libebml v1.3.4 + libmatroska v1.4.5
    creation_time   : 2018-01-29T15:00:59.000000Z
  Duration: 01:41:04.08, start: 0.000000, bitrate: 15928 kb/s
    Chapter #0:0: start 0.000000, end 211.002000
    Metadata:
      title           : 00:00:00.000
    Chapter #0:1: start 211.002000, end 448.740000
    Metadata:
      title           : 00:03:31.002
    Chapter #0:2: start 448.740000, end 917.458000
    Metadata:
      title           : 00:07:28.740
    Chapter #0:3: start 917.458000, end 1097.221000
    Metadata:
      title           : 00:15:17.458
    Chapter #0:4: start 1097.221000, end 1520.269000
    Metadata:
      title           : 00:18:17.221
    Chapter #0:5: start 1520.269000, end 2072.696000
    Metadata:
      title           : 00:25:20.269
    Chapter #0:6: start 2072.696000, end 2460.583000
    Metadata:
      title           : 00:34:32.696
    Chapter #0:7: start 2460.583000, end 2657.905000
    Metadata:
      title           : 00:41:00.583
    Chapter #0:8: start 2657.905000, end 3427.549000
    Metadata:
      title           : 00:44:17.905
    Chapter #0:9: start 3427.549000, end 3968.089000
    Metadata:
      title           : 00:57:07.549
    Chapter #0:10: start 3968.089000, end 4525.688000
    Metadata:
      title           : 01:06:08.089
    Chapter #0:11: start 4525.688000, end 4799.211000
    Metadata:
      title           : 01:15:25.688
    Chapter #0:12: start 4799.211000, end 5082.953000
    Metadata:
      title           : 01:19:59.211
    Chapter #0:13: start 5082.953000, end 5569.397000
    Metadata:
      title           : 01:24:42.953
    Chapter #0:14: start 5569.397000, end 5688.933000
    Metadata:
      title           : 01:32:49.397
    Chapter #0:15: start 5688.933000, end 6064.080000
    Metadata:
      title           : 01:34:48.933
    Stream #0:0: Video: h264 (High), yuv420p(progressive), 1920x1040 [SAR 1:1 DAR 24:13], 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
    Metadata:
      BPS             : 13968084
      BPS-eng         : 13968084
      DURATION        : 01:41:04.058000000
      DURATION-eng    : 01:41:04.058000000
      NUMBER_OF_FRAMES: 145392
      NUMBER_OF_FRAMES-eng: 145392
      NUMBER_OF_BYTES : 10587909093
      NUMBER_OF_BYTES-eng: 10587909093
      _STATISTICS_WRITING_APP: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2018-01-29 15:00:59
      _STATISTICS_WRITING_DATE_UTC-eng: 2018-01-29 15:00:59
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:1(lit): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s (default)
    Metadata:
      BPS             : 448000
      BPS-eng         : 448000
      DURATION        : 01:41:03.680000000
      DURATION-eng    : 01:41:03.680000000
      NUMBER_OF_FRAMES: 189490
      NUMBER_OF_FRAMES-eng: 189490
      NUMBER_OF_BYTES : 339566080
      NUMBER_OF_BYTES-eng: 339566080
      _STATISTICS_WRITING_APP: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2018-01-29 15:00:59
      _STATISTICS_WRITING_DATE_UTC-eng: 2018-01-29 15:00:59
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:2(eng): Audio: dts (DTS), 48000 Hz, 5.1(side), fltp, 1536 kb/s
    Metadata:
      BPS             : 1509000
      BPS-eng         : 1509000
      DURATION        : 01:41:04.064000000
      DURATION-eng    : 01:41:04.064000000
      NUMBER_OF_FRAMES: 568506
      NUMBER_OF_FRAMES-eng: 568506
      NUMBER_OF_BYTES : 1143834072
      NUMBER_OF_BYTES-eng: 1143834072
      _STATISTICS_WRITING_APP: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2018-01-29 15:00:59
      _STATISTICS_WRITING_DATE_UTC-eng: 2018-01-29 15:00:59
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:3(eng): Subtitle: subrip
    Metadata:
      BPS             : 66
      BPS-eng         : 66
      DURATION        : 01:37:00.939000000
      DURATION-eng    : 01:37:00.939000000
      NUMBER_OF_FRAMES: 1378
      NUMBER_OF_FRAMES-eng: 1378
      NUMBER_OF_BYTES : 48202
      NUMBER_OF_BYTES-eng: 48202
      _STATISTICS_WRITING_APP: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2018-01-29 15:00:59
      _STATISTICS_WRITING_DATE_UTC-eng: 2018-01-29 15:00:59
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
    Stream #0:4(eng): Subtitle: subrip
    Metadata:
      title           : SDH
      BPS             : 70
      BPS-eng         : 70
      DURATION        : 01:38:42.665000000
      DURATION-eng    : 01:38:42.665000000
      NUMBER_OF_FRAMES: 1520
      NUMBER_OF_FRAMES-eng: 1520
      NUMBER_OF_BYTES : 52035
      NUMBER_OF_BYTES-eng: 52035
      _STATISTICS_WRITING_APP: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_APP-eng: mkvmerge v9.9.0 ('Pick Up') 64bit
      _STATISTICS_WRITING_DATE_UTC: 2018-01-29 15:00:59
      _STATISTICS_WRITING_DATE_UTC-eng: 2018-01-29 15:00:59
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES

ressu commented 1 year ago

Oh I see, I didn't notice the GPU declaration near the image at the top of values.yaml. You should remove that and keep the lower one.

The reason is a bit counterintuitive. When Kubernetes schedules a pod that requests a GPU, it reserves the GPU for that pod, which means that further pods that want acceleration won't be scheduled because no GPU is available. What makes this counterintuitive is how the pods are scheduled. The upper definition (where you have your GPU declaration at the moment) is for the Plex pod itself. But since the transcoder runs in a separate pod, you don't want Plex itself to have GPU acceleration.
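To make the reservation concrete, here is a minimal sketch of a pod that asks for one Intel GPU. The pod name is made up for illustration; the resource name and image are the ones from the values.yml above. Extended resources like this are requested in whole units and are not overcommitted, so if a node advertises only one gpu.intel.com/i915, a second pod with the same limit cannot be scheduled onto that node while this one is running.

```yaml
# Illustrative only: "gpu-demo" is a hypothetical pod name.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-demo
  namespace: plex
spec:
  containers:
    - name: plex
      image: plexinc/pms-docker:latest
      resources:
        limits:
          # One whole unit of the device advertised by the Intel GPU plugin.
          # The request defaults to the same value as the limit.
          gpu.intel.com/i915: 1
```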

The lower definition is rewritten into a kube-plex annotation, which kube-plex then copies to the transcoding pod that is created when transcoding happens. There is no way of telling Kubernetes to share a GPU that I know of, which means that clusters with a single GPU will end up with unschedulable transcoding jobs when Plex is running.
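In terms of the values.yml posted above, the suggestion would look roughly like the fragment below. This is only a sketch of the advice in this comment, not something checked against the chart templates: the GPU limit is dropped from the upper kubePlex block and set in the bottom resources block instead, and everything else stays as posted.

```yaml
kubePlex:
  enabled: true
  loglevel: ""
  image:
    repository: ghcr.io/ressu/kube-plex
    tag: latest
    pullPolicy: Always
  # No GPU limit here any more (this was the "upper" declaration).

# ... rest of values.yml unchanged ...

# The "lower" declaration: replaces `resources: {}` at the bottom of the file.
resources:
  limits:
    gpu.intel.com/i915: 1
```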

In the end, I don't see much value in running kube-plex in a cluster with just a single GPU for transcoding, as you can only ever transcode a single stream at a time, while native Plex is able to fall back to non-GPU transcoding.
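
For context on the reservation behaviour described above: Kubernetes treats a device-plugin GPU as an extended resource, which is requested in whole integer units and is not overcommitted. A minimal sketch of a pod requesting one Intel GPU, assuming the `gpu.intel.com/i915` resource name used elsewhere in this thread. The pod and container names are hypothetical and this is not the manifest kube-plex generates:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-example            # hypothetical name, for illustration only
spec:
  containers:
    - name: transcoder         # hypothetical container name
      image: plexinc/pms-docker:latest
      resources:
        limits:
          # Extended resources are whole integers. When only a limit is
          # given, Kubernetes uses the same value as the request.
          gpu.intel.com/i915: 1
```

While a pod like this is running, the unit it holds is unavailable to other pods on that node, so another pod requesting the same resource stays Pending unless a different node still advertises a free unit.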

kulisau commented 1 year ago

So the template is misleading: it suggests defining the GPU here instead of in the bottom section. It also suggests that the declaration is applied to kubePlex (at least to my limited k8s knowledge) rather than to Plex itself.

Saying that the cluster has a single GPU isn't exactly true. I have 3 nodes in the cluster and each of them exposes its own GPU:

NAME                              READY   STATUS    RESTARTS        AGE     IP            NODE      NOMINATED NODE   READINESS GATES
intel-gpu-plugin-prt7z            1/1     Running   1 (5d14h ago)   5d15h   10.42.0.149   kube-01   <none>           <none>
intel-gpu-plugin-hkj7t            1/1     Running   1 (5d14h ago)   5d15h   10.42.2.253   kube-03   <none>           <none>
intel-gpu-plugin-q4fdf            1/1     Running   1 (5d14h ago)   5d15h   10.42.1.75    kube-02   <none>           <none>
plex-kube-plex-5d44d7fd48-nqk9t   1/1     Running   0               11h     10.42.1.191   kube-02   <none>           <none>

Since even the lowest-end Celeron iGPU is capable of transcoding several streams simultaneously, maybe you'd consider reusing the pod for a set number of streams?

Most of the time the transcoding pod runs on a separate node anyway, so even if the theory that a GPU can't be shared has some grounds, it does not seem to apply here.

If the Plex pod itself didn't have HW transcoding capability, it wouldn't "tell" the transcoding pod to transcode with HW acceleration, right?

P.S. any idea why the Big file does not even trigger the transcoding pod and runs on the plex itself?

kulisau commented 1 year ago

After further inspection it seems that you were right. Most of the time the transcoder pod is deployed on the same node as the Plex one. Sometimes it gets recreated on a different k8s node, but the result is the same: a crash.

ressu commented 1 year ago

> So the template is misleading: it suggests defining the GPU here instead of in the bottom section. It also suggests that the declaration is applied to kubePlex (at least to my limited k8s knowledge) rather than to Plex itself.

Whoops! You are right. The location you have is correct. My mistake.

https://github.com/ressu/kube-plex/blob/48f44b0e73c0316b0c2ca95d5e07d880f4913f4e/charts/kube-plex/templates/deployment.yaml#L37-L42
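
As I read the exchange above, the GPU request for the transcoder pods belongs under `kubePlex.resources`, while the top-level `resources` block applies to the Plex pod itself. A minimal sketch of just that part of values.yaml, assuming one `gpu.intel.com/i915` device per transcoder pod (the count is an assumption, not a recommendation):

```yaml
kubePlex:
  enabled: true
  # Resources for the transcoder pods that kube-plex creates.
  resources:
    requests:
      gpu.intel.com/i915: "1"   # assumed count
    limits:
      gpu.intel.com/i915: "1"   # extended resources need requests == limits

# Resources for the main Plex pod; left empty so it does not hold a GPU.
resources: {}
```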

> Saying that the cluster has a single GPU isn't exactly true. I have 3 nodes in the cluster and each of them exposes its own GPU: [...]

Ok, in that case you shouldn't have issues with transcoding. I've seen a few cases where a single GPU has been the bottleneck, which is why I jumped to that conclusion.

> Since even the lowest-end Celeron iGPU is capable of transcoding several streams simultaneously, maybe you'd consider reusing the pod for a set number of streams?

This is something that I've been thinking about when playing around with the idea of pre-creating the transcoder pods for performance reasons. The current model works nicely and is very low overhead, but is a bit complex when it comes to pod management.

> If the Plex pod itself didn't have HW transcoding capability, it wouldn't "tell" the transcoding pod to transcode with HW acceleration, right?

Hardware transcoding is defined as a flag in the Plex configuration. The main process doesn't try to detect whether a GPU is available at runtime. At least this used to be the case. So it shouldn't be an issue that the main process doesn't have a GPU assigned.

> P.S. any idea why the Big file does not even trigger the transcoding pod and runs on the plex itself?

There are a few corner cases where transcoding is not relayed to kube-plex. EAC is one of those corner cases. EAC makes heavy use of temporary files to communicate between the main process and the transcoder, so unless we find a way to sync those temporary files between the main process and the transcoders, there isn't really a useful way to handle those files with kube-plex.

kulisau commented 1 year ago

I tried assigning the Plex pod to the kube-03 node to test your theory about pods being unable to share the GPU. The transcoder launched on kube-02 and the movie played back, but without HW acc. Let's consider a hypothetical situation: HW transcoding is working, the main Plex pod does not have an assigned GPU, and the transcoding pod has one. Would it still report to the GUI that video is being transcoded with HW acc?

I did a quick Google search and found out that at least Intel supports fractional sharing of a GPU among multiple containers: https://intel.github.io/intel-device-plugins-for-kubernetes/cmd/gpu_plugin/README.html#operation-modes-for-different-workload-types

I'll try the shared mode and let you know if it makes any difference...
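
For reference, a rough sketch of what enabling the shared mode could look like. It assumes the plugin's `-shared-dev-num` flag as described in the README linked above; the value 4 is arbitrary, and the exact flag name and default should be checked against the plugin version actually deployed. This is only a fragment of the intel-gpu-plugin DaemonSet, not a complete manifest:

```yaml
# Fragment of the intel-gpu-plugin DaemonSet spec.
spec:
  template:
    spec:
      containers:
        - name: intel-gpu-plugin
          args:
            # Assumption: allow up to 4 containers to share each GPU device.
            - -shared-dev-num=4
```

With a setting like this, each node would advertise 4 units of `gpu.intel.com/i915` instead of 1, so the Plex pod and several transcoder pods could be scheduled against the same iGPU.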

erustyrannus commented 5 months ago

Hi, I am facing the same issue. Did you find a solution? I'm not sure whether something is off with the values I am using.

Values

image:
  repository: plexinc/pms-docker
  tag: latest
  pullPolicy: Always
kubePlex:
  enabled: true
  loglevel: ""
  image:
    repository: ghcr.io/ressu/kube-plex
    tag: latest
    pullPolicy: Always
  resources:
    requests:
      gpu.intel.com/i915: "2"
    limits:
      gpu.intel.com/i915: "2"
  mounts: /data,/transcode
claimToken: ""
timezone: Redacted
extraEnv:
  puid: 1000
  pgid: 1000
service:
  type: LoadBalancer
  port: 32400
  annotations: {}
  labels: {}
  loadBalancerIP: 10.0.0.98
probes:
  liveness:
    enabled: true
    custom: false
    spec:
      initialDelaySeconds: 0
      periodSeconds: 10
      timeoutSeconds: 1
      failureThreshold: 3
  readiness:
    enabled: true
    custom: false
    spec:
      initialDelaySeconds: 0
      periodSeconds: 10
      timeoutSeconds: 1
      failureThreshold: 3
  startup:
    enabled: true
    custom: false
    spec:
      initialDelaySeconds: 0
      timeoutSeconds: 1
      periodSeconds: 5
      failureThreshold: 30
ingress: #Redacted
rbac:
  create: true
nodeSelector:
  kubernetes.io/arch: amd64
persistence:
  transcode:
    claimName: kube-plex-transcode
    accessMode: ReadWriteMany
  data:
    claimName: kube-plex-data
    accessMode: ReadWriteMany
  config:
    claimName: kube-plex-config
    accessMode: ReadWriteMany
resources: {}
podAnnotations: {}
deploymentAnnotations: {}
proxy:
  enable: false
tolerations: []
affinity: {}