pabloromeo / clusterplex

ClusterPlex is an extended version of Plex, which supports distributed Workers across a cluster to handle transcoding requests.
MIT License
470 stars 36 forks source link

Remote transcoding not playing back but still happening #82

Closed ednxzu closed 3 years ago

ednxzu commented 3 years ago

Describe the bug When remote transcoding, everything seems to be running fine, but the actual pms never plays back content even tho worrkers seem to transcode properly.

To Reproduce Brand new install with recommended setup, and NFS share mounted across all swarm nodes

Desktop (please complete the following information):

Additional context When starting a playback that is supposed to be remotely transcoded, web player loads to infinity, while worker is transcoding and orchestrator waits on in progress status :

LOGS FROM PMS :

Calling external transcoder: /app/transcoder.js
ON_DEATH: debug mode enabled for pid [561]
Setting VERBOSE to ON
Sending request to orchestrator on: http://plex-orchestrator:3500
cwd => "/tmp/transcode/Transcode/Sessions/plex-transcode-dgu72pnrfonpbdfugze6624a-0144862e-968f-4e4b-8e7c-f4b857ba0813"
args => ["-codec:0","hevc","-analyzeduration","20000000","-probesize","20000000","-i","/data/tv/blue.mountain.state.s02e02.the.fingering.webdl-1080p.x265.mkv","-filter_complex","[0:0]scale=w=1920:h=1080[0];[0]format=pix_fmts=yuv420p|nv12[1]","-map","[1]"
,"-codec:0","libx264","-crf:0","16","-maxrate:0","16866k","-bufsize:0","33732k","-r:0","23.975999999999999","-preset:0","veryfast","-x264opts:0","subme=0:me_range=4:rc_lookahead=10:me=dia:no_chroma_me:8x8dct=0:partitions=none","-force_key_frames:0","expr
:gte(t,n_forced*1)","-map","0:1","-metadata:s:1","language=eng","-codec:1","copy","-copypriorss:1","0","-f","dash","-seg_duration","1","-dash_segment_type","mp4","-init_seg_name","init-stream$RepresentationID$.m4s","-media_seg_name","chunk-stream$Represe
ntationID$-$Number%05d$.m4s","-window_size","5","-delete_removed","false","-skip_to_segment","1","-time_delta","0.0625","-manifest_name","http://192.168.1.110:32400/video/:/transcode/session/dgu72pnrfonpbdfugze6624a/0144862e-968f-4e4b-8e7c-f4b857ba0813/m
anifest?X-Plex-Http-Pipeline=infinite","-avoid_negative_ts","disabled","-map_metadata","-1","-map_chapters","-1","dash","-start_at_zero","-copyts","-vsync","cfr","-y","-nostats","-loglevel","verbose","-loglevel_plex","verbose","-progressurl","http://192.
168.1.110:32400/video/:/transcode/session/dgu72pnrfonpbdfugze6624a/0144862e-968f-4e4b-8e7c-f4b857ba0813/progress"]
env => {"PUID":"1000","PLEX_ARCH":"amd64","HOSTNAME":"7e92b15cc140","LANGUAGE":"en_US.UTF-8","ALLOWED_NETWORKS":"192.168.1.0/24","TRANSCODE_OPERATING_MODE":"both","ADVERTISE_IP":"\"http://192.168.1.110,http://192.168.1.111\"","ORCHESTRATOR_URL":"http://p
lex-orchestrator:3500","PWD":"/tmp/transcode/Transcode/Sessions/plex-transcode-dgu72pnrfonpbdfugze6624a-0144862e-968f-4e4b-8e7c-f4b857ba0813","PLEX_DOWNLOAD":"https://downloads.plex.tv/plex-media-server-new","PLEX_MEDIA_SERVER_MAX_PLUGIN_PROCS":"6","NVID
IA_DRIVER_CAPABILITIES":"compute,video,utility","PMS_IP":"192.168.1.110","PLEX_PASS":"no","TZ":"Europe/Paris","PLEX_MEDIA_SERVER_USER":"abc","HOME":"/root","LANG":"en_US.UTF-8","PGID":"1000","TERM":"xterm","PLEX_MEDIA_SERVER_INFO_VENDOR":"Docker","PLEX_M
EDIA_SERVER_HOME":"/usr/lib/plexmediaserver","X_PLEX_TOKEN":"Ng6cCEFzXNxM3HeqzoD_","PLEX_MEDIA_SERVER_INFO_MODEL":"x86_64","SHLVL":"0","LD_LIBRARY_PATH":"/usr/lib","PLEX_MEDIA_SERVER_INFO_PLATFORM_VERSION":"5.4.0-88-generic","LIBVA_DRIVERS_PATH":"/usr/li
b/plexmediaserver/lib/dri","PLEX_MEDIA_SERVER_APPLICATION_SUPPORT_DIR":"/config/Library/Application Support","CWD":"/","PATH":"/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin","VERSION":"docker","DEBIAN_FRONTEND":"noninteractive","PLEX_MEDIA
_SERVER_INFO_DEVICE":"Docker Container (LinuxServer.io)","FFMPEG_EXTERNAL_LIBS":"/config/Library/Application\\ Support/Plex\\ Media\\ Server/Codecs/be22e26-4019-linux-x86_64/","TRANSCODER_VERBOSE":"1"}
JobPoster connected, announcing
Orchestrator requesting pending work
Sending request to orchestrator on: http://plex-orchestrator:3500

LOGS FROM ORCHESTRATOR :

Setting up websockets
Ready
Server listening on port 3500
Client connected: oK67sucJbsQwocZsAAAB
Registering worker 8eb9e5a4-5fba-4091-86a8-1bbeac1e85f8|plex-worker-cluster01
Registered new worker: 8eb9e5a4-5fba-4091-86a8-1bbeac1e85f8|plex-worker-cluster01
Client connected: Bb20m07I7N83EiCuAAAD
Registering worker bad06902-7ded-4c15-b59b-7168008ee8f5|plex-worker-cluster02
Registered new worker: bad06902-7ded-4c15-b59b-7168008ee8f5|plex-worker-cluster02
Client connected: qU6HO7VqZezNrD2OAAAF
Registered new job poster: bdd2fe9f-3536-4259-bfb1-c065484461dc|7e92b15cc140
Creating single task for the job
Queueing job 30d7ffbc-ac02-4e10-9331-68f6d2a5852a
Queueing task fd7e9be7-cee4-4c40-af0c-5be6ed64bb80
Running task fd7e9be7-cee4-4c40-af0c-5be6ed64bb80
Forwarding work request to bad06902-7ded-4c15-b59b-7168008ee8f5|plex-worker-cluster02
Received update for task fd7e9be7-cee4-4c40-af0c-5be6ed64bb80, status: received
Received update for task fd7e9be7-cee4-4c40-af0c-5be6ed64bb80, status: inprogress

LOGS FROM WORKER :

[Parsed_scale_0 @ 0x7f8eac4db480] w:1920 h:1080 flags:'bilinear' interl:0
Stream mapping:
  Stream #0:0 (hevc) -> scale
  format -> Stream #0:0 (libx264)
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
[Parsed_scale_0 @ 0x7f8ea8cef140] w:1920 h:1080 flags:'bilinear' interl:0
[graph 0 input from stream 0:0 @ 0x7f8ea8cefc00] w:1920 h:1080 pixfmt:yuv420p10le tb:1/1000 fr:24000/1001 sar:1/1 sws_param:flags=2
[Parsed_scale_0 @ 0x7f8ea8cef140] w:1920 h:1080 fmt:yuv420p10le sar:1/1 -> w:1920 h:1080 fmt:yuv420p sar:1/1 flags:0x2
[libx264 @ 0x7f8eac41d940] using SAR=1/1
[libx264 @ 0x7f8eac41d940] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x7f8eac41d940] profile Main, level 4.1, 4:2:0, 8-bit
[libx264 @ 0x7f8eac41d940] 264 - core 161 - H.264/MPEG-4 AVC codec - Copyleft 2003-2020 - http://www.videolan.org/x264.html - options: cabac=1 ref=1 deblock=1:0:0 analyse=0x1:0 me=dia subme=0 psy=1 psy_rd=1.00:0.00 mixed_ref=0 me_range=4 chroma_me=0 trel
lis=0 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=0 threads=9 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 wei
ghtp=1 keyint=250 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=10 rc=crf mbtree=1 crf=16.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 vbv_maxrate=16866 vbv_bufsize=33732 crf_max=0.0 nal_hrd=none filler=0 ip_ratio=1.40 aq=1:1.00
[dash @ 0x7f8eac424040] No bit rate set for stream 0
[dash @ 0x7f8eac424040] Opening 'init-stream0.m4s' for writing
[mp4 @ 0x7f8eac4259c0] Empty MOOV enabled; disabling automatic bitstream filtering
[dash @ 0x7f8eac424040] Representation 0 init segment will be written to: init-stream0.m4s
[dash @ 0x7f8eac424040] No bit rate set for stream 1
[dash @ 0x7f8eac424040] Opening 'init-stream1.m4s' for writing
[mp4 @ 0x7f8eac425380] Empty MOOV enabled; disabling automatic bitstream filtering
[dash @ 0x7f8eac424040] Representation 1 init segment will be written to: init-stream1.m4s
Output #0, dash, to 'dash':
  Metadata:
    encoder         : Lavf58.27.104
    Stream #0:0: Video: h264 (libx264), 1 reference frame, yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 23.98 fps, 11988 tbn, 23.98 tbc (default)
    Metadata:
      encoder         : Lavc58.52.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 16866000/0/0 buffer size: 33732000 vbv_delay: -1
    Stream #0:1(eng): Audio: aac (LC), 48000 Hz, stereo, fltp
[AVIOContext @ 0x7f8e9c543400] Statistics: 0 seeks, 1 writeouts
[dash @ 0x7f8eac424040] Opening 'chunk-stream0-00001.m4s.tmp' for writing
[AVIOContext @ 0x7f8e9c543bc0] Statistics: 0 seeks, 1 writeouts
[dash @ 0x7f8eac424040] Opening 'chunk-stream1-00001.m4s.tmp' for writing
[AVIOContext @ 0x7f8e9c543400] Statistics: 0 seeks, 1 writeouts
[dash @ 0x7f8eac424040] Representation 0 media segment 2 written to: chunk-stream0-00001.m4s
[AVIOContext @ 0x7f8e9b1a4d00] Statistics: 0 seeks, 1 writeouts
[dash @ 0x7f8eac424040] Representation 1 media segment 2 written to: chunk-stream1-00001.m4s
[dash @ 0x7f8eac424040] Opening 'http://192.168.1.110:32400/video/:/transcode/session/g1by5i2qpuntsb0sdgyarimg/4405e0c3-d6a8-40b1-8eb9-c0241875322f/manifest?X-Plex-Http-Pipeline=infinite' for writing
[tcp @ 0x7f8eac41b9c0] Starting connection attempt to 192.168.1.110 port 32400
[tcp @ 0x7f8eac41b9c0] Successfully connected to 192.168.1.110 port 32400
[AVIOContext @ 0x7f8eac43c2c0] Statistics: 0 seeks, 1 writeouts
[dash @ 0x7f8eac424040] Opening 'chunk-stream0-00002.m4s.tmp' for writing
[dash @ 0x7f8eac424040] Opening 'chunk-stream1-00002.m4s.tmp' for writing

And of course the stack file

version: '3.9'

services:
  plex:
    image: ghcr.io/pabloromeo/clusterplex_pms:latest #ghcr.io/linuxserver/plex:latest
    deploy:
      mode: replicated
      replicas: 1
      labels:
          - "traefik.enable=true"
          - "traefik.http.routers.plex.entrypoints=https"
          - "traefik.http.routers.plex.tls=true"
          - "traefik.http.routers.plex.rule=Host(`plextest.ednz.fr`)"
          - "traefik.http.services.plex.loadbalancer.server.port=32400"
    environment:
            #- DOCKER_MODS=ghcr.io/pabloromeo/clusterplex_dockermod:latest
      - VERSION=docker
      - PUID=1000
      - PGID=1000
      - UMASK=002
      - TZ=Europe/Paris
      - ADVERTISE_IP="http://192.168.1.110,http://192.168.1.111"
      - ALLOWED_NETWORKS=192.168.1.0/24
      - PLEX_PASS=no
      - ORCHESTRATOR_URL=http://plex-orchestrator:3500
      - PMS_IP=192.168.1.110
      - TRANSCODE_OPERATING_MODE=remote #(local|remote|both)
      - TRANSCODER_VERBOSE=1   # 1=verbose, 0=silent
    healthcheck:
      test: curl -fsS http://localhost:32400/identity > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 30s
    volumes:
      - ./config:/config
      - ./backups:/backups
      - ./tv:/data/tv
      - ./movies:/data/movies
      - ./transcode:/tmp
      - /etc/localtime:/etc/localtime:ro
    ports:
      - 32469:32469
      - 32400:32400
      - 3005:3005
      - 8324:8324
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp
    networks:
      - traefik-public
      - internal

  plex-orchestrator:
    image: ghcr.io/pabloromeo/clusterplex_orchestrator:latest
    deploy:
      mode: replicated
      replicas: 1
      update_config:
        order: start-first
    environment:
      - TZ=Europe/Paris
      - STREAM_SPLITTING=OFF # ON | OFF (default)
      - LISTENING_PORT=3500
      - WORKER_SELECTION_STRATEGY=LOAD_CPU # RR | LOAD_CPU | LOAD_TASKS | LOAD_RANK (default)
    healthcheck:
      test: curl -fsS http://localhost:3500/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 30s
    volumes:
      - /etc/localtime:/etc/localtime:ro
    ports:
      - 3500:3500
    networks:
      - internal

  plex-worker:
    image: ghcr.io/pabloromeo/clusterplex_worker:latest #ghcr.io/linuxserver/plex:latest
    hostname: "plex-worker-{{.Node.Hostname}}"
    deploy:
      mode: global
      update_config:
        order: start-first
    environment:
            #- DOCKER_MODS=ghcr.io/pabloromeo/clusterplex_worker_dockermod:latest
      - VERSION=docker
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Paris
      - LISTENING_PORT=3501      # used by the healthcheck
      - STAT_CPU_INTERVAL=2000   # interval for reporting worker load metrics
      - ORCHESTRATOR_URL=http://plex-orchestrator:3500
    healthcheck:
      test: curl -fsS http://localhost:3501/health > /dev/null || exit 1
      interval: 15s
      timeout: 15s
      retries: 5
      start_period: 240s
    volumes:
      - ./codecs:/codecs # (optional, can be used to share codecs)
      - ./tv:/data/tv
      - ./movies:/data/movies
      - ./transcode:/tmp
      - /etc/localtime:/etc/localtime:ro
    networks:
      - internal

networks:
  internal:
    driver: overlay
  traefik-public:
    external: true

All this time the CPU load crearly shows that the worker is in fact transcoding. Thanks for everything so far this project is insanely interesting !!!

pabloromeo commented 3 years ago

Did you configure the Transcode path in Plex's configuration as mentioned in the readme? Meaning:

image

From the logs it seems that the Worker is indeed transcoding and also reporting progress back to Plex. My suspicion is that Plex is not finding or being able to read the transcoded segments which are written to the trascoding shared path.

Can you enable Plex logging in the configuration and monitor the output in Plex's Console? There should be errors or warnings there popping up.

ednxzu commented 3 years ago

image

Well I must say I'm really confused lol. Plex is logging that everything works fine when it just loads the thing to infinity.. the only error I'm getting is when I stop the transcode because plex tells me that it can't remove the transcode folder due to it not being empty..

pabloromeo commented 3 years ago

That's odd! Have you confirmed that playback does indeed work when running without the clusterplex dockermod or using the standard linuxserver plex image? You can just comment out the dockermod or change the base image, everything else can stay the same.

If you try playing content in a browser with DevTools open, what sort of errors or network status codes occur when requesting segments?

ednxzu commented 3 years ago

It really is ! The playback works fine if no mods are on and even with the custom images if pms is handling transcode (transcode mode : both). however the network console is giving me some 404 on header for things that look like transcode chunks ??(see screenshot) image and some other error cause I dont have a token but that shouldnt be an issue, the weird ones really are the 404 on headers.

image

ednxzu commented 3 years ago

and this is from my LAN, I don't get errors (they might be caused by cloudflare proxy then) image

pabloromeo commented 3 years ago

So from your LAN you get no errors but playback still doesn't begin?

I'd say be careful with proxying Plex traffic through CF, I believe proxying video is against their T&C and they tend to block accounts that do.

It's pretty puzzling, I mean, ultimately clusterplex doesn't really do anything in regards to playback, Plex should be picking up the segments from the shared path and streaming those.

Could you try the "Convert" functionality as a test? Meaning getting any random media and Optimizing it for another media type, just to see if that conversion indicates progress and finishes correctly? That one should use the remote transcoder as well.

ednxzu commented 3 years ago

From LAN I dont get any errors while playing transcoded content, but GET requests to the transcoded chunks just never post any status (see screenshot). From Internet (without cloudflare in between) I get header 404 and same playback problem where GET on chunks never give a result. image

It's probbly the weirdest problem I've had in a long time lol I don't even have any idea what might be going on... everything looks fine, but yet it's not..

ednxzu commented 3 years ago

And direct play works fine image

pabloromeo commented 3 years ago

So weird! Now, those GET requests with no status have no info at all there in devtools? Could it be a permissions issue when PMS is attempting to read the chunk from /tmp? any chance to grant +777 to the source of the /tmp mount?

ednxzu commented 3 years ago

The /tmp is already 777. GET request have nothing except a header (following screenshot) image

and a stack trace that doesnt tell much image

ednxzu commented 3 years ago

I just noticed chunks are generated with 644 perms is that normal behaviour ?

pabloromeo commented 3 years ago

Yup, that looks about right: image

Have you tried connecting through https://app.plex.tv/ instead of the LAN IP? To see if there's any difference in how the player behaves.

In my Network section of Plex's configuration I have defined several settings, such as LAN Network with my real LAN subnets and a few docker subnets, as well as the Custom server access URLs (i've included my custom domain there, as well as the LAN ip url). Also in my network i've configured "List of IP addresses and networks that are allowed without auth" to the following (my LAN and docker network subnets): 192.168.0.0/16,10.0.0.0/16,10.255.0.0/16,172.18.0.0/24 I haven't documented those since I believe they aren't mandatory for it to work, but reviewing the config might be worth a try.

See https://github.com/pabloromeo/clusterplex/issues/31#issuecomment-717208287 to see if this user's solution works.

ednxzu commented 3 years ago

I kept trying to figure out what I was missing but I can't for the sake of me, make it work. I'm just going to give up for now. You can close this ticket or leave it open if someone's facing a similar issue and wants to share it here.

Best,

pabloromeo commented 3 years ago

Sorry to hear about that! I'll be closing it for now, but will use it as reference in case someone is having a similar issue. Thanks!