wnielson / Plex-Remote-Transcoder

A distributed transcoding backend for Plex
MIT License

plex_transcoder issues #18

Open kcconnor opened 8 years ago

kcconnor commented 8 years ago

Hello:

I've installed PRT with a Gigabyte BRIX running Plex as the master and a VirtualBox Ubuntu VM on my laptop as the slave.

Media is stored on a Synology DS214play NAS. The BRIX and the NAS are on the same gigabit switch, and the VM connects via 802.11n wireless at 300 Mbit/s. The BRIX mounts the media library via NFS. The transcode temp directory is located on the NAS, and the BRIX exports /var/lib/plexmediaserver and /usr/lib/plexmediaserver via NFS. The VM mounts all 4 NFS shares. UIDs for the plex user match across all 3 devices, and ssh is seamless for the plex user.

Prior to installing PRT on the BRIX, that little Celeron-based device could handle transcoding of 1-2 sessions at a time with no issues.

When I have the VM environment turned on, transcoding keeps up with demand and I can run about 4 simultaneous transcode sessions before overloading the VM (I have 3 cores assigned on the host laptop's i7 CPU).

However, when I turn off the VM and the BRIX has no external transcoder to rely upon, things get interesting. Plex/PRT winds up spawning multiple instances of plex_transcoder (anywhere from 2 to 6). Those instances do not keep up with the demand of the viewing application. When I stop and close a viewing session on the Plex server, the plex_transcoder instances do not exit; they remain the most CPU-intensive processes on the system, leaving 0% idle. If I jump forward or backward in the timeline of a video (which tells the transcoder to work on a different segment of the media), the old plex_transcoder processes are not terminated and new ones are spawned, starving the system of CPU resources.

Also... if I turn off the PRT slave VM while someone is watching something transcoded by it, the user reaches the end of the portion of the media that has been transcoded and then hangs indefinitely, which indicates to me that the cluster functionality of PRT does not take into account failure of a slave to finish a transcoded portion.

This happens whether the PRT master is not listed in prt.conf as a slave at all, is listed as 127.0.0.1, or is listed by its proper IP address.

If I restore plex_transcoder to its original name, Plex is able to stop transcode sessions when the user stops viewing that particular media.

I suspect one of two things. Either there are status and/or exit codes from Plex New Transcoder that the PRT replacement wrapper does not report back to the Plex Media Server, or PMS tracks PIDs and kills them when they are no longer wanted on localhost, but cannot do so through plex_transcoder or over SSH on a PRT slave.
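
To make that hypothesis concrete, here is a minimal Python sketch of what a wrapper would need to do for the parent to stop a transcode and see its exit status. It is not PRT's actual code; `run_and_forward` is a hypothetical name, and it only covers a locally spawned child (a process started over SSH on a slave would likely need separate handling).

```python
import signal
import subprocess

def run_and_forward(cmd):
    """Run cmd, relay SIGTERM/SIGINT to it, and return its exit code.

    Hypothetical helper for illustration only, not part of PRT.
    """
    child = subprocess.Popen(cmd)

    def relay(signum, _frame):
        # Pass the signal on so the real transcoder stops when the wrapper is told to stop.
        if child.poll() is None:
            child.send_signal(signum)

    # Install the relay handlers, remembering the old ones so they can be restored.
    previous = {s: signal.signal(s, relay) for s in (signal.SIGTERM, signal.SIGINT)}
    try:
        # The child's exit status becomes the wrapper's return value.
        return child.wait()
    finally:
        for s, handler in previous.items():
            signal.signal(s, handler)
```

A wrapper script could end with `sys.exit(run_and_forward(sys.argv[1:]))` so the caller sees the same exit code the real transcoder produced. If the wrapper instead swallowed the signal or always exited 0, the symptoms would plausibly look like the orphaned processes described above.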

My ultimate goal with PRT is to leave the BRIX on as a low wattage Plex Master, and have a script running periodically to see if transcode tasks on the BRIX merit sending a WoL packet to a larger box with more CPU resources, and having transcoding shift via the prt get_cluster_load functionality once the larger box is awake. Then when transcode duties are idle on the big box, it goes back to hibernation.
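
A minimal Python sketch of the wake-up half of such a script follows. The magic packet layout (6 bytes of 0xFF followed by the target MAC repeated 16 times, sent as a UDP broadcast) is the standard Wake-on-LAN format; everything else is an assumption for illustration: the function names, the threshold of 2 sessions (taken from what the BRIX handled comfortably on its own), and the idea that the caller supplies the current transcode count by some other means.

```python
import socket

def build_magic_packet(mac: str) -> bytes:
    """Return the standard Wake-on-LAN payload: 6 x 0xFF, then the MAC 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    return b"\xff" * 6 + bytes.fromhex(hex_digits) * 16

def should_wake(active_transcodes: int, threshold: int = 2) -> bool:
    """Assumed policy: wake the big box once the master exceeds `threshold` sessions."""
    return active_transcodes > threshold

def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local network (UDP port 9 is conventional)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Run from cron or a timer, the script would call `send_magic_packet(...)` with the big box's MAC address whenever `should_wake(...)` returns true. The hibernate side is not covered here.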

kcconnor commented 8 years ago

Update: These odd issues seem to appear only early in a particular computer's uptime, or if a slave becomes unreachable. I'm thinking that the stats returned by "prt get_cluster_load" might go unhandled if they are blank or null, maybe?
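
As an illustration of the kind of guard this would call for, here is a small Python sketch. It is not PRT's code, and the input format is an assumption: each host is taken to report a whitespace-separated string of load averages (1, 5 and 15 minutes), with a blank or missing string for a host that did not answer. `parse_load` and `pick_host` are hypothetical names.

```python
from typing import Dict, Optional

def parse_load(raw: Optional[str]) -> Optional[float]:
    """Return the 1-minute load from an 'l1 l5 l15' string, or None if unusable."""
    if not raw or not raw.strip():
        return None  # blank or missing reply, e.g. an unreachable slave
    try:
        return float(raw.split()[0])
    except ValueError:
        return None  # reply was present but not numeric

def pick_host(raw_loads: Dict[str, Optional[str]], local: str = "localhost") -> str:
    """Choose the least-loaded host with a usable reading, else fall back to `local`."""
    loads = {host: parse_load(raw) for host, raw in raw_loads.items()}
    usable = {host: load for host, load in loads.items() if load is not None}
    if not usable:
        return local
    return min(usable, key=usable.get)
```

With this shape, a slave that returns nothing is simply skipped, and if no host reports a usable load the transcode stays on the master instead of being dispatched to a host that cannot run it.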