brandan-schmitz / plexbot

Plexbot - A discord bot for automating movie libraries.

Implement video encoding/optimization #6

Closed brandan-schmitz closed 3 years ago

brandan-schmitz commented 3 years ago

Create a command that will trigger a process to import video files in a given directory. The directory will have a structure like the following:

processing-queue
|- tv
|   |-  'Arrow - s01e05.mkv'
|   |-  'Game of Thrones - s07e05.mov'
|
|- movies
    |-  'tt4154756.mkv'
    |-  'tt1477834.mkv'

The command should accept one optional parameter specifying what the user wishes to optimize: either tv or movies. If a user enters "tv", the bot should process everything in the tv folder; if they enter "movies", it should process everything in the movies folder. If a user does not enter a parameter, the command should default to optimizing everything in both folders.
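A minimal Python sketch of how the optional parameter could be resolved into the folders to scan. The function name `resolve_targets` is hypothetical, and treating the parameter case-insensitively is an assumption not stated above.

```python
from typing import List, Optional

# Folder names taken from the queue layout described above.
VALID_TARGETS = ("tv", "movies")


def resolve_targets(parameter: Optional[str]) -> List[str]:
    """Return the queue folders to process for the optional command parameter."""
    # No parameter (or only whitespace) means "optimize everything".
    if parameter is None or not parameter.strip():
        return list(VALID_TARGETS)
    # Assumption: input is matched case-insensitively.
    choice = parameter.strip().lower()
    if choice not in VALID_TARGETS:
        raise ValueError(f"expected 'tv' or 'movies', got {parameter!r}")
    return [choice]
```

Rejecting unknown values with an error, rather than silently falling back to both folders, keeps a typo from starting a full import.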

The workflow of this command should be similar to the following and should use the bot's work pool. Everything below should be executed within a single job in the work pool, as the HandBrake encode should only be allowed to process one media file at a time.
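One way to picture the "single job" requirement, as a rough Python sketch. `encode_pool`, `run_import`, and `submit_import` are hypothetical names, and a one-worker `ThreadPoolExecutor` is only a stand-in for the bot's actual work pool.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List, Sequence

# Assumption: stand-in for the bot's work pool. A single worker also keeps
# two separate import requests from encoding at the same time.
encode_pool = ThreadPoolExecutor(max_workers=1)


def run_import(files: Sequence[str], encode: Callable[[str], str]) -> List[str]:
    """Encode every file in order, one at a time, inside one job."""
    return [encode(path) for path in files]


def submit_import(files: Sequence[str], encode: Callable[[str], str]) -> Future:
    """Queue the whole import as a single job on the pool."""
    return encode_pool.submit(run_import, files, encode)
```

Because the loop lives inside one job, the next file is not started until the previous encode has returned.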

  1. User requests media optimization through a command: !import <optional parameter>

  2. Bot fetches a list of all the files in the directories it needs to scan.

  3. Bot matches media to their proper IMDB items. This can be done with the OMDB API. When scanning movies, simply use the IMDB ID that is the filename of the movie. When scanning TV shows, however, the bot will need to parse the filename to extract the relevant information. The filename encodes the following:

    • TV Show Name: Anything before the dash in the filename: Arrow - s01e05.mkv, would be a show name of Arrow.
    • Season number: This should be the series of digits directly following the s in the filename. Any leading zeros should be removed during parsing: Arrow - s01e05.mkv would be a season number of 1.
    • Episode number: Just like the season number, this should be the series of digits directly following the e in the filename. Again, any leading zeros should be removed during parsing: Arrow - s01e05.mkv would be an episode number of 5.
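The parsing rules above can be sketched with a regular expression. This is a minimal Python example; `EpisodeInfo` and `parse_episode_filename` are hypothetical names, and accepting an upper-case S/E is an assumption.

```python
import re
from typing import NamedTuple


class EpisodeInfo(NamedTuple):
    show: str
    season: int
    episode: int


# "<show> - s<season>e<episode>.<extension>", e.g. "Arrow - s01e05.mkv".
_EPISODE_RE = re.compile(
    r"^(?P<show>.+?)\s+-\s+s(?P<season>\d+)e(?P<episode>\d+)\.[^.]+$",
    re.IGNORECASE,  # assumption: "S01E05" is accepted as well
)


def parse_episode_filename(filename: str) -> EpisodeInfo:
    """Extract show name, season, and episode from a queue filename."""
    match = _EPISODE_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized TV filename: {filename!r}")
    # int() drops leading zeros: "01" -> 1, "05" -> 5.
    return EpisodeInfo(
        show=match["show"].strip(),
        season=int(match["season"]),
        episode=int(match["episode"]),
    )
```

Requiring whitespace around the separating dash means a hyphen inside the show name itself is not mistaken for the separator.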

    Once the bot has parsed the information from the filename of a TV episode, it will first need to search for the series to find the series' IMDB ID. With the OMDB API, this request would look like this:

    http://www.omdbapi.com/?apikey=<APIKEY>&type=series&s=Arrow

    This would return a list of shows matching the search criteria. From there, the bot will need to prompt the user to select the correct show for that media file. Once the user has chosen, the bot should use the IMDB ID of the selected show to fetch the information about the episode.

    http://www.omdbapi.com/?apikey=<APIKEY>&i=tt2193021&season=1&episode=5
    {
     "Title": "Damaged",
     "Year": "2012",
     "Rated": "TV-14",
     "Released": "07 Nov 2012",
     "Season": "1",
     "Episode": "5",
     "Runtime": "45 min",
     "Genre": "Action, Adventure, Crime, Drama, Mystery, Sci-Fi",
     "Director": "Michael Schultz",
     "Writer": "Greg Berlanti (developed by), Marc Guggenheim (developed by), Andrew Kreisberg (developed by), Wendy Mericle, Ben Sokolowski",
     "Actors": "Stephen Amell, Katie Cassidy, Colin Donnell, David Ramsey",
     "Plot": "Oliver is accused of being the hooded archer and is put under house arrest. Also, he thinks back to his time on the island, where he first met Edward Fyers and Deathstroke.",
     "Language": "English",
     "Country": "USA",
     "Awards": "N/A",
     "Poster": "https://m.media-amazon.com/images/M/MV5BMTkxMzM0NTg4OF5BMl5BanBnXkFtZTcwMDc0MTU2OA@@._V1_SX300.jpg",
     "Ratings": [
       {
         "Source": "Internet Movie Database",
         "Value": "8.7/10"
       }
     ],
     "Metascore": "N/A",
     "imdbRating": "8.7",
     "imdbVotes": "4447",
     "imdbID": "tt2338426",
     "seriesID": "tt2193021",
     "Type": "episode",
     "Response": "True"
    }
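The two requests above could be assembled as follows. This is a small Python sketch; the function names are hypothetical, and the handling of a failed lookup assumes OMDB reports failure as "Response": "False" with an "Error" message, which is not shown in the example response.

```python
from typing import Mapping, Tuple
from urllib.parse import urlencode

OMDB_BASE = "http://www.omdbapi.com/"


def series_search_url(api_key: str, show_name: str) -> str:
    """URL for searching series by name (first request above)."""
    return OMDB_BASE + "?" + urlencode(
        {"apikey": api_key, "type": "series", "s": show_name}
    )


def episode_url(api_key: str, series_imdb_id: str, season: int, episode: int) -> str:
    """URL for fetching one episode of a known series (second request above)."""
    return OMDB_BASE + "?" + urlencode(
        {"apikey": api_key, "i": series_imdb_id, "season": season, "episode": episode}
    )


def episode_id_and_title(payload: Mapping[str, str]) -> Tuple[str, str]:
    """Pull the episode IMDB ID and title out of a decoded JSON response."""
    # Assumption: failures carry "Response": "False" plus an "Error" field.
    if payload.get("Response") != "True":
        raise LookupError(payload.get("Error", "OMDB lookup failed"))
    return payload["imdbID"], payload["Title"]
```

Using `urlencode` takes care of show names containing spaces or other characters that are not URL-safe.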
  4. Build a list of the files to be processed. These will likely get mapped to a custom object and will need to contain the current filename, the name of the show, the season number, the episode number, the episode IMDB ID, and the episode title (from IMDB). If it is a movie, it will need to get the name of the movie, the year it was released, and the resolution of the current file (query the file itself for this).
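The "custom object" could look something like the following Python sketch. The class and field names are hypothetical; only the set of fields comes from the step above, and storing the resolution as a string is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpisodeItem:
    filename: str          # current filename in the queue folder
    show_name: str
    season: int
    episode: int
    episode_imdb_id: str
    episode_title: str     # title as reported by IMDB


@dataclass(frozen=True)
class MovieItem:
    filename: str          # current filename in the queue folder
    title: str
    year: int
    resolution: str        # assumption: stored as text such as "1920x1080"


# Example built from the Arrow episode shown earlier.
arrow = EpisodeItem(
    filename="Arrow - s01e05.mkv",
    show_name="Arrow",
    season=1,
    episode=5,
    episode_imdb_id="tt2338426",
    episode_title="Damaged",
)
```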

  5. Go through each file in the media list created above and pass its information through a process builder to HandBrakeCLI. An example of the command that should be invoked for the Arrow - s01e05.mkv file is below. Notice how it uses the information gathered about the video file to place the show in the correct folder structure and name the file properly. Additionally, it tells HandBrake to encode the file using the H.264 (x264) encoder with a quality of 22 and AAC audio at a bitrate of 192 kbps.

    HandBrakeCLI -i /media/import-queue/tv/'Arrow - s01e05.mkv' -o /media/tv/'Arrow'/'Season 1'/'Arrow - s01e05 - Damaged.mp4' -e x264 -q 22 -B 192
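A rough Python sketch of how the output path and argument list for that command might be built. The helper names are hypothetical; the flags are exactly those in the command above. Characters in episode titles that are invalid in filenames are not handled here.

```python
from pathlib import PurePosixPath
from typing import List


def episode_output_path(library_root: str, show: str, season: int,
                        title: str, source_name: str) -> str:
    """Build '<root>/<show>/Season <n>/<original stem> - <title>.mp4'."""
    stem = PurePosixPath(source_name).stem  # "Arrow - s01e05"
    return str(
        PurePosixPath(library_root) / show / f"Season {season}" / f"{stem} - {title}.mp4"
    )


def handbrake_command(source: str, destination: str) -> List[str]:
    """Argument list mirroring the example command."""
    return [
        "HandBrakeCLI",
        "-i", source,
        "-o", destination,
        "-e", "x264",   # video encoder
        "-q", "22",     # quality
        "-B", "192",    # audio bitrate
    ]
```

Passing the arguments as a list (for example to `subprocess.Popen`) means paths containing spaces need no shell quoting.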
  6. Monitor the HandBrake command by reading the process's output stream. This is used to determine several things:

    • Whether an error occurred
    • The current progress
    • The ETA
    • When the encoding is done, so the process can be notified

    This should also update the status message about the import with the current progress every 5 seconds.
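The monitoring step could be sketched like this in Python. It assumes HandBrakeCLI progress lines of the form `Encoding: task 1 of 1, 45.32 % (78.51 fps, avg 80.12 fps, ETA 00h03m21s)`; that format should be checked against the HandBrake version in use. `parse_progress` and `Throttle` are hypothetical names.

```python
import re
import time
from typing import Callable, Optional, Tuple

# Assumed progress format; the fps/ETA part may be absent early in the encode.
_PROGRESS_RE = re.compile(
    r"Encoding: task \d+ of \d+, (?P<percent>\d+(?:\.\d+)?) %"
    r"(?:.*?ETA (?P<eta>\d+h\d+m\d+s))?"
)


def parse_progress(line: str) -> Optional[Tuple[float, Optional[str]]]:
    """Return (percent, eta) for a progress line, or None for any other output."""
    match = _PROGRESS_RE.search(line)
    if match is None:
        return None
    return float(match["percent"]), match["eta"]


class Throttle:
    """Allows an action at most once per interval (5 seconds by default)."""

    def __init__(self, interval: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval = interval
        self._clock = clock
        self._last: Optional[float] = None

    def ready(self) -> bool:
        now = self._clock()
        if self._last is None or now - self._last >= self._interval:
            self._last = now
            return True
        return False
```

A reader loop would call `parse_progress` on each chunk of output and only edit the status message when `Throttle.ready()` returns true. Errors and completion can then be detected from the process exit code once the stream closes.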

  7. When the encoding process has finished, the bot should delete the original media file from the queue folder and then add the information about the imported media to the bot's database. This is where information such as the episode and show IMDB IDs and the media file resolution will be used.

brandan-schmitz commented 3 years ago

Make a custom wrapper for ffmpeg or HandBrake similar to the Jaffree project, but limited in scope to only what is required and cleaned up a bit. This should also implement a master/client system to allow for distributed processing of media files. A new API server will need to be embedded into the bot to answer queries from client instances. The client should live within this same project but use a command implemented through PicoCLI to determine that it should run only as a client worker for processing media. This will require #3 to be implemented first.

brandan-schmitz commented 3 years ago

This should run on a queue-based system rather than a command as originally planned. When a media file gets imported into the bot, the bot will add it to an optimization queue if the file is not already in an optimized state or a preferred container type. Client nodes will automatically pick up work from this queue as they are able to.
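A very rough Python sketch of the queue idea, using an in-process queue as a stand-in for whatever the embedded API server would expose to client nodes. All names are hypothetical. It reads the condition as "skip the file if it is already optimized or already in a preferred container", which is one interpretation of the sentence above.

```python
import queue
from typing import Optional

# Assumption: in-process stand-in for the shared optimization queue.
optimize_queue: "queue.Queue[str]" = queue.Queue()


def enqueue_if_needed(path: str, is_optimized: bool,
                      in_preferred_container: bool) -> bool:
    """Queue the file unless it is already optimized or in a preferred container."""
    if is_optimized or in_preferred_container:
        return False
    optimize_queue.put(path)
    return True


def claim_next() -> Optional[str]:
    """Called by a client node when it has capacity; None means no work."""
    try:
        return optimize_queue.get_nowait()
    except queue.Empty:
        return None
```

Having clients pull work when they are free, instead of the bot pushing it, means a slow or offline node simply claims fewer files.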