pymedusa / Medusa

Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic.
https://pymedusa.com
GNU General Public License v3.0

Post processor not matching show correctly (absolute episode numbering) #6025

Closed johnmaguire closed 5 years ago

johnmaguire commented 5 years ago

Describe the bug

I have all seasons of a show (Xena: Warrior Princess) on my server, and am attempting to have the post processor create hard links to my TV Shows directory. Two seasons work correctly, and the other four do not. Here are the two formats:

Working: Xena - [02x01] - Orphan of War.mkv
Not working: xena.e113.coming.home.ntsc.dvd.dd5.1.x264hi10-lightspeed.mkv

The associated log message is:

2019-01-16 18:30:09 WARNING  POSTPROCESSOR :: [918cfe7] Processing failed for /downloads/Downloads/completed/Xena.Warrior.Princess.S06.NTSC.DVD.DD5.1.X264Hi10-LiGHTSPEED/xena.e113.coming.home.ntsc.dvd.dd5.1.x264hi10-lightspeed.mkv: This show isn't in your list, you need to add it before post-processing an episode

I wondered if this was an issue with guessit, so I ran some tests:

(guessit) jmaguire@scorpion [06:22:17 PM] [~/repos/guessit/guessit] [develop]
-> % guessit 'xena.e113.coming.home.ntsc.dvd.dd5.1.x264hi10-lightspeed.mkv' -T 'Xena: Warrior Princess'
For: xena.e113.coming.home.ntsc.dvd.dd5.1.x264hi10-lightspeed.mkv
GuessIt found: {
    "title": "xena",
    "episode": 113,
    "episode_title": "coming home",
    "other": "NTSC",
    "source": "DVD",
    "audio_codec": "Dolby Digital",
    "audio_channels": "5.1",
    "container": "mkv",
    "mimetype": "video/x-matroska",
    "type": "episode"
}
(guessit) jmaguire@scorpion [06:22:42 PM] [~/repos/guessit/guessit] [develop]
-> % guessit 'Xena - [02x01] - Orphan of War.mkv'
For: Xena - [02x01] - Orphan of War.mkv
GuessIt found: {
    "title": "Xena",
    "season": 2,
    "episode": 1,
    "episode_title": "Orphan of War",
    "container": "mkv",
    "mimetype": "video/x-matroska",
    "type": "episode"
}

It detects the title and episode titles correctly. My best guess is that the absolute episode numbering is causing issues. I'm happy to dig into this issue further and attempt to develop a fix if pointed in the right direction. (Where should I start debugging?)


p0psicles commented 5 years ago
  1. Medusa does some additional processing on top of guessit. You can test that using python report_guessit.py releasename. The script can be found in the tests folder.

  2. Absolute numbering is tied to anime currently. So you might want to mark the show as anime to test.

I'll see what's going on.

johnmaguire commented 5 years ago
root@205113b7a2a4:/app/medusa/tests$ python report_guessit.py "xena.e113.coming.home.ntsc.dvd.dd5.1.x264hi10-lightspeed.mkv"
# guessit: 3.0.3  rebulk: 1.0.0
? xena.e113.coming.home.ntsc.dvd.dd5.1.x264hi10-lightspeed.mkv
: title: xena
  episode: 113
  episode_title: coming home
  other: NTSC
  source: DVD
  audio_codec: AC3
  audio_channels: 5.1
  container: mkv
  type: episode
  parsing_time: 0.0934240818024
root@205113b7a2a4:/app/medusa/tests$ python report_guessit.py "Xena - [02x01] - Orphan of War.mkv"
# guessit: 3.0.3  rebulk: 1.0.0
? Xena - [02x01] - Orphan of War.mkv
: title: Xena
  season: 2
  episode: 1
  episode_title: Orphan of War
  container: mkv
  type: episode
  parsing_time: 0.0795841217041

Both are still reported as "Xena"; one is capitalized and one is lowercase. I'm thinking that the title matching isn't exactly the issue here. Do you know which file contains the logic for trying to map these to shows? I imagine there is some fuzzy logic to the tune of (match name) ~= (show name) && (match episode name) in (show episode names) etc? I'd love to poke around in there to try to figure out why one matches and the other doesn't. Most likely, it is the absolute numbering.

Marking the show as Anime and attempting to post-process the directory again also did not help. Same warnings as before. Does marking it as Anime change anything else about the processing?

medariox commented 5 years ago

The problem is indeed the episode numbering. You can post-process anime with classic numbering (s2e4) and absolute numbering (e122), because either the indexers or Xem (thexem.de) usually provide absolute numbering information. Regular shows, on the other hand, almost never have absolute numbering information, which makes reliable post-processing of them impossible. Your best bet is to either rename the episode manually or avoid the release group. You can also add the absolute numbering information for the show on the indexer (and process the show as Anime), but that will surely take more time and effort.
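To make the difference between the two numbering schemes concrete, here is a minimal sketch of how an absolute episode number could be mapped back to a season/episode pair once per-season episode counts are known. This is not Medusa's implementation; the function name and the episode counts (assumed here to be 24, 22, 22, 22, 22, 22 for Xena: Warrior Princess) are illustrative assumptions. It only shows why the mapping needs data that the indexer usually does not provide for regular shows.

```python
# Assumed per-season episode counts for Xena: Warrior Princess.
# Illustrative values only; a real tool would get these from an indexer.
EPISODES_PER_SEASON = [24, 22, 22, 22, 22, 22]


def absolute_to_season_episode(absolute, episodes_per_season):
    """Map a 1-based absolute episode number to (season, episode).

    Hypothetical helper for illustration; not part of Medusa or guessit.
    """
    if absolute < 1:
        raise ValueError("absolute episode numbers start at 1")
    remaining = absolute
    for season, count in enumerate(episodes_per_season, start=1):
        if remaining <= count:
            return season, remaining
        # Skip past this season and keep counting.
        remaining -= count
    raise ValueError("absolute number is beyond the known episodes")


# e113 from the failing filename, under the assumed counts above.
print(absolute_to_season_episode(113, EPISODES_PER_SEASON))
```

Under these assumed counts, e113 falls on the first episode of season 6, which matches the S06 release directory in the log message.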

p0psicles commented 5 years ago

@mrtimscampi had a good suggestion to separate the absolute numbering from the anime flag. You could try to poke around there. As imo there is not a really good reason to keep absolute numbering limited to anime shows.

p0psicles commented 5 years ago

What do you think @medariox?

johnmaguire commented 5 years ago

@MrTimscampi had a good suggestion to separate the absolute numbering from the anime flag. You could try to poke around there. As imo there is not a really good reason to keep absolute numbering limited to anime shows.

This is kind of what I was thinking. If the issue is indeed that Medusa is simply ignoring the absolute numbering, I'd be happy to add an "Absolute numbering" flag for TV Shows, separate from Anime (what else does the Anime flag do?)

You can also add the absolute numbering information for the show on the indexer (and process the show as Anime), but that will surely take more time and effort.

It sounds like, in addition to Medusa ignoring the absolute numbering, TheTVDB doesn't include absolute numbering for this show (we know this because toggling Anime on didn't work, I assume?). I'm a fan of solving things generally (i.e. for others as well), so I'll look into whether I can send some API requests to the indexer to set the absolute numbers as well.

edit: TVDb's API doesn't support updates :(

medariox commented 5 years ago

I don't think the Anime stuff is really as tightly bound as we think it is. If I had to start somewhere, I'd start here, as this is where we get the EpisodePostProcessingFailedException:
https://github.com/pymedusa/Medusa/blob/918cfe7ce01808499585ffe6c1356fd5ed369f76/medusa/post_processor.py#L1031-L1032

From that we can conclude that self._find_info() is unable to return the series_obj:
https://github.com/pymedusa/Medusa/blob/918cfe7ce01808499585ffe6c1356fd5ed369f76/medusa/post_processor.py#L1029

The question is: why? So from there we need to see what _find_info() is actually doing, and so on and so on, until we get to the root of the behavior. That would be the start of the journey.
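The failure path described here can be modelled with a toy sketch. Everything in it is hypothetical: the function names, the has_absolute_numbering field and the lookup logic are stand-ins for illustration, not Medusa's code. The only things taken from the thread are the shape of the flow (a lookup that fails to return a series object, followed by an exception) and the log message. The reason the lookup fails in this sketch is an assumption based on the explanation earlier in the thread, not a confirmed root cause.

```python
class EpisodePostProcessingFailedException(Exception):
    """Stand-in named after the exception in the thread (not Medusa's class)."""


def find_info(parsed, known_shows):
    """Hypothetical lookup returning (series, season, episode) or Nones."""
    series = known_shows.get(parsed["title"].lower())
    if series is None:
        return None, None, None
    if "season" not in parsed and not series["has_absolute_numbering"]:
        # Assumption: an absolute-only number cannot be resolved
        # when the show carries no absolute numbering data.
        return None, None, None
    return series["name"], parsed.get("season"), parsed["episode"]


def process(parsed, known_shows):
    """Raise when no series is found, mirroring the flow described above."""
    series, season, episode = find_info(parsed, known_shows)
    if not series:
        raise EpisodePostProcessingFailedException(
            "This show isn't in your list, you need to add it "
            "before post-processing an episode"
        )
    return series, season, episode


# Toy show list keyed by lowercase title (hypothetical structure).
shows = {
    "xena": {"name": "Xena: Warrior Princess", "has_absolute_numbering": False}
}

print(process({"title": "Xena", "season": 2, "episode": 1}, shows))
try:
    process({"title": "xena", "episode": 113}, shows)
except EpisodePostProcessingFailedException as error:
    print("failed:", error)
```

In this toy model both titles resolve to the same show, yet the absolute-only file still ends in the "not in your list" error, which is why tracing what the real _find_info() returns is a reasonable first debugging step.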