pymedusa / Medusa

Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic.
https://pymedusa.com
GNU General Public License v3.0

NyaaTorrent does not give back results when using non-zero padded episode numbers #1978

Closed p0psicles closed 7 years ago

p0psicles commented 7 years ago

Before submitting your issue:

Enable debug logging in Medusa settings, reproduce the error (be sure to disable after the bug is fixed)

Branch/Commit: Develop
OS: windows/linux
What you did: Searched for show 'Youjo Senki' episode 2.
What happened: NyaaTorrent gave back no results.
What you expected: Around 5 results.
Logs:

PASTE LOGS HERE

It seems that when searching Nyaa, the show name plus the non-padded absolute episode number is used. For Youjo Senki, that means it searches using the string 'Youjo+Senki+2'. All releases for Youjo Senki use a padded episode number, like: Youjo Senki Saga of Tanya the Evil - 01 [720p][AAC][07B357FF].mkv. There are no wildcards available that we can use for this.
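To make the mismatch concrete, here is a minimal sketch of how the two query strings differ. `build_query` and its `width` parameter are hypothetical and not Medusa's actual code; the sketch only shows zero-padding the absolute episode number before it is joined to the show name.

```python
from urllib.parse import quote_plus


def build_query(show_name, absolute_episode, width=1):
    """Hypothetical helper: join show name and episode number into a search string.

    width is the minimum number of digits; shorter numbers are zero-padded.
    """
    episode = str(absolute_episode).zfill(width)
    return quote_plus('{0} {1}'.format(show_name, episode))


unpadded = build_query('Youjo Senki', 2)         # 'Youjo+Senki+2', the string from this report
padded = build_query('Youjo Senki', 2, width=2)  # 'Youjo+Senki+02', matching the release names
```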

IDerr commented 7 years ago

Hmm, it's searching with 02 for Ao no Exorcist and it's getting snatched, which is weird... And all releases for Ao no Exorcist are padded...

p0psicles commented 7 years ago

Yes, that's by design, as I added a special case for season exceptions. If it's a season exception, it now uses 02. If it's not, it uses the absolute number, which is now '2'. But Nyaa doesn't like that.

IDerr commented 7 years ago

Yeah, not really Nyaa, but it's like a convention to name episodes like that.

p0psicles commented 7 years ago

SickRage has recently also made the change to double digits. Well, they moved to three digits before that; I don't know why you'd want that, but sure. I think we should do the same.

IDerr commented 7 years ago

In fact, the proper way to name it is with 3 digits, but no one respects that.

IDerr commented 7 years ago

It's more like: Showname 01-09, Showname 10-999

p0psicles commented 7 years ago

They made some more changes, like first trying double digits, then falling back to three-digit padding. https://github.com/SickRage/SickRage/commit/9b6c552e9fb496d6df5ad741a4a6e1001d164739 Apparently some providers require that, as some shows are released as 001, 002, 003.

Yeah, I'll implement the same. I think it's a pretty safe change.

IDerr commented 7 years ago

I think trying both (2 and 3 digits), like they do, is a good idea.

h3llrais3r commented 7 years ago

They made some more changes, like first trying double digits, then falling back to three-digit padding. SickRage/SickRage@9b6c552

Yes, I implemented this a few days ago. 😄 The problem is that most series (with fewer than 100 episodes) are labeled with 2 digits (cf. HorribleSubs releases). But in order to still keep the official absolute numbering, I implemented the fallback to 2 digits in case nothing is found with 3 digits. EDIT: of course I mean absolute numbering.
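The fallback described here can be sketched roughly as follows. This is not the SickRage or Medusa implementation: `search_with_padding_fallback`, `fake_search` and `RELEASES` are made up for illustration. Because the thread describes the order both ways (two digits first, or three digits first), the order is left as the `widths` parameter; the default below follows this comment (three digits, then two).

```python
def search_with_padding_fallback(show_name, absolute_episode, search, widths=(3, 2)):
    """Try each padding width in order and return the first non-empty result list.

    `search` is any callable that takes a query string and returns a list.
    """
    for width in widths:
        query = '{0} {1:0{2}d}'.format(show_name, absolute_episode, width)
        results = search(query)
        if results:
            return query, results
    return None, []


# Stand-in for a provider: it only knows one release, named with two digits.
RELEASES = ['Youjo Senki - 02 [720p].mkv']


def fake_search(query):
    name, _, episode = query.rpartition(' ')
    return [r for r in RELEASES if name in r and ' {0} '.format(episode) in r]


# 'Youjo Senki 002' finds nothing, so the two-digit query is tried next.
query, results = search_with_padding_fallback('Youjo Senki', 2, fake_search)
```

Passing `widths=(2, 3)` gives the opposite order, which costs one fewer request for the common case of shows with fewer than 100 episodes.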

duramato commented 7 years ago

Scene rules for anime call for S01E01, not absolute numbering; absolute numbering is something fansub groups made up.

p0psicles commented 7 years ago

Hi h3llrais3r, yes, thanks for that! I was already playing with the idea, as I recently improved the episode numbers used for anime season scene names, for which I also started to use double digits.

h3llrais3r commented 7 years ago

No problem. I found out this weekend that it wasn't working as expected and tried to fix it for as many scenarios as possible. However, anime naming is not always very consistent, so I think it's difficult to get it working in every scenario. I recently discovered another scenario: I tried to download the 720p versions of 'JoJo's Bizarre Adventure (2012)', but it always fails the first time. When I run the search again, it always finds it. The problem is located at https://github.com/SickRage/SickRage/blob/master/sickrage/providers/GenericProvider.py#L211-L212 , so probably in your code at https://github.com/pymedusa/Medusa/blob/master/medusa/providers/generic_provider.py#L248-L249 as far as I can see. The parsed season number does not match the scene number... The second time it finds it, because it retrieves the results from the cache... I'm not sure if we can do anything about it? If you find it, please let me know 😄

p0psicles commented 7 years ago

I'll try it with this branch. We've made a lot of improvements since we separated from SickRage. One of them is using guessit for parsing, although we still need to refine that for anime parsing.

p0psicles commented 7 years ago

Is it a specific season that you had trouble searching for? Seems to work fine here.

h3llrais3r commented 7 years ago

Season 2 and 3. For season one there was a batch. Checking out your branch now... 😄

p0psicles commented 7 years ago

I'll not be available for a while. Let me know if something is not working as expected.

h3llrais3r commented 7 years ago

I think you have the same problem (but a little different). Please retry with an empty cache. I checked out the develop branch, configured only NyaaTorrents as provider, and added 'JoJo's Bizarre Adventure (2012)' as anime with 720p quality. The first forced search of an episode (I took the last aired) gives no results (but it puts all results in the cache). If you then search again (the same episode or another episode of the same season), you get a match (from the cache) and it's downloaded. So it's a bit different, but probably the same issue (I think you are handling the cache a bit differently), located at the same place (see previous comments).

p0psicles commented 7 years ago

That's not really a reliable test, because by removing the cache you also remove all your collected scene_exceptions. It will need to get these from thexem.de again. So when trying without those, it's not finding any.

h3llrais3r commented 7 years ago

I mean I started completely from scratch: with no db and no cache, I added the anime and did the search. Adding the anime normally gets the scene_exceptions from thexem.de. So I mean with no episodes in the cache from previous searches.

h3llrais3r commented 7 years ago

Debug log added. debug-log.txt

p0psicles commented 7 years ago

The problem comes from thetvdb using 3 seasons, where the scene uses 4. Medusa is losing track somewhere while parsing it, as it's parsing it back to "indexer" season 4, while it should parse it to indexer season 3. This will take me some more time to fully grasp. But now that @ratoaq2 is back ;-), we'll have experts on searching, parsing and indexers.

p0psicles commented 7 years ago

Never mind, I'm a dumbass; I didn't have scene numbering enabled. That's why it wasn't using the xem mapping.

p0psicles commented 7 years ago

Back to the test. You shouldn't erase cache.db and then test, because by the time you're doing your first search, chances are high that the cache.db's scene_exceptions table hasn't been populated with the season scene names. And it needs to know that, for example, 'JoJo Bizarre Adventure - Diamond is Unbreakable' matches season 4 of the xem mapping and season 3 of the tvdb indexer.

What you could do is open cache.db and run: delete from 'nyaatorrents' where indexerid = 262954;

And then perform a manual search again. Then clean again and try a forced search. There shouldn't be much difference between those two. The only benefit of a manual search is that you will see all results.

I tried this myself, and I got results first try for: Season 3 Episode 38
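The delete step suggested above can also be scripted with Python's `sqlite3` module. This is only a sketch: the table name `nyaatorrents` and the `indexerid` column come from the statement above, while the `clear_provider_cache` helper and the two-column schema in the demo are made up, since the real cache.db schema isn't shown in this thread.

```python
import sqlite3


def clear_provider_cache(connection, provider_table, indexer_id):
    """Delete cached results for one show from one provider table; return rows removed."""
    # Table names cannot be bound as parameters, so only accept simple identifiers.
    if not provider_table.isidentifier():
        raise ValueError('unexpected table name: {0!r}'.format(provider_table))
    cursor = connection.execute(
        'DELETE FROM "{0}" WHERE indexerid = ?'.format(provider_table),
        (indexer_id,),
    )
    connection.commit()
    return cursor.rowcount


# Demo against an in-memory database with a made-up, minimal schema.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE nyaatorrents (name TEXT, indexerid INTEGER)')
conn.executemany(
    'INSERT INTO nyaatorrents VALUES (?, ?)',
    [('JoJo release', 262954), ('Other show release', 1)],
)
removed = clear_provider_cache(conn, 'nyaatorrents', 262954)
remaining = conn.execute('SELECT COUNT(*) FROM nyaatorrents').fetchone()[0]
```

To try it on a real install, you would connect to the path of your own cache.db instead of `':memory:'`, ideally on a copy of the file.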

p0psicles commented 7 years ago

Good news. I was able to reproduce it. You are absolutely right. Doing a forced search, the first time it does not snatch it. Running it another time does snatch it.

I'll fix it tomorrow.

p0psicles commented 7 years ago

@h3llrais3r, please take a look at this branch: #1997. I think I fixed it, but I needed to touch some old code, so I will need to do some testing first.

h3llrais3r commented 7 years ago

I'll have a look at it right away. I'll keep you posted. 👍

p0psicles commented 7 years ago

I fixed that issue but caused another bug. I'm currently refactoring that method. Will update it later today.

h3llrais3r commented 7 years ago

I also checked in a bit more detail, and there seems to be a mismatch when comparing at https://github.com/pymedusa/Medusa/blob/master/medusa/providers/generic_provider.py#L246-L251

If we have anime and scene_numbering, the parsed result still returns the indexer season and number. In our case this is e.g. season 3 episode 39 (the last aired episode). However, according to the scene numbering, this is season 4 episode 39, so it's ignored during the first scan. The episode is nevertheless stored in the cache as season 3 episode 39, which lets the next scan find a direct match from the cache. So I see 2 solutions:

  1. Fix the compare at https://github.com/pymedusa/Medusa/blob/master/medusa/providers/generic_provider.py#L246-L251 (I'm just wondering: shouldn't we just compare the absolute numbering when is_scene is true, instead of comparing scene_episode and scene_number with parsed_episode and parsed_number?)
  2. Separate the cache from the match, and do the match after everything is added to the cache.

p0psicles commented 7 years ago

I'll have to check that again. I looked at that yesterday, and imo it looked alright. You have to keep in mind that we want to use the scene season/ep values to search, and eventually map these back to the indexer's season/ep. So in the cache tables we'd like to have only the indexer's season/ep. In this case, season 3 episode 39. For JoJo's there shouldn't be any season 4 references in the cache.db.
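As a rough illustration of the mismatch being discussed, the sketch below compares a parsed result against the wanted episode in mixed numbering and then in a single numbering system. The season and episode values come from this thread; the `SCENE_TO_INDEXER_SEASON` dict and `scene_to_indexer` helper are hypothetical and are not how generic_provider.py does it.

```python
# Hypothetical mapping for one show: scene season 4 corresponds to indexer season 3.
SCENE_TO_INDEXER_SEASON = {4: 3}


def scene_to_indexer(season, episode, mapping=SCENE_TO_INDEXER_SEASON):
    """Translate a scene (season, episode) pair to indexer numbering."""
    return mapping.get(season, season), episode


wanted_scene = (4, 39)  # scene numbering of the wanted episode
parsed = (3, 39)        # parse result, already in indexer numbering

# Mixing numbering systems: the result is rejected on the first scan.
mixed_compare = parsed == wanted_scene

# Comparing in indexer numbering on both sides: the result is accepted.
same_system_compare = parsed == scene_to_indexer(*wanted_scene)

# The cache keeps indexer numbering only, so no season 4 row is ever stored.
cache_key = parsed
```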