XVicarious / FlexAniDBSuite

Also known as FADBS; Complete workflow of Flexget plugins for AniDB usage
GNU General Public License v3.0

[Suggestion] Animes Split Season #2

Closed luizoti closed 3 years ago

luizoti commented 5 years ago

Some anime, like Tokyo Ghoul-re, have split seasons, and some fansub groups label the second part "2nd Season". It may be necessary to create a mechanism that identifies such cases.

For example, if the title contains "Tokyo Ghoul - re" and "2nd Season", the search would automatically treat it as the second season. In this case, the second season of Tokyo Ghoul - Re is listed in AniDB as "Tokyo Ghoul: Re (2018)".
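A rough sketch of the kind of detection suggested here, in Python. It only splits an ordinal "Nth Season" suffix off a title; the helper name and the pattern are hypothetical, not part of Flexget or this plugin, and mapping the result to an AniDB entry such as "Tokyo Ghoul: Re (2018)" is left out.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a trailing "2nd Season", "3rd Season", and so on.
SEASON_SUFFIX = re.compile(r"\s+(\d+)(?:st|nd|rd|th) Season$", re.IGNORECASE)


def split_season_suffix(title: str) -> Tuple[str, Optional[int]]:
    """Return (base title, season number), or (title, None) without a suffix."""
    match = SEASON_SUFFIX.search(title)
    if match is None:
        return title, None
    return title[:match.start()], int(match.group(1))


print(split_season_suffix("Tokyo Ghoul - re 2nd Season"))
```

A lookup could then search AniDB for the base title and pick the entry that corresponds to the detected season.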

XVicarious commented 5 years ago

The Flexget series plugin has an exact setting that matches titles exactly, which exists for this purpose. With my current vanilla Flexget setup this hasn't been an issue. I think it even detected the separate series automatically. It probably has to do with how my folder is already organized for AniDB. Either that, or it detected the difference when I imported my AniDB wishlist.

I'll look into this possibility though. If you have logs showing when and where this happens, that would go a long way too.

luizoti commented 5 years ago

I don't think exact would solve this case.

It's basically a matter of how TheTVDB and the fansub groups name the anime. For example:

For fansubs, and even AniDB, the file below is the second season of Tokyo Ghoul - re.

[Erai-raws] Tokyo Ghoul - re 2nd Season - 02 [1080p] .mkv

But on TheTVDB, this same season is actually the fourth season.

https://www.thetvdb.com/series/tokyo-ghoul

This makes Flexget pull its information from the wrong season.

As for your plugin, I believe that when it searches for Tokyo Ghoul - re in AniDB, it will identify the first season, since the second one is listed as Tokyo Ghoul - re (2018).

http://anidb.net/perl-bin/animedb.pl?adb.search=Tokyo+Ghoul+re&show=animelist&do.search=search

(this part about your plugin is speculation)

XVicarious commented 5 years ago

Yeah. This plugin will search AniDB's series, not TheTVDB. If you would be willing to test the plugin on your setup that would be great. If that would be okay with you, I'll let you know when I'm wrapping up the v0.1 release.

luizoti commented 5 years ago

I'll try. Yes, you can let me know.

purposelycryptic commented 5 years ago

I realize this is an older issue at this point, but probably the easiest way to get a precise match for files would be to integrate something like this tiny ed2k-link Calculator (it's just 45 lines of Python) and query AniDB or a local cache. AniDB uses ed2k-links as its unique identifier for individual files; all other hashes aren't uniqueness-enforced, so you may get multiple results. You could also add something like Omnihash, but really, the ed2k hash is all that is needed.

That would also give you what you need to add the files to a user's MyList (using the UDP command MYLISTADD size={int4 size}&ed2k={str ed2khash}), assuming the file has already been added to AniDB (reply = 210 or 310). If it hasn't (reply = 320), as with many just-released episodes people get using RSS, you would likely have to make a call to the CLI version of AVdump2, then repeat.
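As a sketch of that flow in Python: the snippet below only builds the command text and decides what to do with the reply codes mentioned above. The function names and the returned labels are hypothetical. A real UDP request also needs an authenticated session key and actual socket handling, both omitted here.

```python
# Reply codes as described in the comment above.
ADDED = 210
ALREADY_IN_MYLIST = 310
NO_SUCH_FILE = 320


def build_mylistadd(size: int, ed2k: str) -> str:
    """Format the MYLISTADD command body (session key not included)."""
    return f"MYLISTADD size={size}&ed2k={ed2k}"


def next_step(reply_code: int) -> str:
    """Map a reply code to a (hypothetical) follow-up action."""
    if reply_code in (ADDED, ALREADY_IN_MYLIST):
        return "done"
    if reply_code == NO_SUCH_FILE:
        # File unknown to AniDB: dump it (e.g. with AVdump2), then retry later.
        return "dump_and_retry"
    return "unexpected"


print(build_mylistadd(123456789, "a448017aaf21d8525fc10ae87aa6729d"))
```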

Just be careful not to get caught in flood-control and end up banned. I've done this many, many times, as AniDB has very conservative rate limits - as soon as you see your first 555 error, best to stop all communication with AniDB for ~10 minutes or so, since you never know if your user is connecting to AniDB in any other way that might contribute to ban-length. For example, I use an AniDB-based Plex metadata agent, HAMA, (which may be worth looking into for ideas, as it is also python-based) and sometimes it would refresh metadata right as I was testing... this is how I ended up with bans of 2-4 days+ several times before I drastically lowered my connection-rate.

The official UDP Rate Limit is defined as:

Flood Protection: To prevent high server load the UDP API server enforces a strict flood protection policy.

Short Term: A Client MUST NOT send more than 0.5 packets per second (that's one packet every two seconds, not two packets a second!) The server will start to enforce the limit after the first 5 packets have been received.

Long Term: A Client MUST NOT send more than one packet every four seconds over an extended amount of time. An extended amount of time is not defined. Use common sense.

But I was seeing bans using a 2500ms short-term delay and a 4500ms long-term delay, with a switchover to the long-term delay after 1 hour of activity and a return to the short-term delay after 30 minutes of inactivity (not that this project is likely to see that much continuous activity, but...). That mostly disappeared after I switched over to a 3000ms short-term delay and a 5000ms long-term delay, with a switchover to the long-term delay after 30 minutes of activity and a return to the short-term delay after 45 minutes of inactivity. For testing, I'd play it really safe and use a 3500-4000ms short-term delay and a 5500-6000ms long-term delay. At the upper limit, that adds an extra two seconds to the official rate limit, which should make for no bans. You can always try slowly lowering it later, once everything works, until you start seeing bans, then add 750-1000ms to those timings just to be safe.
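The pacing described above could be sketched in Python roughly like this. The class and method names are hypothetical. The defaults are the conservative numbers from this comment: 4 s short-term, 6 s long-term, switch to long-term after 30 minutes of activity, back to short-term after 45 minutes idle, and a 10-minute pause after a 555 reply. Timestamps are passed in explicitly so the logic can be checked without sleeping.

```python
class AniDBThrottle:
    """Tracks when the next UDP packet may be sent (hypothetical helper)."""

    def __init__(self, short_delay=4.0, long_delay=6.0,
                 switch_after=30 * 60, reset_after=45 * 60, ban_pause=10 * 60):
        self.short_delay = short_delay    # seconds between packets at first
        self.long_delay = long_delay      # seconds between packets later on
        self.switch_after = switch_after  # activity time before slowing down
        self.reset_after = reset_after    # idle time before speeding up again
        self.ban_pause = ban_pause        # silence after a 555 reply
        self.active_since = None
        self.last_sent = None
        self.blocked_until = 0.0

    def current_delay(self, now):
        """Delay that applies to the next packet at time `now`."""
        if self.last_sent is None or now - self.last_sent >= self.reset_after:
            return self.short_delay
        if now - self.active_since >= self.switch_after:
            return self.long_delay
        return self.short_delay

    def seconds_to_wait(self, now):
        """How long the caller should sleep before sending."""
        earliest = self.blocked_until
        if self.last_sent is not None:
            earliest = max(earliest, self.last_sent + self.current_delay(now))
        return max(0.0, earliest - now)

    def record_send(self, now):
        """Call right after a packet goes out."""
        if self.last_sent is None or now - self.last_sent >= self.reset_after:
            self.active_since = now  # a new activity period starts
        self.last_sent = now

    def record_ban(self, now):
        """Call when the server answers with a 555 reply."""
        self.blocked_until = now + self.ban_pause
```

In real use the caller would pass time.monotonic() as now, sleep for seconds_to_wait(now), send the packet, and then call record_send.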

Anyway, you mentioned this being an AniDB-only project (which sounds perfectly reasonable to me), but, and you probably know about this already, if you ever need to connect AniDB series with TVDB metadata, ScudLee's anime-lists exist to do just that.

Anyway, sorry for the long post, I've just been excited about your plugin for quite some time:-)

XVicarious commented 5 years ago

Yeah. I've been using this, and the next thing I definitely have to do is MyList support, as a managed list plugin. So far I've been relying on the parser, which sometimes works and sometimes doesn't. I'm actually working on cleaning up what I have so far for an initial release.

purposelycryptic commented 5 years ago

Sounds great :-) Right now I'm somewhat under-utilizing FlexGet, mainly having it pull new releases from my ShanaProject RSS Feed and adding them to Deluge. Then I have the Execute plugin trigger a batch-script to run AVDumpCLI on any new anime files it hasn't seen before (it keeps a log of all the dumps it has performed by default), and the FileBotTool plugin do a temporary move-and-rename into my library. Since FileBot isn't all that reliable for anime series names, groups and episode numbers, I still end up using AniAdd to add the files to my MyList and properly rename them once AniDB has parsed all the dumped MediaInfo, so it's a lot of moving parts altogether, but it mostly works.

Your project seems like it has a lot of potential, and I look forward to the initial release :+1:

luizoti commented 5 years ago

The solution I found was to use regular expressions and the 'manipulate' plugin. The fansub group I download from is Erai-raws, but from what I saw, HorribleSubs uses the same naming.

The expression extracts the relevant parts, and the plugin restructures them so that Flexget can identify the correct season for each episode. That way I can also use the move plugin, which puts the files in the right folders and renames them. After moving the files, the task runs tinyMediaManager as well.

Here is an example of part of my config:

tasks:
  ANIMES1:
    priority: 1
    manipulate:
      - title:
          # [Erai-raws] Yakusoku no Neverland - 12 END [1080p]
          # [Erai-raws] Yakusoku no Neverland - 12 [1080p]
          extract: '(?!.*(?:Season|S\d+|Movie|(.+?) (\d+) - (\d+)))\[(?:.+)?\] (.+?) - (\d+)(?:.*) \[((?:1080|720|480)p)\]'
      - title:
          replace:
            regexp: '(.*) (\d+) (\d*\w*)'
            format: '\1 - S01E\2 - \3'
    regexp:
      reject:
        # [Erai-raws] Boku no Hero Academia S2 - 00 [1080p]
        # [Erai-raws] Boku no Hero Academia S2 - 25 END [1080p]
        - '(?:\[.+?\]) (.+?) (?:S)(\d+) - (\d+)(?:.*) \[((?:1080|720|480)p)\]'
        # [Erai-raws] Shingeki no Kyojin Season 3 - 09 [1080p]
        # [Erai-raws] Shingeki no Kyojin Season 3 - 09 [1080p]
        # [Erai-raws] Shingeki no Kyojin Season 3 - 09 END [1080p]
        # [Erai-raws] Shingeki no Kyojin Season 3 - 09 END [1080p]        
        - '(?:\[.+?\]) (.+?) (?:Season) (\d+) - (\d+)(?:.*) \[((?:1080|720|480)p)\]'
        # [Erai-raws] Gyakuten Saiban - Sono Shinjitsu Igi Ari 2nd Season - 23 END [1080p]
        # [Erai-raws] Gyakuten Saiban - Sono Shinjitsu Igi Ari 2nd Season - 23 [1080p]
        # [Erai-raws] Gyakuten Saiban - Sono Shinjitsu Igi Ari 2nd Season - 23 [1080p]        
        - '\[(?:.+)?\] (.+?) (\d*)(?:nd|rd|th) (?:Season)\s- (\d+)\s(?:.*)\[((?:1080|720|480)p)\]'
        - '(?!.*(?:Season|S\d+|Movie|Lupin|Special|\~))(?:\[.+?\]) (.+?) (\d+) - (\d+)\s(?:.*)\[((?:1080|720|480)p)\]'
        - '(?!.*(?:Season|S\d+|Movie|Lupin|Special|\~))(?:\[.+?\]) (.+?) (II|III|IV|V|VI|VII|VIII|IX|X) - (\d+)\s(?:.*)\[((?:1080|720|480)p)\]'
        - '(?!.*(?:Season|S\d+|Movie|Special|\~))(?:\[.+?\]) (.+?) (II|III|IV|V|VI|VII|VIII|IX|X) - (?:.+?) - (\d+)\s(?:.*)\[((?:1080|720|480)p)\]'
    template:
      - ANIMES
      - push1

templates:
  ANIMES:
    require_field: 
      - trakt_series_name
      - trakt_season
    trakt_lookup: yes      
    metainfo_series: yes        
    rss:
      url: 'https://nyaa.si/?page=rss&q=Erai-raws+1080p+Multiple+Subtitle&c=0_0&f=0'
      # all_entries: no
    regexp:
      reject:
        - '(~|-|–) (\d*)(\s(~|-|–)\s|(~|-|–))(\d*)'
        - '(Vol|Volume)'
        - '~\s\s-'
    series:
      - Bermuda
      - Manaria Friends
      - Kaguya
      - Doukyonin wa Hiza
      - Tate no Yuusha no Nariagari                             
      - Boogiepop wa Warawanai
      - Girly Air Force
      - Karakuri Circus
      - Sora yori mo Tooi Basho
      - The Promised Neverland:
          alternate_name: Yakusoku no Neverland
      - Dororo
      # - JoJo no Kimyou 
      - Tensei shitara Slime Datta Ken
      - Akanesasu Shoujo
      - Release the Spyce
      - Zombieland Saga
      - Goblin Slayer
      - Radiant
      - Seishun Buta
      - Grand Blue
      - Hataraku Saibou
      - Banana Fish
      - Planet With
      - HANEBADO!
      - Boku no Hero Academia
      - Shingeki no Kyojin
      - Hunter x Hunter
      - SSSS.GRIDMAN                                                
      - Ueno                                                    
      - Domestic na Kanojo                                      
      - Rinshi                                     
      - Pastel Memories                                         
      - Kouya no Kotobuki Hikoutai                              
      - One Punch Man                                           
      - Isekai Quartet                                          
      - Kimetsu no Yaiba                                        
      - Kono Oto Tomar                                       
      - Fairy Gone                                              
      - Mayonaka no Occult Koumuin                              
      - Carole and Tuesday 
      - Sarazanmai
      - Hitoribocchi no Marumaru Seikatsu  
    transmission:
      host: localhost
      port: 9091
    set:
      path: '/media/luiz/2TB/.FLEXGET/SERIES_TEMP'
    if:
      - "'JoJo' in title":
          set:
            path: "/media/luiz/2TB/MIDIA/SERIES/JoJo's Bizarre Adventure/Season 4"
      - "'Movie' in title":
          set:
            path: '/media/luiz/2TB/.FLEXGET/MOVIE_TEMP'
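
To see what the two manipulate steps in the ANIMES1 task do to a title, here is a small Python sketch that mimics them outside Flexget. It assumes that extract replaces the title with the matched groups joined by spaces and that replace behaves like re.sub, both case-insensitive; that is my reading of the plugin, not something confirmed in this thread. The function names are made up for the example.

```python
import re

# Same patterns as in the config above.
EXTRACT = re.compile(
    r"(?!.*(?:Season|S\d+|Movie|(.+?) (\d+) - (\d+)))"
    r"\[(?:.+)?\] (.+?) - (\d+)(?:.*) \[((?:1080|720|480)p)\]",
    re.IGNORECASE,
)
REPLACE = re.compile(r"(.*) (\d+) (\d*\w*)", re.IGNORECASE)
FORMAT = r"\1 - S01E\2 - \3"


def extract(title: str) -> str:
    """Keep only the captured parts, joined by spaces (assumed behaviour)."""
    match = EXTRACT.search(title)
    if match is None:
        return title  # titles with "Season", "S2", "Movie" etc. are left alone
    return " ".join(group for group in match.groups() if group is not None)


def replace(title: str) -> str:
    """Rewrite 'Name 12 1080p' into 'Name - S01E12 - 1080p'."""
    return REPLACE.sub(FORMAT, title)


title = "[Erai-raws] Yakusoku no Neverland - 12 END [1080p]"
print(extract(title))           # Yakusoku no Neverland 12 1080p
print(replace(extract(title)))  # Yakusoku no Neverland - S01E12 - 1080p
```

The negative lookahead is what keeps the extract step from firing on titles that already carry season information, such as the "2nd Season" releases discussed earlier in this thread.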