Goldenfreddy0703 / Otaku

Repository for Otaku Development
GNU General Public License v3.0

nyaa, database.tvdb_part/season #251

Closed theasguard closed 6 months ago

theasguard commented 6 months ago

Let me know how, or if, it works better. Changes:

These patterns should better match various formats used, such as:

and regex_ep_range This regex matches:

This regex is more flexible and should match most episode range formats used

The last one is in the q1, q2 section. The regular expression pattern |?.+?:([^|]+) matches the values between : and | (or the end of the string), and re.findall returns a list of these values, which we can then assign to q1 and q2.
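As a rough illustration of the q1/q2 extraction described above, the sketch below runs re.findall over a made-up filter string. It assumes the leading pipe in the pattern is escaped (`\|?`), since an unescaped `|?` at the start of a pattern does not compile in Python; the sample string and the helper name `split_filter_values` are hypothetical and not taken from the add-on.

```python
import re

# Assumption: the leading pipe is escaped. The backslash appears to have been
# lost when the comment was rendered.
VALUE_PATTERN = re.compile(r'\|?.+?:([^|]+)')


def split_filter_values(raw):
    """Return every value that sits between a ':' and the next '|' (or the end)."""
    return VALUE_PATTERN.findall(raw)


# Hypothetical input, only meant to show the shape of the result.
q1, q2 = split_filter_values('first:1080p|second:720p')
print(q1, q2)
```

Unpacking into exactly two names only works when the string holds exactly two `key:value` pairs, so real code would probably want to check the length of the list first.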

Goldenfreddy0703 commented 6 months ago

Heyyy there, I'm just about to leave for a road trip but I will go ahead and test this real quick. Thank you very much. @Gujal00 if you don't mind, go ahead and test this as well and tell us what you think. Thanks

Goldenfreddy0703 commented 6 months ago

Hey, so I did some comparing and testing and I'm not too sure whether this is an improvement or not, so we may want Gujal to see what he thinks. If it did improve things, we will merge.

Also, I noticed that season code was in there. If it helps, here is some code to get the season or part of an anime.

season = database.get_tvdb_season(anilist_id)
part = database.get_tvdb_part(anilist_id)

theasguard commented 6 months ago

OK, that's no problem, sounds good. I'll look into some of that code too. I've been writing some Python lately and asked the AI I'm working with whether it could improve the regex for nyaa, so let me know if it actually improved things. I could see the odd difference, but I make so many changes on my own system, lol. With the reversions to the torrent scraping, I figured I'd see if I can help improve the torrents. I'll see what we can do; I know of some odd issues with torrents, but there's a ton to go over.

theasguard commented 6 months ago

season = database.get_tvdb_season(anilist_id)
part = database.get_tvdb_part(anilist_id)

Here's one way to go about it. This is a rough draft, not tested, and I just picked a line. If you'd like, I can send you more code snippets showing how it could be done.

def _get_episode_sources_pack(self, show, anilist_id, episode):
    query = '%s "Batch"|"Complete Series"' % show

    episodes = pickle.loads(database.get_show(anilist_id)['kodi_meta'])['episodes']
    if episodes:
        query += '|"01-{0}"|"01~{0}"|"01 - {0}"|"01 ~ {0}"'.format(episodes)

    season = database.get_tvdb_season(anilist_id)
    if season:
        query += '|"S{0}"|"Season {0}"'.format(season)

    part = database.get_tvdb_part(anilist_id)
    if part:
        query += '|"Part {0}"'.format(part)

    url = '%s?f=0&c=1_2&q=%s&s=seeders&&o=desc' % (self._BASE_URL, urllib_parse.quote_plus(query))
    return self._process_nyaa_backup(url, anilist_id, 2, episode.zfill(2), True)

The version below also includes the season_list information retrieved from the database in the search query, allowing for even more refined searches. This is all written on the spot and untested.

def _get_episode_sources_pack(self, show, anilist_id, episode):
    query = '%s "Batch"|"Complete Series"' % show

    episodes = pickle.loads(database.get_show(anilist_id)['kodi_meta'])['episodes']
    if episodes:
        query += '|"01-{0}"|"01~{0}"|"01 - {0}"|"01 ~ {0}"'.format(episodes)

    season = database.get_tvdb_season(anilist_id)
    if season:
        query += '|"S{0}"|"Season {0}"'.format(season)

    part = database.get_tvdb_part(anilist_id)
    if part:
        query += '|"Part {0}"'.format(part)

    season_list = database.get_season_list(anilist_id)
    if season_list:
        season_list = season_list['season']
        query += '|"%s"' % season_list

    url = '%s?f=0&c=1_2&q=%s&s=seeders&&o=desc' % (self._BASE_URL, urllib_parse.quote_plus(query))
    return self._process_nyaa_backup(url, anilist_id, 2, episode.zfill(2), True)

Goldenfreddy0703 commented 6 months ago

Hey, this looks good, but for Part you may need to add Cour as well, since some anime use Cour instead. Btw, since Gujal has reviewed your code, I will go ahead and test, and then I will merge very soon.

Thank you

theasguard commented 6 months ago

This code appends the season and part information retrieved from the database to the search query, allowing for more precise searching when looking for episode sources. The function also includes the season_list information retrieved from the database, allowing for even more refined searches. The search query will include "Cour" in addition to "Part" if the part information is available in the database.
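To make the resulting search string concrete, here is a minimal standalone sketch of the query building described above. The helper name `build_pack_query` and its signature are hypothetical; only the query fragments mirror the snippets posted in this thread, and the database lookups are replaced by plain arguments.

```python
def build_pack_query(show, episodes=None, season=None, part=None):
    """Build a nyaa search string for batch releases of a show.

    Hypothetical helper: in the add-on these values would come from the
    database calls shown earlier in the thread.
    """
    query = '%s "Batch"|"Complete Series"' % show
    if episodes:
        # Common ways a full-season range is written in release names.
        query += '|"01-{0}"|"01~{0}"|"01 - {0}"|"01 ~ {0}"'.format(episodes)
    if season:
        query += '|"S{0}"|"Season {0}"'.format(season)
    if part:
        # Some releases label a split season "Cour" instead of "Part".
        query += '|"Part {0}"|"Cour {0}"'.format(part)
    return query


print(build_pack_query('Example Show', episodes=12, season=2, part=1))
```

Each optional value only adds alternatives joined with `|`, so a missing season or part simply leaves the query as it was.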

Goldenfreddy0703 commented 6 months ago

OK, so from testing on multiple anime, we are getting three times more torrents with this PR, so I'm going to merge it. I'm also going to try to work on something today, because my consistency settings may need more work: I'm noticing multiple episodes coming back from scraping, so I want to look into how I can match episode numbers with the nyaa scraper.
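One possible way to approach the episode matching mentioned above is to filter scraped titles by episode number after the search. This is only a sketch under assumptions: the function name `title_has_episode` and both patterns are hypothetical, they cover just the "- 05" and "01-12" styles that appear in the queries in this thread, and real release names vary far more than this.

```python
import re

# Hypothetical patterns. RANGE_RE looks for zero-padded ranges such as
# "01-12" or "01 ~ 24"; SINGLE_RE looks for the "- 05" single-episode style.
RANGE_RE = re.compile(r'(?<!\d)(\d{2,3})\s*[-~]\s*(\d{2,3})(?!\d)')
SINGLE_RE = re.compile(r'\s-\s(\d{1,3})(?!\d)')


def title_has_episode(title, episode):
    """Return True if the title names the episode directly or via a range."""
    episode = int(episode)

    # A batch release covers the episode if it falls inside the range.
    batch = RANGE_RE.search(title)
    if batch:
        first, last = int(batch.group(1)), int(batch.group(2))
        if first < last:
            return first <= episode <= last

    # Otherwise fall back to an exact single-episode match.
    single = SINGLE_RE.search(title)
    return bool(single) and int(single.group(1)) == episode


titles = ['[Group] Show - 05 [1080p]', '[Group] Show - 06 [1080p]', '[Group] Show (01-12) [BD]']
print([t for t in titles if title_has_episode(t, '05')])
```

Checking the range before the single-episode pattern matters, because a title like "01 - 12" would otherwise be read as episode 12.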

theasguard commented 6 months ago

Sweet, that's perfect. When I get time I'll also look at your consistency code and see if there is a way to improve on it. I'll keep in touch; I'm away from home for a few days, so it'll all be from my phone, haha.

theasguard commented 6 months ago

Here's what get_episode_sources_backup and get_episode_sources look like with tvdb_part and tvdb_season. I wrote this on my phone, so the indentation may be off for pasting straight into the file. If anyone wants to write it in and test it, that'd be great; otherwise I'll align it later. Thank you.

def _get_episode_sources(self, show, anilist_id, episode, status, rescrape):
    if rescrape:
        return self._get_episode_sources_pack(show, anilist_id, episode)

    try:
        cached_sources, zfill_int = database.getTorrentList(anilist_id)
        if cached_sources:
            return self._process_cached_sources(cached_sources, episode.zfill(zfill_int))
    except ValueError:
        pass

    query = '%s "- %s"' % (show, episode.zfill(2))
    season = database.get_season_list(anilist_id)
    if season:
        season = str(season['season']).zfill(2)
        query += '|"S%sE%s "' % (season, episode.zfill(2))

    part = database.get_tvdb_part(anilist_id)
    if part:
        query += '|"Part {0}"|"Cour {0}"'.format(part)

    season_list = database.get_season_list(anilist_id)
    if season_list:
        season_list = season_list['season']
        query += '|"%s"' % season_list

    url = '%s?f=0&c=1_0&q=%s&s=downloads&o=desc' % (self._BASE_URL, urllib_parse.quote_plus(query))

    if status == 'FINISHED':
        query = '%s "Batch"|"Complete Series"' % show
        episodes = pickle.loads(database.get_show(anilist_id)['kodi_meta'])['episodes']
        if episodes:
            query += '|"01-{0}"|"01~{0}"|"01 - {0}"|"01 ~ {0}"'.format(episodes)

        if season:
            query += '|"S{0}"|"Season {0}"'.format(season)
            query += '|"S%sE%s "' % (season, episode.zfill(2))

        if part:
            query += '|"Part {0}"|"Cour {0}"'.format(part)

        query += '|"- %s"' % (episode.zfill(2))
        url = '%s?f=0&c=1_0&q=%s&s=seeders&&o=desc' % (self._BASE_URL, urllib_parse.quote_plus(query))

    return self._process_nyaa_episodes(url, episode.zfill(2), season)

# status is used below but was missing from the drafted signature; it is added
# here as an optional argument (assumption) so the FINISHED branch can run.
def _get_episode_sources_backup(self, db_query, anilist_id, episode, status=None):
    show = self._get_request('https://kaito-title.firebaseio.com/%s.json' % anilist_id)
    show = json.loads(show)

    if not show:
        return []

    if 'general_title' in show:
        query = show['general_title'].encode('utf-8') if six.PY2 else show['general_title']
        _zfill = show.get('zfill', 2)
        episode = episode.zfill(_zfill)
        query = urllib_parse.quote_plus(query)
        url = '%s?f=0&c=1_0&q=%s&s=downloads&o=desc' % (self._BASE_URL, query)
        return self._process_nyaa_backup(url, anilist_id, _zfill, episode)

    try:
        kodi_meta = pickle.loads(database.get_show(anilist_id)['kodi_meta'])
        kodi_meta['query'] = db_query + '|{}'.format(show['general_title'])
        database.update_kodi_meta(anilist_id, kodi_meta)
    except Exception:
        pass

    query = '%s "- %s"|"Batch"|"Complete Series"' % (show.encode('utf-8') if six.PY2 else show, episode.zfill(2))

    episodes = pickle.loads(database.get_show(anilist_id)['kodi_meta'])['episodes']
    if episodes:
        query += '|"01-{0}"|"01~{0}"|"01 - {0}"|"01 ~ {0}"'.format(episodes)

    season = database.get_tvdb_season(anilist_id)
    if season:
        season = str(season['season']).zfill(2)
        query += '|"S%sE%s "' % (season, episode.zfill(2))

    part = database.get_tvdb_part(anilist_id)
    if part:
        query += '|"Part {0}"|"Cour {0}"'.format(part)

    season_list = database.get_season_list(anilist_id)
    if season_list:
        season_list = season_list['season']
        query += '|"%s"' % season_list

    url = '%s?f=0&c=1_0&q=%s' % (self._BASE_URL, urllib_parse.quote_plus(query))

    if status == 'FINISHED':
        query = '%s "Batch"|"Complete Series"' % show
        episodes = pickle.loads(database.get_show(anilist_id)['kodi_meta'])['episodes']
        if episodes:
            query += '|"01-{0}"|"01~{0}"|"01 - {0}"|"01 ~ {0}"'.format(episodes)

        if season:
            query += '|"S{0}"|"Season {0}"'.format(season)
            query += '|"S%sE%s "' % (season, episode.zfill(2))

        if part:
            query += '|"Part {0}"|"Cour {0}"'.format(part)

        query += '|"- %s"' % (episode.zfill(2))
        url = '%s?f=0&c=1_0&q=%s&s=seeders&&o=desc' % (self._BASE_URL, urllib_parse.quote_plus(query))

    return self._process_nyaa_episodes(url, episode.zfill(2), season)

You can also pass the TVDB season in on a rescrape, like so:

def _get_episode_sources(self, show, anilist_id, episode, status, rescrape):
    if rescrape:
        season = database.get_tvdb_season(anilist_id)
        if season:
            season = str(season['season']).zfill(2)
        return self._get_episode_sources_pack(show, anilist_id, episode, season)

    try:
        cached_sources, zfill_int = database.getTorrentList(anilist_id)
        if cached_sources:
            return self._process_cached_sources(cached_sources, episode.zfill(zfill_int))
    except ValueError:
        pass

    query = '%s "- %s"' % (show, episode.zfill(2))
    season = database.get_season_list(anilist_id)
    if season:
        season = str(season['season']).zfill(2)
        query += '|"S%sE%s "' % (season, episode.zfill(2))

    part = database.get_tvdb_part(anilist_id)
    if part:
        query += '|"Part {0}"|"Cour {0}"'.format(part)

    season_list = database.get_season_list(anilist_id)
    if season_list:
        season_list = season_list['season']
        query += '|"%s"' % season_list

    url = '%s?f=0&c=1_0&q=%s&s=downloads&o=desc' % (self._BASE_URL, urllib_parse.quote_plus(query))

    if status == 'FINISHED':
        query = '%s "Batch"|"Complete Series"' % show
        episodes = pickle.loads(database.get_show(anilist_id)['kodi_meta'])['episodes']
        if episodes:
            query += '|"01-{0}"|"01~{0}"|"01 - {0}"|"01 ~ {0}"'.format(episodes)

        if season:
            query += '|"S{0}"|"Season {0}"'.format(season)
            query += '|"S%sE%s "' % (season, episode.zfill(2))

        if part:
            query += '|"Part {0}"|"Cour {0}"'.format(part)

        query += '|"- %s"' % (episode.zfill(2))
        url = '%s?f=0&c=1_0&q=%s&s=seeders&&o=desc' % (self._BASE_URL, urllib_parse.quote_plus(query))

    return self._process_nyaa_episodes(url, episode.zfill(2), season)

Goldenfreddy0703 commented 6 months ago

Oh hey, about get_episode_sources_backup (the nyaa backup): I think https://kaito-title.firebaseio.com/%s.json unfortunately does not work anymore. In nyaa.py we sometimes have code that's not even being used.

theasguard commented 6 months ago

Ahh okay, I'll go through it some more. I currently have get_episode_sources implemented with the TVDB season/part; it's just not indented or tested, since I wrote it on my phone without a code pad. Haha