pymedusa / Medusa

Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic.
https://pymedusa.com
GNU General Public License v3.0

Guessit calling rebulk library with dependency on regex, which is broken #7743

Closed garnercx closed 4 years ago

garnercx commented 4 years ago

Describe the bug

regex, a replacement for the re module, is being called by rebulk, which is used by guessit. The behaviour of regex has recently changed, and it now breaks rebulk/guessit.

To Reproduce

Steps to reproduce the behavior:

  1. Postprocess a file (either manually or wait for POSTPROCESSOR to run)
  2. If you have regex installed, POSTPROCESSOR will fail
  3. The error message is:

POSTPROCESSOR :: Exception generated: An internal error has occured in guessit.

More logs below

Source

This bug report for guessit explains the issue.

This is the original bug report for regex

Troubleshooting

I tried the tactical fix of altering rebulk to use re instead of regex with

`environ["REGEX_DISABLED"] = "1"  # prevents rebulk from using regex package`

...and also tried setting the environment variable first: `export REGEX_DISABLED=1`

Neither worked.
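
A possible factor with either attempt is ordering: a switch of this kind is usually read once, when the library is first imported, so the variable has to be in the process environment before that import happens. The sketch below is not rebulk's code. `pick_regex_backend` is a hypothetical helper that mimics an import-time check, and the set of accepted values ("1", "true", "True", "Y") is an assumption made for illustration.

```python
import importlib
import os


def pick_regex_backend(environ=os.environ):
    """Return the name of the module an import-time switch would choose."""
    # Assumption: the switch treats these values as "disabled".
    if environ.get("REGEX_DISABLED") in ("1", "true", "True", "Y"):
        return "re"
    try:
        # Fall back to the standard library if the optional package is absent.
        importlib.import_module("regex")
    except ImportError:
        return "re"
    return "regex"


# Set the variable before the consuming library is first imported;
# a module-level choice that has already been made will not change afterwards.
os.environ["REGEX_DISABLED"] = "1"
print(pick_regex_backend())  # prints "re"
```

If the application is started by a service manager or launcher, a variable exported in an interactive shell may never reach that process, which would give the same symptom.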

Medusa

Debug logs (at least 50 lines):

2020-02-11 19:14:26 ERROR   Thread-10 :: [ee1b898] Exception generated: An internal error has occured in guessit.
===================== Guessit Exception Report =====================
version=3.1.0
string=/TV Shows/Comedy/Inside No. 9/Inside No. 9 - S02E01 - La Couchette - 720p BluRay - 2015_3_26
options={'allowed_languages': ['ca', 'cs', 'de', 'en', 'es', 'fr', 'he', 'hi', 'hu', 'it', 'ja', 'ko', 'mul', 'nl', 'no', 'pl', 'pt', 'ro', 'ru', 'sv', 'te', 'uk', 'und', 'jp'], 'expected_group': ['TV2LAX9', 'DHD', '20-40', 'E7'], 'expected_title': ['OSS 117', 'This is Us', 'Ulysses 31', 'Aqua Unit Patrol Squad 1', 'Inside No. 9', '10 Days That Unexpectedly Changed America', '500 Nations', '60 Minutes', '4 Corners', "James May's 20th Century", "Jamie's 30 Minute Meals", 'Formula 1: Drive to Survive'], 'allowed_countries': ['au', 'gb', 'us'], 'advanced_config': {'website': {'prefixes': ['from'], 'safe_tlds': ['com', 'net', 'org'], 'safe_prefixes': ['co', 'com', 'net', 'org'], 'safe_subdomains': ['www']}, 'container': {'info': ['nfo'], 'nzb': ['nzb'], 'subtitles': ['srt', 'idx', 'sub', 'ssa', 'ass'], 'videos': ['3g2', '3gp', '3gp2', 'asf', 'avi', 'divx', 'flv', 'iso', 'm4v', 'mk2', 'mk3d', 'mka', 'mkv', 'mov', 'mp4', 'mp4a', 'mpeg', 'mpg', 'ogg', 'ogm', 'ogv', 'qt', 'ra', 'ram', 'rm', 'ts', 'vob', 'wav', 'webm', 'wma', 'wmv'], 'torrent': ['torrent']}, 'audio_codec': {'audio_channels': {'7.1': ['7ch', '8ch', 're:(7[\\W_][01](?:ch)?)(?=[^\\d]|$)'], '5.1': ['5ch', '6ch', 're:(5[\\W_][01](?:ch)?)(?=[^\\d]|$)', 're:(6[\\W_]0(?:ch)?)(?=[^\\d]|$)'], '2.0': ['2ch', 'stereo', 're:(2[\\W_]0(?:ch)?)(?=[^\\d]|$)'], '1.0': ['1ch', 'mono']}}, 'language': {'language_prefixes': ['true'], 'subtitle_prefixes': ['st', 'vost', 'subforced', 'fansub', 'hardsub', 'legenda', 'legendas', 'legendado', 'subtitulado', 'soft', 'subtitles'], 'subtitle_affixes': ['sub', 'subs', 'esub', 'esubs', 'subbed', 'custom subbed', 'custom subs', 'custom sub', 'customsubbed', 'customsubs', 'customsub', 'soft subtitles', 'soft subs'], 'language_affixes': ['dublado', 'dubbed', 'dub'], 'synonyms': {'fra': ['fran\xc3\xa7ais', 'vf', 'vff', 'vfi', 'vfq'], 'ell': ['gr', 'greek'], 'jpn': ['jp'], 'por_BR': ['po', 'pb', 'pob', 'ptbr', 'br', 'brazilian'], 'hrv': ['scr'], 'swe': ['se'], 'ukr': ['ua'], 'cat': 
['catal\xc3\xa0', 'castellano', 'espanol castellano', 'espa\xc3\xb1ol castellano'], 'ces': ['cz'], 'spa': ['esp', 'espa\xc3\xb1ol', 'espanol'], 'mul': ['multi', 'dl'], 'deu_CH': ['swissgerman', 'swiss german'], 'nld_BE': ['flemish'], 'zho': ['cn']}, 'weak_affixes': ['v', 'audio', 'true'], 'subtitle_suffixes': ['subforced', 'fansub', 'hardsub'], 'language_suffixes': ['audio']}, 'common_words': ['ca', 'cat', 'de', 'he', 'it', 'no', 'por', 'rum', 'se', 'st', 'sub'], 'screen_size': {'frame_rates': ['23.976', '24', '25', '29.970', '30', '48', '50', '60', '120'], 'max_ar': 1.898, 'min_ar': 1.333, 'progressive': ['360', '480', '540', '576', '900', '1080', '368', '720', '1440', '2160', '4320'], 'interlaced': ['360', '480', '576', '900', '1080']}, 'episodes': {'season_markers': ['s'], 'range_separators': ['-', '~', 'to', 'a'], 'episode_max_range': 100, 'episode_markers': ['xe', 'ex', 'ep', 'e', 'x'], 'season_max_range': 100, 'disc_markers': ['d'], 'season_words': ['season', 'saison', 'seizoen', 'seasons', 'saisons', 'tem', 'temp', 'temporada', 'temporadas', 'stagione'], 'max_range_gap': 1, 'discrete_separators': ['+', '&', 'and', 'et'], 'season_ep_markers': ['x'], 'of_words': ['of', 'sur'], 'all_words': ['All'], 'episode_words': ['episode', 'episodes', 'eps', 'ep', 'episodio', 'episodios', 'capitulo', 'capitulos']}, 'part': {'prefixes': ['pt', 'part']}, 'groups': {'starting': '([{', 'ending': ')]}'}, 'country': {'synonyms': {'BR': ['brazilian', 'bra'], 'CA': ['qu\xc3\xa9bec', 'quebec', 'qc'], 'MX': ['Latinoam\xc3\xa9rica', 'latin america'], 'ES': ['espa\xc3\xb1a'], 'GB': ['UK']}}, 'streaming_service': {'WatchMe': 'WME', 'TVING': 'TVING', 'ESPN': 'ESPN', 'Doc Club': 'DOCC', 'CNBC': 'CNBC', 'Syfy': 'SYFY', 'iTunes': 'iTunes', 'Netflix': ['NF', 'Netflix'], 'Spike': 'SPIK', 'Playstation Network': 'PSN', 'SBS (AU)': 'SBS', 'SeeSo': ['SESO', 'SeeSo'], 'AOL': 'AOL', 'GloboSat Play': 'GLOB', 'Hulu': 'HULU', 'Crunchy Roll': ['CR', 're:Crunchy-?Roll'], 'CuriosityStream': 'CUR', 
'Nickelodeon': ['NICK', 'Nickelodeon'], "America's Test Kitchen": 'ATK', 'Digiturk Diledigin Yerde': 'DDY', 'DC Universe': 'DCU', 'Xbox Video': 'XBOX', 'BravoTV': 'BRAV', 'Canal+': 'CNLP', 'DIY Network': 'DIY', 'National Geographic': ['NATG', 're:National-?Geographic'], 'A&E': ['AE', 'A&E'], 'DPlay': 'DPLY', 'Crackle': 'CRKL', 'Freeform': 'FREE', 'Global': 'GLBL', 'Fox': 'FOX', 'IFC': 'IFC', 'ZDF': 'ZDF', 'Cinemax': 'CMAX', 'Adult Swim': ['AS', 're:Adult-?Swim'], 'Discovery': ['DISC', 'Discovery'], 'TubiTV': 'TUBI', 'AMC': 'AMC', 'Al Jazeera English': 'AJAZ', 'E!': 'ETV', 'Norsk Rikskringkasting': 'NRK', 'Comedians in Cars Getting Coffee': 'CCGC', 'SwearNet': 'SWER', 'Investigation Discovery': 'ID', 'PBS': 'PBS', 'ABC Australia': 'AUBC', 'UKTV': 'UKTV', 'NFL': 'NFL', 'NBA TV': ['NBA', 're:NBA-?TV'], 'AnimeLab': 'ANLB', 'Family Jr': 'FJR', 'CBC': 'CBC', 'Hallmark': 'HLMK', 'HBO Go': ['HBO', 're:HBO-?Go'], 'TBS': 'TBS', 'Motor Trend OnDemand': 'MTOD', 'NHL GameCenter': 'GC', 'Lifetime': 'LIFE', 'Daisuki': 'DSKI', 'CTV': 'CTV', 'MSNBC': 'MNBC', 'CHRGD': 'CHGD', 'Disney': ['DSNY', 'Disney'], 'Starz': 'STZ', 'Sveriges Television': 'SVT', 'Deadhouse Films': 'DHF', 'WWE Network': 'WWEN', 'Animal Planet': 'ANPL', 'Yahoo': 'YHOO', 'Country Music Television': 'CMT', 'Pluzz': 'PLUZ', 'Amazon Prime': ['AMZN', 'Amazon', 're:Amazon-?Prime'], 'TV Land': ['TVL', 're:TV-?Land'], 'Cartoon Network': 'CN', 'Stan': 'STAN', 'Comedy Central': ['CC', 're:Comedy-?Central'], 'PBS Kids': 'PBSK', 'Vimeo': 'VMEO', 'El Trece': 'ETTV', 'TLC': 'TLC', 'NBC': 'NBC', 'Sportsnet': 'SNET', 'USA Network': 'USAN', 'Velocity': 'VLCT', 'History': ['HIST', 'History'], 'Family': 'FAM', 'NFL Now': 'NFLN', 'Sprout': 'SPRT', 'UFC': 'UFC', 'FYI Network': 'FYI', 'Univision': 'UNIV', 'ITV': 'ITV', 'CSpan': 'CSPN', 'Channel 4': '4OD', 'CWSeed': 'CWS', 'VRV': 'VRV', 'The CW': ['CW', 're:The-?CW'], 'TV3 Ireland': 'TV3', 'RTE One': 'RTE', 'Esquire': 'ESQ', 'Viceland': 'VICE', 'CBS': 'CBS', 'Spike TV': ['SPKE', 
're:Spike-?TV'], 'DramaFever': ['DF', 'DramaFever'], 'MBC': ['MBC', 'MBCVOD'], 'ePix': 'EPIX', 'ABC': 'AMBC', 'Shomi': 'SHMI', 'HGTV': 'HGTV', 'MTV': 'MTV', 'YouTube Red': 'RED', 'TV4 Sweeden': 'TV4', 'W Network': 'WNET', 'OnDemandKorea': ['ODK', 'OnDemandKorea'], 'Food Network': 'FOOD', 'Knowledge Network': 'KNOW', 'VH1': 'VH1', 'TFou': 'TFOU', 'Viki': 'VIKI', 'ARD': 'ARD', 'BBC iPlayer': ['iP', 're:BBC-?iPlayer']}, 'release_group': {'forbidden_names': ['bonus', 'by', 'for', 'par', 'pour', 'rip'], 'ignored_seps': '[]{}()'}}, 'episode_prefer_number': False, 'show_type': 'normal', 'type': 'episode', 'implicit': True}
--------------------------------------------------------------------
Traceback (most recent call last):
  File "/Applications/Medusa/ext/guessit/api.py", line 210, in guessit
    matches = self.rebulk.matches(string, options)
  File "/Applications/Medusa/ext/rebulk/rebulk.py", line 115, in matches
  File "/Applications/Medusa/ext/rebulk/rebulk.py", line 146, in _execute_rules
    rules = self.effective_rules(context)
  File "/Applications/Medusa/ext/rebulk/rules.py", line 316, in execute_all_rules
    when_response = execute_rule(rule, matches, context)
  File "/Applications/Medusa/ext/rebulk/rules.py", line 341, in execute_rule
    rule.then(matches, when_response, context)
  File "/Applications/Medusa/ext/rebulk/rules.py", line 122, in then
    cons.then(matches, next(iterator), context)
  File "/Applications/Medusa/ext/rebulk/rules.py", line 140, in then
    matches.remove(match)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/_abcoll.py", line 686, in remove
    del self[self.index(value)]
  File "/Applications/Medusa/ext/rebulk/match.py", line 569, in __delitem__
    self._remove_match(match)
  File "/Applications/Medusa/ext/rebulk/match.py", line 137, in _remove_match
    _BaseMatches._base_remove(self._tag_dict[tag], match)
ValueError: list.remove(x): x not in list
--------------------------------------------------------------------
Please report at https://github.com/guessit-io/guessit/issues.
====================================================================
Traceback (most recent call last):
  File "/Applications/Medusa/medusa/**********/web/core/base.py", line 251, in async_call
    result = function(**kwargs)
  File "/Applications/Medusa/medusa/**********/web/home/post_process.py", line 61, in processEpisode
    ignore_subs=argToBool(ignore_subs)
  File "/Applications/Medusa/medusa/process_tv.py", line 65, in run
    return ProcessResult(path, process_method).process(force=force, **kwargs)
  File "/Applications/Medusa/medusa/process_tv.py", line 188, in process
    ignore_subs=ignore_subs)
  File "/Applications/Medusa/medusa/process_tv.py", line 378, in process_files
    self.process_media(path, self.video_files, force, is_priority, ignore_subs)
  File "/Applications/Medusa/medusa/process_tv.py", line 599, in process_media
    self.result = processor.process()
  File "/Applications/Medusa/medusa/post_processor.py", line 1275, in process
    notifiers.notify_download(ep_obj)
  File "/Applications/Medusa/medusa/notifiers/__init__.py", line 103, in notify_download
    n.notify_download(ep_obj)
  File "/Applications/Medusa/medusa/notifiers/prowl.py", line 46, in notify_download
    ep_name = ep_obj.pretty_name_with_quality()
  File "/Applications/Medusa/medusa/tv/episode.py", line 1427, in pretty_name_with_quality
    return self._format_pattern('%SN - %Sx%0E - %EN - %QN')
  File "/Applications/Medusa/medusa/tv/episode.py", line 1631, in _format_pattern
    replace_map = self.__replace_map()
  File "/Applications/Medusa/medusa/tv/episode.py", line 1517, in __replace_map
    rel_grp['location'] = release_group(self.series, self.location)
  File "/Applications/Medusa/medusa/tv/episode.py", line 1497, in release_group
    parse_result = NameParser(series=series, naming_pattern=True).parse(name)
  File "/Applications/Medusa/medusa/name_parser/parser.py", line 388, in parse
    result = self._parse_string(name)
  File "/Applications/Medusa/medusa/name_parser/parser.py", line 292, in _parse_string
    guess = guessit.guessit(name, dict(show_type=self.show_type))
  File "/Applications/Medusa/medusa/name_parser/guessit_parser.py", line 80, in guessit
    result = default_api.guessit(name, options=final_options)
  File "/Applications/Medusa/ext/guessit/api.py", line 222, in guessit
    raise GuessitException(string, options)
GuessitException: An internal error has occured in guessit.

medariox commented 4 years ago

The regex module is optional and shouldn't be installed. Just remove it. Also this needs to be reported to the rebulk devs.
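
Before removing anything, it may help to confirm that the interpreter running Medusa can actually import regex, and from where. This is a minimal sketch; `find_optional_module` is a hypothetical helper name, and it has to be run with the same Python interpreter that Medusa uses for the result to be meaningful.

```python
import importlib


def find_optional_module(name):
    """Return the file path of an importable module, or None if it is absent."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Built-in modules have no __file__ attribute.
    return getattr(module, "__file__", None) or "<built-in>"


location = find_optional_module("regex")
if location is None:
    print("regex is not installed; the standard re module will be used")
else:
    print("regex is installed at: %s" % location)
```

The reported path shows which site-packages directory holds the package, and therefore which environment it needs to be uninstalled from.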