guessit-io / guessit

GuessIt is a python library that extracts as much information as possible from a video filename.
https://guessit-io.github.io/guessit
GNU Lesser General Public License v3.0

Not working with latest 'regex' release #633

Closed gazpachoking closed 4 years ago

gazpachoking commented 4 years ago

Seems there is an issue when the latest release of the 'regex' module is installed. There is a ticket on that repo about the change as well. It would also be nice if there were a way to disable the use of that module other than setting the env variable.

https://bitbucket.org/mrabarnett/mrab-regex/issues/357/new-exception-valueerror-unused-keyword

Here's a traceback of what's happening:

flexget/tests/test_movieparser.py:3: in <module>
    from flexget.components.parsing.parsers.parser_guessit import ParserGuessit
flexget/components/parsing/parsers/parser_guessit.py:46: in <module>
    guessit_api.configure(options={}, rules_builder=rules_builder, force=True)
/opt/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/guessit/api.py:163: in configure
    self.rebulk = rules_builder(advanced_config)
flexget/components/parsing/parsers/parser_guessit.py:40: in rules_builder
    rebulk = rebulk_builder(config)
/opt/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/guessit/rules/__init__.py:56: in rebulk_builder
    rebulk.rebulk(episodes(_config('episodes')))
/opt/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/guessit/rules/properties/episodes.py:175: in episodes
    build_or_pattern(episode_markers + disc_markers, name='episodeMarker') + r'@?(?P<episode>\d+)')\
/opt/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/rebulk/builder.py:179: in regex
    return self.pattern(self.build_re(*pattern, **kwargs))
/opt/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/rebulk/builder.py:107: in build_re
    return RePattern(*pattern, **kwargs)
/opt/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/rebulk/pattern.py:441: in __init__
    pattern = call(re.compile, pattern, **self._kwargs)
/opt/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/rebulk/loose.py:60: in call
    return function(*call_args, **call_kwargs)
/opt/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/regex/regex.py:347: in compile
    return _compile(pattern, flags, kwargs)
/opt/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/regex/regex.py:584: in _compile
    raise ValueError('unused keyword argument {!a}'.format(any_one))
E   ValueError: unused keyword argument 'private_names'

jkwill87 commented 4 years ago

Yeah, this is a problem inherited from rebulk, which uses regex as a drop-in replacement for the built-in re package if it is installed. Although regex is not a requirement for guessit, it is installed by many popular packages, e.g. black. As a result of a recent API change in regex, guessit's use of rebulk breaks whenever regex is installed.
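To make the drop-in behaviour concrete, here is a minimal sketch of how such a selection could look. It is not rebulk's actual source: the function name `select_re_module` is hypothetical, and only the `REGEX_DISABLED` variable with the value `"1"` comes from this thread.

```python
import importlib
import os


def select_re_module(env=os.environ):
    """Return (module, using_regex) for the regular expression backend.

    Hypothetical helper: prefers the third-party 'regex' package when it
    is importable, unless REGEX_DISABLED is set to "1".
    """
    if env.get("REGEX_DISABLED") == "1":
        # Explicit opt-out: always use the standard library module.
        return importlib.import_module("re"), False
    try:
        # 'regex' is optional; it is often present only because another
        # package in the environment depends on it.
        return importlib.import_module("regex"), True
    except ImportError:
        return importlib.import_module("re"), False


module, using_regex = select_re_module({"REGEX_DISABLED": "1"})
print(module.__name__, using_regex)
```

Because the choice depends on what happens to be installed, the same guessit version can behave differently in two environments, which matches the reports above.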

Here is another exception report:

===================== Guessit Exception Report =====================
version=3.1.0
string=/Users/jkwill87/Sync/Development/mnamer/demo/Ninja Turtles (1990).mkv
options={'type': None}
--------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/guessit/api.py", line 192, in guessit
    config = self.configure(options, sanitize_options=False)
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/guessit/api.py", line 163, in configure
    self.rebulk = rules_builder(advanced_config)
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/guessit/rules/__init__.py", line 56, in rebulk_builder
    rebulk.rebulk(episodes(_config('episodes')))
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/guessit/rules/properties/episodes.py", line 168, in episodes
    rebulk.chain(
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/rebulk/builder.py", line 179, in regex
    return self.pattern(self.build_re(*pattern, **kwargs))
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/rebulk/builder.py", line 107, in build_re
    return RePattern(*pattern, **kwargs)
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/rebulk/pattern.py", line 441, in __init__
    pattern = call(re.compile, pattern, **self._kwargs)
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/rebulk/loose.py", line 60, in call
    return function(*call_args, **call_kwargs)
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/regex/regex.py", line 348, in compile
    return _compile(pattern, flags, ignore_unused, kwargs)
  File "/Users/jkwill87/Sync/Development/mnamer/venv/lib/python3.8/site-packages/regex/regex.py", line 585, in _compile
    raise ValueError('unused keyword argument {!a}'.format(any_one))
ValueError: unused keyword argument 'tags'
--------------------------------------------------------------------
Please report at https://github.com/guessit-io/guessit/issues.
====================================================================

Ideal upstream fixes would be for rebulk either to offer an option other than an environment variable for disabling the regex package, or to pass the ignore_unused kwarg when calling re.compile.
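A rough sketch of the second idea follows. The helper name `safe_compile` is hypothetical and this is not the fix that rebulk shipped. It assumes that `regex.compile` accepts the `ignore_unused` argument visible in the traceback above, and that the standard library `re.compile` takes only a pattern and flags.

```python
import re as stdlib_re


def safe_compile(re_module, pattern, flags=0, **kwargs):
    """Compile pattern with re_module, tolerating extra keyword arguments.

    Hypothetical helper: kwargs stands for metadata such as 'tags' or
    'private_names' that a caller might forward by accident.
    """
    if re_module.__name__ == "regex":
        # Assumption: ignore_unused=True tells regex not to raise
        # "unused keyword argument" for keywords the pattern never uses.
        return re_module.compile(pattern, flags, ignore_unused=True, **kwargs)
    # The standard library compile() has no use for the extra keywords,
    # so they are simply dropped.
    return re_module.compile(pattern, flags)


compiled = safe_compile(
    stdlib_re, r"(?P<episode>\d+)", tags=["x"], private_names=["y"]
)
print(compiled.match("42").group("episode"))
```

Either branch keeps the caller from having to know which backend is active, which is the property the environment variable workaround lacks.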

As a quick fix, you can add from os import environ; environ["REGEX_DISABLED"] = "1" before importing guessit. For example, as done here.
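Spelled out as a script, the workaround might look like the sketch below. The ordering is the important part: the variable has to be set before guessit (and therefore rebulk) is imported for the first time. The `try`/`except` is only there so the snippet still runs where guessit is not installed, and the filename is taken from the exception report above.

```python
import os

# Must happen before the first import of guessit, because the backend
# is chosen at import time.
os.environ["REGEX_DISABLED"] = "1"

try:
    from guessit import guessit
except ImportError:
    guessit = None  # guessit is not installed in this environment

if guessit is not None:
    # With the variable set, rebulk should fall back to the built-in re.
    print(guessit("Ninja Turtles (1990).mkv"))
```

If another module has already imported guessit earlier in the process, setting the variable at this point has no effect.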

gazpachoking commented 4 years ago

I'd prefer the ultimate answer to not be an environment variable. Having to set that from the importing package before the import happens isn't a very nice interface. (But at the moment, we are also setting that env variable.)

JohnVillalovos commented 4 years ago

So what is the environment variable for those of us who would like to work around this issue?

Never mind, I see:

REGEX_DISABLED=1

Toilal commented 4 years ago

Fixed with rebulk 2.0.1, just released.