wagnerrp / pytmdb3

Python interface to TheMovieDB.org v3 API
BSD 3-Clause "New" or "Revised" License

Library broken with non-ascii country names in alternate titles #40

Open gazpachoking opened 10 years ago

gazpachoking commented 10 years ago

Below is a traceback from our software when looking up a movie that has an alternate title with a non-ASCII country name. The code calls str(country), which fails for any non-ASCII country name. If country codes are meant to be ASCII only, the library should throw out the invalid result. If not, it should always handle the value as unicode and never as bytes.
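
The failure mode described above can be sketched in isolation. This is a Python 3 approximation, not the library's code: in Python 2, calling str() on a unicode object implicitly encodes it with the ASCII codec, which is mimicked here by an explicit `.encode("ascii")`. The helper names `to_ascii_bytes` and `same_country` are hypothetical, and the sample value is the country string quoted later in this thread.

```python
# Sketch only: approximates Python 2's str(unicode_value) under Python 3.
country = "Србија"  # non-ASCII country name quoted later in this thread


def to_ascii_bytes(value):
    """Mimic the implicit ASCII encoding Python 2's str() applies to unicode."""
    return value.encode("ascii")


try:
    to_ascii_bytes(country)
    failed = False
    error = None
except UnicodeEncodeError as exc:
    # Same exception class as in the traceback: no ASCII representation exists.
    failed = True
    error = exc


def same_country(a, b):
    """Compare the values as text, so no encoding step can fail."""
    return a == b
```

Comparing the text values directly, as `same_country` does, avoids the encoding step entirely; whether that suits the library depends on whether non-ASCII country values should be accepted at all.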

The movie causing this traceback is Rocky (tt0075148).

Traceback (most recent call last):
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\task.py", line 420, in __run_plugin
    return method(*args, **kwargs)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\event.py", line 21, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\plugins\output\dump.py", line 83, in on_task_output
    dump(undecided, task.options.debug, eval_lazy, trace)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\plugins\output\dump.py", line 35, in dump
    value = entry[field]
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\entry.py", line 270, in __getitem__
    return result()
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\entry.py", line 43, in __call__
    result = func(self.entry, self.field)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\plugins\metainfo\tmdb_lookup.py", line 59, in lazy_loader
    imdb_id=imdb_id)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\utils\database.py", line 25, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\plugins\api_tmdb.py", line 289, in lookup
    ApiTmdb.get_movie_details(movie, session)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\plugins\api_tmdb.py", line 331, in get_movie_details
    movie.update_from_object(result)
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\flexget\plugins\api_tmdb.py", line 121, in update_from_object
    if len(update_object.alternate_titles) > 0:
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\util.py", line 152, in __get__
    self.poller.__get__(inst, owner)()
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\util.py", line 80, in __call__
    self.apply(req.readJSON())
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\util.py", line 89, in apply
    setattr(self.inst, v, data[k])
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\util.py", line 223, in __set__
    data.sort()
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\tmdb_api.py", line 282, in __lt__
    return (self.country == self._locale.country) \
  File "C:\Users\chase.sterling\PycharmProjects\Flexget\lib\site-packages\tmdb3\locales.py", line 44, in __eq__
    return (id(self) == id(other)) or (str(self) == str(other))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)

wagnerrp commented 10 years ago

All country codes are supposed to be ISO 3166-1 alpha-2 codes, meaning two uppercase ASCII letters, so "Србија" is invalid data in the TMDB database. I'll add something in there to discard the entry with a warning, similar to how invalid dates are handled.
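
A minimal sketch of the "discard with a warning" approach, assuming alternate titles arrive as dictionaries with the country under an `iso_3166_1` key. The function names and the sample data are hypothetical, and this is not the fix that went into the library. It only checks the shape of the code (two uppercase ASCII letters), not whether the code is actually assigned in ISO 3166-1.

```python
import re
import warnings

# Two uppercase ASCII letters: the shape of an ISO 3166-1 alpha-2 code.
_ALPHA2_SHAPE = re.compile(r"[A-Z]{2}")


def is_valid_country_code(code):
    """Return True if code looks like an ISO 3166-1 alpha-2 code."""
    return isinstance(code, str) and _ALPHA2_SHAPE.fullmatch(code) is not None


def filter_alternate_titles(titles):
    """Keep entries with a well-formed country code; warn about the rest."""
    valid = []
    for entry in titles:
        code = entry.get("iso_3166_1")  # assumed key name
        if is_valid_country_code(code):
            valid.append(entry)
        else:
            warnings.warn(
                "Discarding alternate title with invalid country code: %r" % (code,)
            )
    return valid
```

With sample input such as `[{"iso_3166_1": "US", "title": "Rocky"}, {"iso_3166_1": "Србија", "title": "Rocky"}]`, only the first entry is kept and one warning is issued for the second, so the later sort never sees the invalid value.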

gazpachoking commented 10 years ago

Sounds good, thanks!