jrabbit / riotbot-plugins

Supybot plugin modifications or new plugins for a bot on freenode

Output fails on UTF8 title #1

Open jrabbit opened 11 years ago

jrabbit commented 11 years ago

I suspect this is because the two strings are in different encodings, and trying to .format() them together is a bad idea.
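
To illustrate the suspicion above, here is a minimal sketch of one way to avoid mixing encodings: decode everything to text before calling .format(). It assumes the fetched title arrives as UTF-8 bytes; the helper name `to_text` and the sample values are hypothetical and not taken from the plugin.

```python
# -*- coding: utf-8 -*-
# Sketch only: `to_text` is a hypothetical helper, not part of the plugin.

def to_text(value, encoding="utf-8"):
    """Return `value` as a text string, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        # "replace" substitutes undecodable bytes instead of raising.
        return value.decode(encoding, "replace")
    return value

# Assumed inputs: a title fetched as UTF-8 bytes and a prefix that is already text.
raw_title = u"Abolish the Police \u00ab subMedia".encode("utf-8")
prefix = u"\u30df\u2605"

# Decoding the same bytes as ASCII fails, much like the implicit
# conversion that happens when byte strings and text are mixed.
try:
    raw_title.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Formatting is safe once both sides are text.
message = u"{0}: {1}".format(to_text(prefix), to_text(raw_title))
```

Whether the bot's output path then needs an explicit encode back to UTF-8 bytes depends on how the message is sent, which this sketch does not cover.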

This may be unrelated:


Exception in thread Thread #873 (for snarfing https://nebraskaworker.wordpress.com/2012/08/11/informal-work-groups-stan-weir/):
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 808, in __bootstrap_inner
    self.run()
  File "/usr/lib/pymodules/python2.7/supybot/commands.py", line 80, in run
    super(UrlSnarfThread, self).run()
  File "/usr/lib/python2.7/threading.py", line 761, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/pymodules/python2.7/supybot/commands.py", line 130, in doSnarf
    f(self, irc, msg, match, *L, **kwargs)
  File "/home/jack/Projects/riotbot/plugins/Web/plugin.py", line 99, in titleSnarfer
    parser.feed(text)
  File "/usr/lib/python2.7/HTMLParser.py", line 114, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/HTMLParser.py", line 158, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.7/HTMLParser.py", line 305, in parse_starttag
    attrvalue = self.unescape(attrvalue)
  File "/usr/lib/python2.7/HTMLParser.py", line 472, in unescape
    return re.sub(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));", replaceEntities, s)
  File "/usr/lib/python2.7/re.py", line 151, in sub
    return _compile(pattern, flags).sub(repl, string, count)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xbb in position 0: ordinal not in range(128)
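
The traceback above fails inside HTMLParser's unescape() while handling an attribute value, which is consistent with the parser being fed a byte string that contains non-ASCII bytes. A possible workaround, sketched here under that assumption, is to decode the page to text before calling feed(). The `TitleParser` class, the `extract_title` function, and the UTF-8 default are hypothetical; the real page encoding would have to come from the HTTP headers or the document itself.

```python
# -*- coding: utf-8 -*-
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2, as in the traceback


class TitleParser(HTMLParser):
    """Hypothetical minimal parser that collects the text inside <title>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.in_title = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(raw_bytes, encoding="utf-8"):
    """Decode the page first so the parser only ever sees text."""
    text = raw_bytes.decode(encoding, "replace")
    parser = TitleParser()
    parser.feed(text)
    parser.close()
    return u"".join(parser.parts).strip()


# Assumed sample page: a non-ASCII attribute value plus a non-ASCII title.
page = (u'<html><head><meta name="description" content="caf\u00e9 &amp; more">'
        u"<title>Abolish the Police \u00ab subMedia</title></head></html>")
title = extract_title(page.encode("utf-8"))
```

This does not confirm that the snarfer error and the garbled title output share a cause; it only shows the parser working on already-decoded input.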

jrabbit commented 11 years ago

Actual example:

48.326 <@jrabbit> http://www.submedia.tv/stimulator/2013/04/11/abolish-the-police/
48.328 < riotbot> ミ★: Abolish the Police « subMedia