iclab / centinel

http://iclab.org/
MIT License

BeautifulSoup-related exception in HTTP request #255

Closed by rpanah 7 years ago

rpanah commented 8 years ago
Exception in thread Thread-1010:
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/abbas/iclab/centinel/centinel/primitives/http.py", line 106, in get_request
    meta_redirect_url = meta_redirect(first_response["response"]["body"])
  File "/Users/abbas/iclab/centinel/centinel/primitives/http.py", line 20, in meta_redirect
    soup = BeautifulSoup.BeautifulSoup(content)
  File "/Library/Python/2.7/site-packages/BeautifulSoup.py", line 1522, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "/Library/Python/2.7/site-packages/BeautifulSoup.py", line 1147, in __init__
    self._feed(isHTML=isHTML)
  File "/Library/Python/2.7/site-packages/BeautifulSoup.py", line 1189, in _feed
    SGMLParser.feed(self, markup)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 138, in goahead
    k = self.parse_starttag(i)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 296, in parse_starttag
    self.finish_starttag(tag, attrs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sgmllib.py", line 338, in finish_starttag
    self.unknown_starttag(tag, attrs)
  File "/Library/Python/2.7/site-packages/BeautifulSoup.py", line 1347, in unknown_starttag
    tag = Tag(self, name, attrs, self.currentTag, self.previous)
  File "/Library/Python/2.7/site-packages/BeautifulSoup.py", line 562, in __init__
    self.attrs = map(convert, self.attrs)
  File "/Library/Python/2.7/site-packages/BeautifulSoup.py", line 561, in <lambda>
    val))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 155, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/Library/Python/2.7/site-packages/BeautifulSoup.py", line 528, in _convertEntities
    return unichr(int(x[1:]))
ValueError: unichr() arg not in range(0x10000) (narrow Python build)
rpanah commented 7 years ago

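This looks like an encoding-related issue: judging by the traceback, the response body contains a numeric character reference above U+FFFF, which unichr() cannot represent on a narrow Python 2 build, so BeautifulSoup's entity conversion raises ValueError.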
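
For anyone hitting the same error, here is a minimal sketch of how a numeric character reference could be converted without raising on a narrow build. It is not the fix applied in centinel or in BeautifulSoup; the names safe_unichr and convert_numeric_refs are hypothetical, and the choice to substitute U+FFFD for out-of-range values is an assumption. The idea is to fall back to a UTF-16 surrogate pair when unichr() rejects a code point above U+FFFF.

```python
import re

try:
    _unichr = unichr  # Python 2
except NameError:
    _unichr = chr  # Python 3, where chr() already covers all of Unicode


def safe_unichr(codepoint):
    """Return the character for codepoint without raising on narrow builds.

    Hypothetical helper, not part of BeautifulSoup or centinel.
    """
    # Values outside the Unicode range cannot be represented at all;
    # substituting U+FFFD (replacement character) is an assumption here.
    if codepoint < 0 or codepoint > 0x10FFFF:
        return u"\ufffd"
    try:
        return _unichr(codepoint)
    except ValueError:
        # Narrow Python 2 build: unichr() only accepts values below 0x10000,
        # so build the UTF-16 surrogate pair by hand.
        offset = codepoint - 0x10000
        high = 0xD800 + (offset >> 10)
        low = 0xDC00 + (offset & 0x3FF)
        return _unichr(high) + _unichr(low)


# Matches decimal references such as "&#128512;".
_NUMERIC_REF = re.compile(r"&#(\d+);")


def convert_numeric_refs(text):
    """Replace decimal numeric character references in text."""
    return _NUMERIC_REF.sub(lambda m: safe_unichr(int(m.group(1))), text)
```

A simpler alternative would be to wrap the meta_redirect() call in http.py in a try/except ValueError so that a single malformed page does not kill the worker thread, at the cost of skipping the meta-redirect check for that page.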