scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.
https://scrapy.org
BSD 3-Clause "New" or "Revised" License

Improve exception logging while using telnet #3853

Open ddebernardy opened 5 years ago

ddebernardy commented 5 years ago

I had two scary-looking Unhandled Error messages in my logs (see below). After investigating, they seem to be related to things I did while using telnet to check on my crawler.

The first stack trace is likely due to a failed login attempt. The second is likely due to me logging out of telnet (I can't remember if I used exit() or ^C or something else).

I'll know to simply ignore this unhandled error in the future when I've been playing with telnet. Still, Scrapy could try to be a tiny bit more helpful for new users in this specific case.

(Or you could ignore this entirely, other new users will google 'unhandled error scrapy' and find this.)

Unhandled Error
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/usr/local/lib/python3.7/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/usr/local/lib/python3.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/usr/local/lib/python3.7/site-packages/twisted/internet/selectreactor.py", line 149, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "/usr/local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 243, in doRead
    return self._dataReceived(data)
  File "/usr/local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
    rval = self.protocol.dataReceived(data)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 636, in dataReceived
    self.applicationDataReceived(b''.join(appDataBuffer))
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 988, in applicationDataReceived
    self.protocol.dataReceived(data)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 1035, in dataReceived
    self.protocol.dataReceived(data)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/insults/insults.py", line 537, in dataReceived
    self.terminalProtocol.keystrokeReceived(ch, None)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 225, in keystrokeReceived
    m()
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 374, in handle_RETURN
    return RecvLine.handle_RETURN(self)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 292, in handle_RETURN
    self.lineReceived(line)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 267, in lineReceived
    more = self.interpreter.push(line)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 106, in push
    more = self.runsource(source, self.filename)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/code.py", line 74, in runsource
    self.runcode(code)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 117, in runcode
    code.InteractiveInterpreter.runcode(self, *a, **kw)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>

  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_sitebuiltins.py", line 26, in __call__
    raise SystemExit(code)
builtins.SystemExit: None

2019-07-04 05:06:45 [twisted] CRITICAL: Unhandled Error
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/usr/local/lib/python3.7/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/usr/local/lib/python3.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/usr/local/lib/python3.7/site-packages/twisted/internet/selectreactor.py", line 149, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "/usr/local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 243, in doRead
    return self._dataReceived(data)
  File "/usr/local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
    rval = self.protocol.dataReceived(data)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 636, in dataReceived
    self.applicationDataReceived(b''.join(appDataBuffer))
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 988, in applicationDataReceived
    self.protocol.dataReceived(data)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 1035, in dataReceived
    self.protocol.dataReceived(data)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/insults/insults.py", line 537, in dataReceived
    self.terminalProtocol.keystrokeReceived(ch, None)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 225, in keystrokeReceived
    m()
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 374, in handle_RETURN
    return RecvLine.handle_RETURN(self)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 292, in handle_RETURN
    self.lineReceived(line)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 267, in lineReceived
    more = self.interpreter.push(line)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 106, in push
    more = self.runsource(source, self.filename)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/code.py", line 74, in runsource
    self.runcode(code)
  File "/usr/local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 117, in runcode
    code.InteractiveInterpreter.runcode(self, *a, **kw)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>

  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_sitebuiltins.py", line 26, in __call__
    raise SystemExit(code)
builtins.SystemExit: None
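
The last frames of the traceback above show exit() raising SystemExit inside code.InteractiveInterpreter.runcode. The standard-library interpreter class reports ordinary exceptions itself but re-raises SystemExit, which would explain why this one reaches the reactor and gets logged as unhandled. A minimal sketch using only the standard library (no Twisted or Scrapy involved; `run_line` is a hypothetical helper, and `raise SystemExit` stands in for exit() so the example does not close stdin):

```python
import code

# A bare interpreter, similar in spirit to the one the telnet console wraps.
interpreter = code.InteractiveInterpreter()


def run_line(source: str) -> str:
    """Run one line of source and report whether an exception escaped."""
    try:
        interpreter.runsource(source)
    except SystemExit:
        # runcode() deliberately re-raises SystemExit to its caller.
        return "SystemExit escaped"
    return "handled inside the interpreter"


# An ordinary error is printed to stderr by the interpreter and swallowed.
print(run_line("1 / 0"))
# SystemExit propagates to whoever called the interpreter.
print(run_line("raise SystemExit"))
```

In the telnet case the caller is the reactor's read loop, so nothing catches the exception before Twisted logs it.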

Gallaecio commented 5 years ago

@ddebernardy Do you think you could share a simple set of steps to reproduce the issue?

ddebernardy commented 5 years ago

@Gallaecio I'm not 100% sure, but I suspect the original post has the relevant details. While a lengthy crawl is ongoing, connect via telnet, fail to log in (likely the first error), then trigger a syntax error (possibly the first error, if it wasn't login-related), and exit (try exit(), Ctrl+C, and Ctrl+D).

sbs2001 commented 5 years ago

@Gallaecio Failing to log in has nothing to do with the error. Calling exit() generates the error @ddebernardy mentioned. Ctrl+Z also generates an error, but a different one from the one @ddebernardy reported.

The errors:

exit()
Unhandled Error
Traceback (most recent call last):
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
    why = selectable.doRead()
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 243, in doRead
    return self._dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
    rval = self.protocol.dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 636, in dataReceived
    self.applicationDataReceived(b''.join(appDataBuffer))
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 988, in applicationDataReceived
    self.protocol.dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 1035, in dataReceived
    self.protocol.dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/insults/insults.py", line 537, in dataReceived
    self.terminalProtocol.keystrokeReceived(ch, None)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 225, in keystrokeReceived
    m()
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 374, in handle_RETURN
    return RecvLine.handle_RETURN(self)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 292, in handle_RETURN
    self.lineReceived(line)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 267, in lineReceived
    more = self.interpreter.push(line)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 106, in push
    more = self.runsource(source, self.filename)
  File "/usr/lib/python3.7/code.py", line 74, in runsource
    self.runcode(code)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 117, in runcode
    code.InteractiveInterpreter.runcode(self, *a, **kw)
  File "/usr/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>

  File "/usr/lib/python3.7/_sitebuiltins.py", line 26, in __call__
    raise SystemExit(code)
builtins.SystemExit: None

2019-08-05 19:49:20 [twisted] CRITICAL: Unhandled Error
Traceback (most recent call last):
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
    why = selectable.doRead()
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 243, in doRead
    return self._dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
    rval = self.protocol.dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 636, in dataReceived
    self.applicationDataReceived(b''.join(appDataBuffer))
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 988, in applicationDataReceived
    self.protocol.dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 1035, in dataReceived
    self.protocol.dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/insults/insults.py", line 537, in dataReceived
    self.terminalProtocol.keystrokeReceived(ch, None)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 225, in keystrokeReceived
    m()
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 374, in handle_RETURN
    return RecvLine.handle_RETURN(self)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/recvline.py", line 292, in handle_RETURN
    self.lineReceived(line)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 267, in lineReceived
    more = self.interpreter.push(line)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 106, in push
    more = self.runsource(source, self.filename)
  File "/usr/lib/python3.7/code.py", line 74, in runsource
    self.runcode(code)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/manhole.py", line 117, in runcode
    code.InteractiveInterpreter.runcode(self, *a, **kw)
  File "/usr/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<console>", line 1, in <module>

  File "/usr/lib/python3.7/_sitebuiltins.py", line 26, in __call__
    raise SystemExit(code)
builtins.SystemExit: None

Ctrl+Z

Unhandled Error
Traceback (most recent call last):
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
    why = selectable.doRead()
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 243, in doRead
    return self._dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
    rval = self.protocol.dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 585, in dataReceived
    raise ValueError("Stumped", b)
builtins.ValueError: ('Stumped', b'\xed')

2019-08-05 19:50:14 [twisted] CRITICAL: Unhandled Error
Traceback (most recent call last):
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
    why = selectable.doRead()
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 243, in doRead
    return self._dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
    rval = self.protocol.dataReceived(data)
  File "/home/shivam/.local/lib/python3.7/site-packages/twisted/conch/telnet.py", line 585, in dataReceived
    raise ValueError("Stumped", b)
builtins.ValueError: ('Stumped', b'\xed')
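
The byte in the ValueError above, b'\xed', is decimal 237, which RFC 1184 assigns to the telnet SUSP (suspend) command. A plausible reading, not confirmed in this thread, is that the telnet client translated Ctrl+Z into IAC SUSP and the server-side parser had no branch for that command. The sketch below only decodes command bytes against the RFC 854, RFC 885, and RFC 1184 code tables; `describe_telnet_command` is a hypothetical helper, not part of Twisted or Scrapy:

```python
# Telnet command codes from RFC 854, RFC 885 (EOR) and RFC 1184 (EOF/SUSP/ABORT).
TELNET_COMMANDS = {
    236: "EOF", 237: "SUSP", 238: "ABORT", 239: "EOR",
    240: "SE", 241: "NOP", 242: "DM", 243: "BRK", 244: "IP",
    245: "AO", 246: "AYT", 247: "EC", 248: "EL", 249: "GA",
    250: "SB", 251: "WILL", 252: "WONT", 253: "DO", 254: "DONT",
    255: "IAC",
}


def describe_telnet_command(byte: bytes) -> str:
    """Return the symbolic name of a single telnet command byte."""
    if len(byte) != 1:
        raise ValueError("expected exactly one byte")
    return TELNET_COMMANDS.get(byte[0], "unknown")


print(describe_telnet_command(b"\xed"))
```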

To reproduce: create any basic spider and log in through telnet. Enter exit() to get the first error, or press Ctrl+Z to get the second. Here's my spider's code:

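import scrapy


class LabSpider(scrapy.Spider):
    # Minimal spider; it only needs to keep a crawl running long enough
    # to connect to the telnet console.
    name = "evil"
    start_urls = [
        "https://www.google.com/search?q=final+fantasy" + str(i) for i in range(1000)
    ]

    def parse(self, response):
        yield {"bin": response.css("span div::text").getall()}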
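
Until the logging is improved, one possible workaround is a standard `logging.Filter` that relabels the exit() traceback as an informational message. This is only a sketch under assumptions: it assumes the record is emitted on the logger named "twisted" (as the `[twisted]` prefix in the logs above suggests) and that the formatted message contains both "Unhandled Error" and "builtins.SystemExit". `TelnetExitFilter` and `install_telnet_exit_filter` are hypothetical names, not Scrapy APIs.

```python
import logging


class TelnetExitFilter(logging.Filter):
    """Hypothetical filter: downgrade the traceback caused by exit() in telnet."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        # Assumption: the traceback text is part of the record's message.
        if "Unhandled Error" in message and "builtins.SystemExit" in message:
            record.levelno = logging.INFO
            record.levelname = "INFO"
            record.msg = "Telnet console session ended with exit() (SystemExit)."
            record.args = ()
        # Keep every record; only the matching one is rewritten.
        return True


def install_telnet_exit_filter() -> None:
    """Attach the filter to the logger assumed to carry Twisted's messages."""
    logging.getLogger("twisted").addFilter(TelnetExitFilter())
```

Calling `install_telnet_exit_filter()` once at startup (for example from a project's settings module) would be enough; all other Twisted errors pass through unchanged.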