getsentry / raven-python

Raven is the legacy Python client for Sentry (getsentry.com) — replaced by sentry-python
https://sentry.io
BSD 3-Clause "New" or "Revised" License

contrib/tornado probably shouldn't try to send files in body to sentry #372

Open IfpnI opened 10 years ago

IfpnI commented 10 years ago
Traceback (most recent call last):
  File "/foo/virtualenv/local/lib/python2.7/site-packages/tornado-3.1.1-py2.7.egg/tornado/web.py", line 1141, in _when_complete
    callback()
  File "/foo/virtualenv/local/lib/python2.7/site-packages/tornado-3.1.1-py2.7.egg/tornado/web.py", line 1163, in _execute_method
    self._execute_finish)
  File "/foo/virtualenv/local/lib/python2.7/site-packages/tornado-3.1.1-py2.7.egg/tornado/web.py", line 1157, in _when_complete
    self._handle_request_exception(e)
  File "/home/fpn/cray/inferno/inferno/web.py", line 167, in _handle_request_exception
    super(RequestHandler, self)._handle_request_exception(e)
  File "/foo/virtualenv/local/lib/python2.7/site-packages/tornado-3.1.1-py2.7.egg/tornado/web.py", line 1195, in _handle_request_exception
    self.log_exception(*sys.exc_info())
  File "/foo/virtualenv/local/lib/python2.7/site-packages/raven-3.5.0-py2.7.egg/raven/contrib/tornado/__init__.py", line 252, in log_exception
    self.captureException(exc_info=(typ, value, tb))
  File "/foo/virtualenv/local/lib/python2.7/site-packages/raven-3.5.0-py2.7.egg/raven/contrib/tornado/__init__.py", line 242, in captureException
    return self._capture('captureException', exc_info=exc_info, **kwargs)
  File "/foo/virtualenv/local/lib/python2.7/site-packages/raven-3.5.0-py2.7.egg/raven/contrib/tornado/__init__.py", line 239, in _capture
    return getattr(client, call_name)(data=data, **kwargs)
  File "/foo/virtualenv/local/lib/python2.7/site-packages/raven-3.5.0-py2.7.egg/raven/base.py", line 594, in captureException
    'raven.events.Exception', exc_info=exc_info, **kwargs)
  File "/foo/virtualenv/local/lib/python2.7/site-packages/raven-3.5.0-py2.7.egg/raven/contrib/tornado/__init__.py", line 36, in capture
    self.send(callback=kwargs.get('callback', None), **data)
  File "/foo/virtualenv/local/lib/python2.7/site-packages/raven-3.5.0-py2.7.egg/raven/contrib/tornado/__init__.py", line 44, in send
    message = self.encode(data)
  File "/foo/virtualenv/local/lib/python2.7/site-packages/raven-3.5.0-py2.7.egg/raven/base.py", line 563, in encode
    return base64.b64encode(zlib.compress(json.dumps(data).encode('utf8')))
  File "/foo/virtualenv/local/lib/python2.7/site-packages/raven-3.5.0-py2.7.egg/raven/utils/json.py", line 43, in dumps
    return json.dumps(value, cls=BetterJSONEncoder, **kwargs)
  File "/usr/lib/python2.7/dist-packages/simplejson/__init__.py", line 334, in dumps
    **kw).encode(obj)
  File "/usr/lib/python2.7/dist-packages/simplejson/encoder.py", line 237, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/dist-packages/simplejson/encoder.py", line 311, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x90 in position 197: invalid start byte
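
For anyone trying to reproduce this outside Tornado: the failure seems to come down to the JSON encoder treating the raw request body as UTF-8 text. Below is a minimal sketch of that step, not the raven code path itself. The payload and the `try_encode` helper are made up for illustration, and the explicit `decode` stands in for the implicit decoding that the Python 2 encoder performs on byte strings.

```python
import json

# Made-up payload resembling a multipart upload with binary content;
# 0x90 cannot start a UTF-8 sequence.
raw_body = b"--boundary\r\n\x90PNG-like binary payload"


def try_encode(body):
    """Hypothetical helper: return JSON text for body, or a failure note."""
    try:
        # Decode explicitly so the same step is visible on Python 3 too.
        return json.dumps({"data": body.decode("utf8")})
    except UnicodeDecodeError as exc:
        return "failed: %s" % exc.reason


result = try_encode(raw_body)             # binary body cannot be decoded
safe = try_encode(b"plain=text&body=ok")  # ordinary form body encodes fine
```

So any handler that receives a file upload and then raises can end up losing the Sentry event entirely, because the error happens while the event is being serialized.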

My quick fix is:

def get_sentry_data_from_request(self):
    """
    Extracts the data required for 'sentry.interfaces.Http' from the
    current request being handled by the request handler

    :return: A dictionary.
    """
    # Uploaded file bodies are arbitrary bytes and may not be valid UTF-8,
    # so drop them (and any very large body) before the event is JSON-encoded.
    if len(self.request.files) > 0 or len(self.request.body) > 200000:
        # Keep each file's metadata (e.g. filename, content_type), not its body.
        files = {
            k: [{dk: dv for dk, dv in d.items() if dk != 'body'} for d in v]
            for k, v in self.request.files.items()
        }
        data = {'arguments': self.request.arguments, 'files': files}
    else:
        data = self.request.body
    return {
        'sentry.interfaces.Http': {
            'url': self.request.full_url(),
            'method': self.request.method,
            'data': data,
            'query_string': self.request.query,
            'cookies': self.request.headers.get('Cookie', None),
            'headers': dict(self.request.headers),
        }
    }

jshirley commented 9 years ago

This would be good to get merged. We sometimes have massive request bodies (usually the ones that cause exceptions) and cannot send them out.

Are there any objections to this? If not, I can submit a PR for it.

xordoquy commented 9 years ago

@jshirley If possible, a test would be great before we merge this.
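
A test could look roughly like the sketch below. It assumes the file-stripping part of the quick fix is pulled out into a standalone helper; `strip_file_bodies` is a hypothetical name and not something that exists in raven. The sample data mimics the shape of Tornado's `request.files` (a field name mapped to a list of dicts with `filename`, `content_type` and `body`).

```python
import unittest


def strip_file_bodies(files):
    """Hypothetical helper: copy Tornado-style file metadata without 'body'."""
    return {
        field: [
            {key: value for key, value in upload.items() if key != "body"}
            for upload in uploads
        ]
        for field, uploads in files.items()
    }


class StripFileBodiesTest(unittest.TestCase):
    def test_body_removed_metadata_kept(self):
        files = {
            "avatar": [
                {"filename": "a.png", "content_type": "image/png",
                 "body": b"\x90\x00"}
            ]
        }
        stripped = strip_file_bodies(files)
        self.assertEqual(
            stripped,
            {"avatar": [{"filename": "a.png", "content_type": "image/png"}]},
        )
        # The request's own mapping must be left untouched.
        self.assertIn("body", files["avatar"][0])


# Run the case programmatically so the sketch works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StripFileBodiesTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```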

jshirley commented 9 years ago

After looking at this and comparing it to our own scenarios, I think it makes a lot more sense to just let people compose their own filters.

The approach I'll submit via PR shortly just sets 'data': self.get_sentry_request_body(), and people can override get_sentry_request_body to filter for their particular applications.

If we really don't want to post files, it will also be easier to do that as a separate method.
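
To make the idea concrete, here is a rough sketch of the shape I have in mind, not the final PR. `SketchSentryMixin`, `FakeRequest`, `UploadHandler`, the `MAX_BODY` limit and the placeholder string are all invented for illustration; only `get_sentry_request_body` and the dictionary layout come from the discussion above.

```python
class FakeRequest(object):
    """Hypothetical stand-in for Tornado's request object."""

    def __init__(self, body, files=None):
        self.body = body
        self.files = files or {}
        self.method = "POST"
        self.query = ""
        self.headers = {}

    def full_url(self):
        return "http://example.invalid/upload"


class SketchSentryMixin(object):
    """Illustrative mixin; only the parts relevant to the hook are shown."""

    def get_sentry_request_body(self):
        # Default behaviour: report the raw body unchanged.
        return self.request.body

    def get_sentry_data_from_request(self):
        return {
            "sentry.interfaces.Http": {
                "url": self.request.full_url(),
                "method": self.request.method,
                "data": self.get_sentry_request_body(),
                "query_string": self.request.query,
                "cookies": self.request.headers.get("Cookie", None),
                "headers": dict(self.request.headers),
            }
        }


class UploadHandler(SketchSentryMixin):
    """Application handler that filters what gets reported."""

    MAX_BODY = 200000  # assumed limit, taken from the quick fix above

    def __init__(self, request):
        self.request = request

    def get_sentry_request_body(self):
        # Skip uploads and oversized bodies; report everything else as-is.
        if self.request.files or len(self.request.body) > self.MAX_BODY:
            return "<body omitted>"
        return self.request.body


small = UploadHandler(FakeRequest(b"a=1")).get_sentry_data_from_request()
upload = UploadHandler(
    FakeRequest(b"\x90\x00", files={"f": [{"filename": "x.bin"}]})
).get_sentry_data_from_request()
```

With this layout the default stays backwards compatible, and each application decides for itself what is safe or useful to report.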

Sound good?