getsentry / sentry-javascript

Official Sentry SDKs for JavaScript
https://sentry.io
MIT License
8.01k stars 1.58k forks

Document that there's a limit in the size of the data that can be sent #339

Closed cmdelatorre closed 6 years ago

cmdelatorre commented 9 years ago

As far as I understand, after reading #58 and other issues, raven-js uses HTTP GET requests to send data to Sentry. Apparently (and unluckily for me) this is a design decision related to the difficulties of handling CORS.

In any case, the fact is that there's a strict limit on the size of error reports, which depends on the maximum URL size supported by the Sentry web server. As I'm using the getsentry service, there's nothing I can do about it.

So, regardless of how much I'd love to be able to POST longer reports to Sentry, I'd strongly recommend that you document this limitation properly. I consider it very important, and it should be somewhere in the official API documentation.

Finally, thanks a lot for this great product.

mattrobenolt commented 8 years ago

This has changed drastically in 2.0.0 since we now send data via POST.

But there is still a limitation, depending on your server. If you're running Sentry on-premise, we can't determine your limit, but if you're talking to our servers (app.getsentry.com), our limit as of now is 100KB for the body.

If the message is too large, it'll be rejected.

@benvinegar We should add this to docs now.

jakubzitny commented 8 years ago

Hey guys, have you considered adding a check for the Sentry message size limit into Raven?

What do you think would be the best way to check / truncate the extra data that's over the limit?

(cc @benvinegar @mattrobenolt)

mattrobenolt commented 8 years ago

sentry.io has a limit that is effectively arbitrary on our side. This limit doesn't apply, and may differ, for anyone running Sentry on-premise. Our limit as of today is 100kB, but it's possible that we may increase it in the future.

Now, there's no way to communicate this limit back to the client until a request is actually rejected. In theory, we could change the error that's kicked back to be JSON, something like {"error": 413, "limit": 100}, and the client could then trim and retry with that information. But this would be somewhat specific to our implementation, since the rejection happens at our load balancer and not in the Sentry server itself.
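A client-side handler for such a hypothetical response might look like the sketch below. Nothing here is real raven-js or Sentry behavior: the `{"error": 413, "limit": 100}` shape, the `trimToLimit` name, and the choice of which fields to drop are all illustrative.

```javascript
// Hypothetical: given a server-communicated limit in kB, check whether the
// serialized payload fits; if not, drop the unbounded fields (`extra`,
// `breadcrumbs`) and keep the essentials (message, stack trace, tags).
function trimToLimit(payload, limitKb) {
  var limitBytes = limitKb * 1024;
  var serialized = JSON.stringify(payload);
  if (serialized.length <= limitBytes) {
    return payload; // already fits, send as-is
  }
  // JSON.stringify omits keys whose value is undefined, so this
  // effectively removes the two unbounded fields from the wire payload.
  return Object.assign({}, payload, {
    extra: undefined,
    breadcrumbs: undefined
  });
}
```

The tradeoff is the one described above: this only works if the server tells the client its limit, which couples the client to one deployment's load-balancer behavior.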

It's also worth noting that we don't hit this in other clients nearly as often, since they all gzip their payloads before transmitting, so it's very rare to cross our 100k limit. But for JavaScript, we can't do this easily without bloating raven-js with a gzip library. (Ironic that we can't easily gzip from a browser, huh?)

jakubzitny commented 8 years ago

Thanks for the reply @mattrobenolt.

I understand the situation, but what if this was configurable? When configuring or instantiating the Raven client, I'd add a parameter specifying the limit for one event, and the client would automatically trim the contents somehow. Would that make sense?

(cc @vojtatranta)

benvinegar commented 8 years ago

There's also no really easy way to say "limit this to 100kb". Users can pass extra values with arbitrarily deep objects and sub-objects, so we'd need an algorithm that traverses the entire tree and decides where to trim large items. And there's a ton of tricky cases to handle, e.g. how does it know that an array of 100 small items is fewer bytes than an array of 10 large items? This algorithm could become really complex, really fast.

We've experimented with payload size and found that, for a typical maxed-out stack trace and 100 breadcrumbs, the payload is roughly 15 KB. That's not a lot of bytes for a ton of information. Users only really hit the 100kb limit if they manually add arbitrarily large arrays/objects to extra.

I think if anything, we should add a note to the extra config docs warning that there is an upper size limit, and that users shouldn't indiscriminately add complex objects of unknown size.
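To make the complexity argument above concrete, here is a crude sketch of the kind of traversal such an algorithm would need. Everything here is illustrative, not raven-js code; the depth, string, and array limits are arbitrary, and note that even this does not guarantee a byte budget — which is exactly the point being made.

```javascript
// Walk a value, truncating long strings, capping array length, and
// cutting off recursion at a fixed depth. Limits are illustrative.
function truncate(value, depth) {
  if (depth === undefined) depth = 3;
  if (typeof value === 'string') {
    // Cap individual strings at 1024 characters.
    return value.length > 1024 ? value.slice(0, 1024) + '…' : value;
  }
  if (Array.isArray(value)) {
    if (depth <= 0) return '[Array]';
    // Keep at most 50 elements, recursing one level shallower.
    return value.slice(0, 50).map(function (v) { return truncate(v, depth - 1); });
  }
  if (value !== null && typeof value === 'object') {
    if (depth <= 0) return '[Object]';
    var out = {};
    Object.keys(value).forEach(function (k) {
      out[k] = truncate(value[k], depth - 1);
    });
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through
}
```

Even with all three caps, 50 medium-sized strings at one level can still outweigh one deep object elsewhere, so a hard size guarantee would need a second serialize-measure-trim pass on top of this.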

jakubzitny commented 8 years ago

Yeah that's true.

Actually, large objects in extra are exactly our case. We're attaching some logs (a little similar to breadcrumbs) that can easily be "misused" by a developer and filled with unnecessarily big objects. That's why we're thinking about limits there and about the best way to trim the data.

davishmcclurg commented 7 years ago

I just ran into this with an error that had a bunch of breadcrumbs. It would be nice if the client would re-send the error without breadcrumbs or extra data so that it doesn't disappear completely.

davidfurlong commented 7 years ago

With redux sentry middleware our entire redux state gets sent in extra. Now we're getting this. Any idea of a current estimate on max size? Looks like one request was around 240KB

kamilogorek commented 7 years ago

Any idea of a current estimate on max size?

It's currently 100kB. You can use dataCallback to strip anything that's unnecessarily big and that you know you can skip.
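A sketch of that approach: write a dataCallback that drops any extra value whose serialized form exceeds some budget, and pass it to Raven.config. The function name, the 10 kB threshold, and the placeholder string are illustrative; dataCallback itself is a real raven-js config option.

```javascript
// Drop any `extra` entry whose serialized size exceeds a budget,
// replacing it with a marker so the event still shows something was there.
function stripLargeExtras(data) {
  var LIMIT = 10 * 1024; // illustrative per-field budget in bytes
  if (data.extra) {
    Object.keys(data.extra).forEach(function (key) {
      if (JSON.stringify(data.extra[key]).length > LIMIT) {
        data.extra[key] = '[stripped: too large]';
      }
    });
  }
  return data; // dataCallback must return the (possibly mutated) event
}

// Wire it up when configuring Raven (DSN is a placeholder):
// Raven.config('https://<key>@sentry.io/<project>', {
//   dataCallback: stripLargeExtras
// }).install();
```

Because dataCallback runs on every event, this bounds the worst case without per-call bookkeeping, at the cost of serializing each extra value once.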

davidfurlong commented 7 years ago

It's currently 100kB. You can use dataCallback to strip anything that's unnecessarily big and that you know you can skip.

Yeah thanks, I found the relevant docs. Isn't dataCallback a config option, though, and not something for individual raven sends? Is the best practice to set the config and then unset it for an individual request? Additionally, I feel that the best policy is to report the error to Sentry without the heavy extra payload (or breadcrumbs or stack) rather than simply failing on the client, since many of the errors caught by an error-reporting library should still reach the monitoring service IMO.

kamilogorek commented 7 years ago

It is indeed a global callback, which will be called for every event.

Additionally I feel that the best policy is to report the error to sentry without the heavy extra payload (or breadcrumbs or stack) rather than simply failing on the client...

It's not that easy to do. We can't simply calculate the size of the request and strip something if it's too large. We have to be performant and small in size. Adding something like this would require serializing the whole payload, calculating its size, and recursively trying to get it down to a "sendable" size, which can be too heavy a process if someone sends tens of errors in a very short period of time.

davidfurlong commented 7 years ago

Yeah, I see that that's an issue; I looked at doing this myself. But I'm not suggesting trying to prune dynamic parts of the data. I'm suggesting only sending the parts that are truly essential and which collectively cannot be too big. My suggestion would be: on a '413 Request Entity Too Large', Sentry retries the same request, but this time without sending extra, breadcrumbs, and stack (and any other data which could realistically be unbounded in size).
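The proposed fallback could be sketched as a wrapper around whatever transport sends the event. This is hypothetical — raven-js does not do this — and `sendWithFallback`, the promise-based `send` signature, and the `err.status` shape are all assumptions for illustration.

```javascript
// Hypothetical fallback: if the first attempt is rejected with a 413,
// retry once with the unbounded fields removed so the error itself
// still reaches Sentry instead of disappearing completely.
function sendWithFallback(send, payload) {
  return send(payload).catch(function (err) {
    if (err && err.status === 413) {
      var minimal = Object.assign({}, payload);
      delete minimal.extra;
      delete minimal.breadcrumbs;
      return send(minimal); // second failure propagates normally
    }
    throw err;
  });
}
```

The appeal of this design is that the retry payload has a bounded size by construction, so no serialize-and-measure pass is needed on the happy path.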


captbaritone commented 7 years ago

Clearly the best solution to the problem of excessively large payloads is to trim them down, and we have nice APIs to do just this (Raven.setDataCallback). However, it can still be difficult to know exactly how to transform your data. This difficulty is compounded by the fact that you must err on the side of extreme caution, since if you do go over, it fails silently. As users of Raven+Sentry this is the last thing we want! 😄

So, at the very least it would be nice to know when you've sent an exception that was rejected by Sentry. I understand the infeasibility of doing this on the server, but as @davidfurlong points out, it could easily be done on the client.

I took a look at implementing this outside the library, but I don't think it's currently possible without depending upon some private implementation details of Raven.

Something like this does seem to work, but it relies on being able to access the default transport implementation via Raven._makeRequest (not a good solution):

Hack:

// Temporarily swap in a data callback that clears breadcrumbs for a
// single message, then restore the original callback.
function captureMessageWithoutContext(message) {
  Raven.setDataCallback((data, originalCallback) => {
    Raven.setDataCallback(originalCallback);
    data.breadcrumbs.values = [];
    return data;
  });
  Raven.captureMessage(message);
}

// Wrap the default transport so failed requests can be observed.
const originalTransport = Raven._makeRequest;
Raven.setTransport(opts => {
  const originalOnError = opts.onError;
  opts.onError = error => {
    originalOnError(error);
    // On a 413, re-report without breadcrumbs so the event doesn't
    // disappear completely.
    if (error.request.status === 413) {
      captureMessageWithoutContext(
        "Failed to submit exception to Sentry. 413 request too large. " + error
      );
    }
  };
  originalTransport(opts);
});

I think if we included the actual XHR error, or even just the status code, in the ravenFailure event, it would be possible to implement this kind of recovery outside of Raven itself.

413 detection may very well be something that should live inside of Raven, but in the mean time, this would be a nice escape hatch and would let the community explore ways to handle 413 errors.

This topic is especially interesting to me, since raven-for-redux forces users to manually trim down their data in order to avoid hitting the size limit, and I currently feel like I may be doing more harm than good, since I am leading them down a path which may very well be silencing real errors in their apps.

davidfurlong commented 7 years ago

413 detection may very well be something that should live inside of Raven, but in the mean time, this would be a nice escape hatch and would let the community explore ways to handle 413 errors.

This topic is especially interesting to me, since raven-for-redux forces users to manually trim down their data in order to avoid hitting the size limit, and I currently feel like I may be doing more harm than good, since I am leading them down a path which may very well be silencing real errors in their apps.

I agree that 413 should live inside of Raven, but even just making it easier to implement your own 413 handling would be a big improvement.

Currently I'm struggling to balance the tradeoffs:

The issue is that our redux state can vary tremendously in size, and 413s can still occur despite the trims. Redux state is nice to have when debugging, but if it leads to a nontrivial probability that the error won't be reported due to a 413, then it's not worth it IMO. So, no redux state in Sentry :(. Sending just a bounded-size sub-object of the redux state is tricky and much less useful than having a more complete version. Additionally, trimming dynamically makes it harder to debug, because you don't know whether missing data is the source of the bug or just a result of trimming. I almost want to encode and compress the whole state.
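One simple way to make the "attach it or don't" decision deterministic rather than trimming dynamically: serialize the state once and attach it only if it fits a fixed budget, otherwise attach a marker saying how big it was. The function name and the 50 kB threshold are illustrative, not part of any library.

```javascript
// Attach the full redux state only when it serializes under a budget;
// otherwise record its size so the dropped data is at least visible.
function stateForSentry(state) {
  var LIMIT = 50 * 1024; // illustrative budget, below the 100kB event cap
  var serialized;
  try {
    serialized = JSON.stringify(state);
  } catch (e) {
    // e.g. circular references in the state tree
    return '[unserializable state]';
  }
  return serialized.length <= LIMIT
    ? state
    : '[state omitted: ' + serialized.length + ' bytes]';
}
```

This addresses the debugging concern above: the state is either complete or explicitly absent, never silently partial.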

kamilogorek commented 6 years ago

We can most likely utilize Node's serializer here as well. Just a note to myself.

Akash91 commented 6 years ago

Why is this not in the documentation?

wjdp commented 6 years ago

Just came across this after several users sent in support requests about a frontend application. I'm rather surprised raven doesn't handle this, given Sentry's tagline "Stop hoping your users will report errors".

Also a hit to our adoption of sentry!

(We're getting big-ish breadcrumbs, ~600k, when catching Vue exceptions.)

beaugunderson commented 6 years ago

With recent Chrome the limit is even smaller (65536 bytes) due to a bug: https://github.com/getsentry/sentry-javascript/issues/1464

Edit: not a bug! It's part of the fetch() spec, so { keepalive: true } should probably not be the default, or the docs should be updated to reflect that limit.