manuellazzari-cargoone opened this issue 7 months ago
Assigning to @getsentry/support for routing ⏲️
Routing to @getsentry/product-owners-issues for triage ⏲️
Likely to be caused by https://github.com/getsentry/sentry-python/issues/2386
@vartec how is that connected? My app does not crash, only the Sentry SDK is complaining...
I am getting the same issue as above, but I haven't yet tried the custom HttpTransport.
Basic specs are:
sentry-sdk[flask,django,celery]==1.29.0
python3.9
Hey @manuellazzari-cargoone and @kieran-sf! Thanks for reporting this.
My first suggestion would've been exactly what you tried, @manuellazzari-cargoone: the custom socket options from https://github.com/getsentry/sentry-python/issues/1198#issuecomment-1802638463. Are you seeing at least some improvement?
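For reference, the workaround in that comment boils down to enabling TCP keep-alive on the transport's connection pool. A rough sketch, based on the sentry-sdk 1.x transport internals (the keep-alive values are only examples, and TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT are Linux-specific):

```python
import socket

import sentry_sdk
from sentry_sdk.transport import HttpTransport

# Example keep-alive settings; tune these for your environment.
KEEP_ALIVE_SOCKET_OPTIONS = [
    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
    (socket.SOL_TCP, socket.TCP_KEEPIDLE, 45),
    (socket.SOL_TCP, socket.TCP_KEEPINTVL, 10),
    (socket.SOL_TCP, socket.TCP_KEEPCNT, 6),
]

class KeepAliveHttpTransport(HttpTransport):
    def _get_pool_options(self, ca_certs):
        # _get_pool_options builds the kwargs for the underlying urllib3
        # pool; adding socket_options turns on keep-alive probes for the
        # connections the SDK reuses between event submissions.
        options = super()._get_pool_options(ca_certs)
        options["socket_options"] = KEEP_ALIVE_SOCKET_OPTIONS
        return options

sentry_sdk.init(dsn="...", transport=KeepAliveHttpTransport)
```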
I'm curious whether spacing out the events sent from the SDK makes a difference. Can you try overriding _send_request in your custom transport, roughly like this?
import time

from sentry_sdk.transport import HttpTransport

class KeepAliveHttpTransport(HttpTransport):
    def _send_request(self, *args, **kwargs):
        # Brief pause to space out consecutive requests to the Sentry server.
        time.sleep(0.01)
        return super()._send_request(*args, **kwargs)
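(The idea behind the sleep is to space out bursts of events, so the pooled connection is less likely to be reused right as the server drops it. As in the socket-options sketch above, the custom transport would be wired in via sentry_sdk.init(dsn="...", transport=KeepAliveHttpTransport).)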
@sentrivana thanks for getting back -- when trying the custom socket settings I wasn't able to see any major differences. Now I'm testing the whole package (custom socket settings and custom send request).
@sentrivana the issue is still there even after adding the custom _send_request. From a qualitative perspective, it seems to appear with more or less the same frequency (see below an extract from our logs before and after the changes were deployed).
2024-02-16T15:29:33.048565197Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:30:55.565954424Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:33:13.537149008Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:33:17.914818994Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:34:47.778557487Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:34:47.819326396Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:37:33.774480759Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:37:34.586534890Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:41:53.741483570Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:45:11.560933696Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:51:10.940568800Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:51:11.016053254Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:51:20.146324668Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:51:28.814144310Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:58:53.136957655Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:58:54.053616565Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:59:16.002942751Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:03:12.954825518Z Initializing Sentry SDK... <-- changes deployed
...
2024-02-16T16:13:44.713686531Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:22:30.285920011Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:25:48.997920085Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:25:49.389493362Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:26:00.889221302Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:38:44.314760846Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
@manuellazzari-cargoone Thanks for the follow-up. It looks like the sleep might've had a tiny effect -- at least from the logs there seem to be fewer occurrences over a comparable time span before and after -- but this could obviously have to do with traffic etc., so I don't think it's conclusive.
Are you seeing any network errors for outgoing requests anywhere else in your system? Just trying to rule out general network instability. Alternatively, I'm wondering whether there's anything special about the errors/transactions -- maybe they're unusually big, so it takes a long time to send each one and the server drops the connection? Fiddling with the socket options might make a difference, too.
> Are you seeing any network errors for outgoing requests anywhere else in your system? Just trying to rule out general network instability.
I'm observing this issue with just 2 services, both running with gunicorn. Also, it's all GCP, so I would rule out consistent network instability.
> Alternatively, I'm wondering whether there's anything special about the errors/transactions -- maybe they're unusually big, so it takes a long time to send each one and the server drops the connection?
I'm not sure about size, but I know for sure we have services dealing with a lot more traffic and much bigger payloads. In particular, one of the affected services handles very little traffic and small payloads and is still experiencing the problem.
Thanks! I think in general we need to reconsider our transport logic and possibly add retries for more cases (currently we only retry if we're rate limited). But at the same time, with this particular error there's no telling whether the server actually processed the event and just crapped out at the very end, so it's not clear-cut.
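In the meantime, anyone who wants to experiment with client-side retries could subclass the transport along these lines. This is a hypothetical sketch, not current SDK behavior, and the caveat above applies: retrying after a late connection drop can resend an event the server already processed.

```python
import time

from sentry_sdk.transport import HttpTransport

class RetryingHttpTransport(HttpTransport):
    # Hypothetical: retry network-level failures a couple of times with
    # exponential backoff before re-raising the original exception.
    MAX_RETRIES = 2

    def _send_request(self, *args, **kwargs):
        for attempt in range(self.MAX_RETRIES + 1):
            try:
                return super()._send_request(*args, **kwargs)
            except Exception:
                if attempt == self.MAX_RETRIES:
                    raise
                time.sleep(0.5 * (2 ** attempt))  # 0.5s, then 1s, ...
```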
Environment
SaaS (https://sentry.io/)
Steps to Reproduce
I am randomly getting the following error from some of my Python services. The SDK seems to be sending events to sentry.io correctly; I have no clue whether it is skipping some of them. The error seems to occur more under load.
The service is mainly a Flask Python 3.11 app, run by gunicorn on a Kubernetes cluster in multiple instances on Google Cloud. All dependencies are basically up to date; I'm using sentry-sdk-1.40.2 and urllib3-2.0.7. In an attempt to fix it, I tried to customize the HttpTransport used by the Sentry SDK, with no luck.
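For illustration, wiring a custom transport into a Flask service like this could look roughly as follows (hypothetical: the myapp.transport module path is assumed, with KeepAliveHttpTransport as sketched earlier in the thread):

```python
import sentry_sdk
from sentry_sdk.integrations.flask import FlaskIntegration

from myapp.transport import KeepAliveHttpTransport  # hypothetical module

sentry_sdk.init(
    dsn="https://...",  # placeholder DSN
    integrations=[FlaskIntegration()],
    transport=KeepAliveHttpTransport,
)
```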
Expected Result
No connection errors
Actual Result
Product Area
Issues
Link
No response
DSN
https://1e425c4937e14585ab35335aa4810004@o155318.ingest.sentry.io/4503976860385280
Version
No response