getsentry / sentry

Developer-first error tracking and performance monitoring
https://sentry.io
Other
38.63k stars 4.14k forks source link

Sentry is dropping transactions and profiles #60739

Closed williamdes closed 9 months ago

williamdes commented 10 months ago

Self-Hosted Version

23.11.1

CPU Architecture

x86_64

Docker Version

20.10.24

Docker Compose Version

2.19.0

Steps to Reproduce

Since recent versions Sentry is dropping my transaction and profiles. But not issues.

I upgraded from 23.8.0 to 23.10.1 on the day he graph above shows dropping. With Wireshark I confirmed that envelopes are 200 OK by the relay.

The quota is set to 0.

Expected Result

Do not drop the sent data

Actual Result

image

Event ID

No response

getsantry[bot] commented 9 months ago

Assigning to @getsentry/support for routing ⏲️

getsantry[bot] commented 9 months ago

Routing to @getsentry/product-owners-profiling for triage ⏲️

getsantry[bot] commented 9 months ago

Routing to @getsentry/product-owners-performance for triage ⏲️

phacops commented 9 months ago

Hi @williamdes,

Can you tell me from which platforms and versions of the SDK the events are coming from?

Also, by any chance, can you check the value of the timestamp and start_timestamp field on the transaction? We started rejecting transactions with negative timestamp values in 23.10.0, wondering if it's somehow related.

Could you also turn on debug logs in Relay? Some more information about why they're being dropped should be logged and would help us determine the issue. Otherwise, if you give me an envelope sent by the SDK, I can probably try to reproduce on my end.

Your profiles are dropped because transactions are dropped, so, once we find out why transactions are dropped, profiles should be fine.

williamdes commented 9 months ago

Hi @williamdes,

Hello 👋🏻

Can you tell me from which platforms and versions of the SDK the events are coming from?

sentry/sdk  3.5.0
sentry/sentry   3.22.1
sentry/sentry-laravel   3.8.2

Also, by any chance, can you check the value of the timestamp and start_timestamp field on the transaction? We started rejecting transactions with negative timestamp values in 23.10.0, wondering if it's somehow related.

Hmm, no. But could a server out of time sync be the culprit ?

Could you also turn on debug logs in Relay? Some more information about why they're being dropped should be logged and would help us determine the issue. Otherwise, if you give me an envelope sent by the SDK, I can probably try to reproduce on my end.

I will try to see if I can get this info

Your profiles are dropped because transactions are dropped, so, once we find out why transactions are dropped, profiles should be fine.

Got it

williamdes commented 9 months ago
2023-11-29T10:12:43.317402Z DEBUG request{method=POST uri=/api/4/envelope/ version=HTTP/1.1}: tower_http::trace::on_request: started processing request
2023-11-29T10:12:43.317749Z DEBUG request{method=POST uri=/api/4/envelope/ version=HTTP/1.1}: tower_http::trace::on_response: finished processing request latency=0 ms status=200
2023-11-29T10:12:43.318272Z DEBUG relay_server::utils::managed_envelope: dropped envelope: invalid data (invalid_transaction)
2023-11-29T10:12:43.318595Z DEBUG relay_server::utils::managed_envelope: dropped envelope: invalid data (invalid_transaction)
getsantry[bot] commented 9 months ago

Routing to @getsentry/product-owners-settings-relay for triage ⏲️

k-fish commented 9 months ago

@williamdes hey, so it looks like Relay is dropping those transactions due to invalid data, that could be happening for a few different reasons when we try to normalize the event, I'm not sure why source of the invalid message isn't being logged, but I've pinged our Relay folks to take a look.

Dav1dde commented 9 months ago

I've looked into this, but I need an example envelope or additional logging to further analyse this.

If you're already running wireshark, can you capture a few envelopes that go wrong?

iker-barriocanal commented 9 months ago

Hi @williamdes, did you have a chance to capture a payload so that we can debug it? Thanks!

williamdes commented 9 months ago

Hi @williamdes, did you have a chance to capture a payload so that we can debug it? Thanks!

Hi,

Sorry for the delay, here is one capture:

I had to truncate the part where there is too much sensitive data, can you provide me an email or GPG key to send the data to ?

POST /api/4/envelope/ HTTP/1.1
Host: sentry.hosting.tld
X-Forwarded-For: xxxxxxxxx,xxxxxxxxx, 172.22.0.1
X-Forwarded-Proto: https
X-Request-Id: xxxxxxxxx
Content-Length: 4630
cdn-loop: cloudflare
cf-ipcountry: FR
accept-encoding: gzip
cf-ray: 830c768ec8e1d646-CDG
cf-visitor: {"scheme":"https"}
user-agent: sentry.php.laravel/3.8.2
x-amz-cf-id: xxxxxxxxxxxxx
via: 1.1 6bbb2da0f4f203dadcd8f0ae1073d674.cloudfront.net (CloudFront)
accept: */*
content-type: application/x-sentry-envelope
x-sentry-auth: Sentry sentry_version=7, sentry_client=sentry.php.laravel/3.8.2, sentry_key=cccccccccccccddddddeeee
content-encoding: gzip
cloudfront-viewer-http-version: 1.1
cloudfront-forwarded-proto: https
cloudfront-viewer-address: xxxxxxxxx
cloudfront-viewer-tls: TLSv1.3:TLS_AES_128_GCM_SHA256:fullHandshake
cloudfront-viewer-asn: 16509
cf-connecting-ip: xxxxxxxxx

{"event_id":"669b7e12db664886aba9e936c1fac28b","sent_at":"2023-12-05T12:58:52Z","dsn":"https:\/\/cccccccccccccddddddeeee@reporting.domain.tld\/4","sdk":{"name":"sentry.php.laravel","version":"3.8.2"},"trace":{"trace_id":"a6a42cd58c974041859c0ae8148a4cd8","sample_rate":"0.4","transaction":"\/xx\/xx\/xx\/xx-infos","public_key":"cccccccccccccddddddeeee","release":"api_production_2023-11-30-1","environment":"production","sampled":"true"}}
{"type":"transaction","content_type":"application\/json"}
{"event_id":"669b7e12db664886aba9e936c1fac28b","timestamp":1701781132.540913,"platform":"php","sdk":{"name":"sentry.php.laravel","version":"3.8.2"},"start_timestamp":1701781132.447612,"logger":"php",[...]
"data":{"db.name":"xxxx","db.system":"mysql","server.address":["xxxx.cluster-xxxx.eu-west-3.rds.amazonaws.com"],"server.port":"3306","db.sql.origin":"\/app\/Support\/Database\/Builder.php:46"}}],"transaction_info":{"source":"route"}}
iker-barriocanal commented 9 months ago

@williamdes you can send the full payload to my email in [first name].[last name][at]sentry.io, thanks beforehand!

williamdes commented 9 months ago

@williamdes you can send the full payload to my email in [first name].[last name][at]sentry.io, thanks beforehand!

Sent to your email

williamdes commented 9 months ago

It looks like I fixed this with a network firewall change. Does it require some external ressources to work ? Maybe I had my routing wrong. But for me this is weird because it should work regardless of the routing to the Internet. Since we know the request has arrived with success :thinking:

jjbayer commented 9 months ago

@williamdes the only external factor I could imagine to cause invalid_transaction is when timestamps in your network (NTP?) did not match timestamps in the SDK.

williamdes commented 9 months ago

@williamdes the only external factor I could imagine to cause invalid_transaction is when timestamps in your network (NTP?) did not match timestamps in the SDK.

Indeed! That was it, the network out issue was preventing NTP sync. And so Sentry started to deny requests as the data got out of time sync. Maybe it is worth adding some check or alert into Sentry to warn about out of time data? That could be a wrong date on the sender or on the receiver.

jjbayer commented 9 months ago

Maybe it is worth adding some check or alert into Sentry to warn about out of time data?

@williamdes for error events, we try to correct large clock differences automatically (delta between sent_at and received_at), so in Issues you should see a warning like "There was 1 problem processing this event".

We should probably do the same for transaction events, I raised an issue here: https://github.com/getsentry/relay/issues/2851

williamdes commented 9 months ago

yeah, warning it better than drop. I had something like 5 minutes of clock drift.

jjbayer commented 9 months ago

Seems resolved, @williamdes feel free to reopen or comment if you have more questions.

williamdes commented 9 months ago

Okay, will follow https://github.com/getsentry/relay/issues/2851

williamdes commented 9 months ago

@iker-barriocanal can you delete my email data please?

iker-barriocanal commented 9 months ago

Done! Deleted the email from the inbox and from the trash. I didn't end up downloading the payload so there's nothing on my machine either. Thank you for providing it in the first place anyway!