getsentry / sentry-elixir

The official Elixir SDK for Sentry (sentry.io)
https://sentry.io
MIT License

Rate-limited Oban cron monitor check-ins #756

Open rodolfoBee opened 1 month ago

rodolfoBee commented 1 month ago

Environment

SDK version 10.6.0.

Steps to Reproduce

SDK configuration:

config :sentry,
  dsn: System.get_env("SENTRY_DSN"),
  environment_name: environment,
  enable_source_code_context: true,
  root_source_code_paths: [File.cwd!()],
  integrations: [
    oban: [
      cron: [enabled: true]
    ]
  ]

Jobs are configured using crontab, for example:

{"1-59/5 * * * *", Cron.DeleteTeam},
{"2-59/5 * * * *", Cron.GenerateFees},

Creating the monitor manually, following the auto-configuration in the Oban integration code, was also attempted without success.

Expected Result

Check-ins are sent and accepted by Sentry according to each job's crontab schedule.

Actual Result

Check-ins are marked as "Monitor rate limit":

Image

All monitors are active in Sentry and there is available quota.

whatyouhide commented 1 month ago

I’m not sure what Sentry means by "Dropped (Server)". It doesn't give you any details on why the monitor was dropped?

rodolfoBee commented 1 month ago

The only reason given is "Monitor Rate Limit". Note that no check-in is accepted at all, so the usual rate limit (6 check-ins per monitor per minute) does not apply here. How exactly is the check-in envelope created by the Oban integration and sent by the SDK?

whatyouhide commented 1 month ago

We create the envelope with something along these lines:

[
  ~s({"event_id":"#{event_id}"}\n),
  ~s({"type": "check_in", "length": #{byte_size(encoded_check_in)}}\n),
  encoded_check_in,
  ?\n
]

You can see this code here. Without any server logs telling us what's wrong, this is pretty hard to debug. If you report check-ins manually (with Sentry.capture_check_in/1), does it work?
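For anyone who wants to inspect what actually goes over the wire, here is a minimal, self-contained sketch of the envelope layout described above: one JSON envelope header line, one JSON item header line carrying the payload's byte length, the payload itself, and a trailing newline. The module name `EnvelopeSketch`, the hand-written payload, and both IDs are hypothetical and only for illustration; the SDK builds and encodes the real check-in itself.

```elixir
defmodule EnvelopeSketch do
  @moduledoc false

  # Builds a check-in envelope as iodata, mirroring the snippet above:
  # envelope header, item header (with payload byte length), payload, newline.
  def build(event_id, encoded_check_in)
      when is_binary(event_id) and is_binary(encoded_check_in) do
    [
      ~s({"event_id":"#{event_id}"}\n),
      ~s({"type": "check_in", "length": #{byte_size(encoded_check_in)}}\n),
      encoded_check_in,
      ?\n
    ]
  end
end

# Hypothetical, hand-encoded check-in payload (the slug and IDs are made up).
payload =
  ~s({"check_in_id":"0123456789abcdef0123456789abcdef","monitor_slug":"cron-delete-team","status":"in_progress"})

envelope = EnvelopeSketch.build("fedcba9876543210fedcba9876543210", payload)

# Flatten the iodata and print it to compare against what the server receives.
envelope |> IO.iodata_to_binary() |> IO.write()
```

Printing the flattened envelope this way makes it easy to check that the `length` field matches the payload's byte size, which is one of the few things the SDK side controls here.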

rodolfoBee commented 1 month ago

@whatyouhide thank you for the info. @gaprl from the Crons team is also looking into the backend logs.

sl0thentr0py commented 1 month ago

I don't think this is an SDK problem but we can wait for more server investigation before closing.

rodolfoBee commented 2 days ago

@getsentry/product-owners-crons can we get an update on this issue?

sl0thentr0py commented 2 days ago

@rodolfoBee as I said, rate limits and the Dropped (Server) reports are purely server-side, so I don't think this has anything to do with the SDK. Did the backend team have an update?

EDIT: Ah sorry, I just saw that you pinged them and not us. :)

whatyouhide commented 2 days ago

It might be worth closing this particular issue so as to not confuse users into thinking it's an issue with the Elixir SDK?

rodolfoBee commented 1 day ago

Can it be transferred to the getsentry/sentry repo instead, so we can assign it to the Crons team?

whatyouhide commented 1 day ago

@rodolfoBee even better yes, but I don't seem to have permissions to do that. @sl0thentr0py?