getsentry / sentry-php

The official PHP SDK for Sentry (sentry.io)
https://sentry.io
MIT License

Implement Auto Session Tracking #1290

Closed antonpirker closed 2 years ago

antonpirker commented 2 years ago

Sentry can monitor the health of releases by checking session data it receives from the SDK. Other SDKs already collect this session data automatically: the Python SDK does, and the Ruby SDK has just added it. We want this for PHP as well.

For reference see:

We want to implement request-mode sessions, which are aggregated in the SDK (as opposed to application-mode sessions, which are sent as soon as they finish). The main reason is the scale of most PHP servers out there, which would otherwise overload the Sentry ingestion pipelines.

In the Ruby and Python implementations, a session is one request-response cycle, and a SessionFlusher running in a separate thread collects the session data and sends it to the server in bulk once a minute.
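To make the aggregation idea concrete, here is a minimal sketch in Python (chosen only because the Python SDK is one of the reference implementations mentioned above). It is not the code of any Sentry SDK: the class name `SessionAggregator`, its methods, and the shape of the returned dictionaries are assumptions made up for illustration. It only shows the principle of counting finished request sessions per minute so that a flusher can send one small batch instead of one message per request.

```python
from collections import defaultdict
from datetime import datetime, timezone


class SessionAggregator:
    """Hypothetical helper: counts finished request sessions per minute."""

    def __init__(self):
        # minute bucket (ISO 8601 string) -> session status -> count
        self._buckets = defaultdict(lambda: defaultdict(int))

    def add_session(self, started, status):
        """Record one request-response cycle, e.g. status 'exited' or 'errored'."""
        # Truncate the start time to the minute so that all sessions
        # started in the same minute share one counter.
        bucket = started.replace(second=0, microsecond=0).isoformat()
        self._buckets[bucket][status] += 1

    def flush(self):
        """Return the pending per-minute counts and reset the buffer.

        In a threaded runtime, a background flusher would call this
        roughly once a minute and send the result in one request.
        """
        pending = [
            {"started": bucket, **dict(counts)}
            for bucket, counts in sorted(self._buckets.items())
        ]
        self._buckets.clear()
        return pending


aggregator = SessionAggregator()
first = datetime(2022, 3, 1, 12, 0, 5, tzinfo=timezone.utc)
aggregator.add_session(first, "exited")
aggregator.add_session(first.replace(second=40), "exited")
aggregator.add_session(first.replace(minute=1), "errored")
batch = aggregator.flush()  # two buckets: 12:00 and 12:01
```

The sketch keeps its counters in process memory, which is exactly what a typical PHP request lifecycle does not offer across requests; that is the gap the rest of this discussion is about.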

Basically what this feature should do when enabled:

Because PHP is single-threaded, this issue should be the start of a discussion on how this can be achieved, if it can be achieved at all.

You can also have a look at how this was done in Ruby: https://github.com/getsentry/sentry-ruby/pull/1715

stayallive commented 2 years ago

For reference: #1254 is related to the issue of having a storage/buffer that spans multiple requests and a flusher task in PHP. This is probably impractical without external (buffer) storage and a task runner.

Jean85 commented 2 years ago

Exactly. In PHP we do not have any native way to batch that information somewhere, because we cannot assume any external infrastructure or thread that we could leverage to do something like this out of the box.

Maybe something is feasible in the framework integrations, but that too would require a couple of assumptions or something that has to be manually enabled.

antonpirker commented 2 years ago

Do Laravel or Symfony have something that can be used out of the box? Having this only in one framework is also a possibility.

smeubank commented 2 years ago

I wonder if we are building up a number of use cases for a sidecar approach. What that might be is certainly open for discussion. But let's say a self-hosted Relay that the SDK sends such information to, and that handles the complexity of client reports, session tracking, and other future features for gathering performance-type data, aggregating it, and sending it to Sentry.

edit: not a real agent but a sidecar service

mfb commented 2 years ago

For apps that have a queue worker (i.e. a lot of apps though not all), the app could give the SDK a callback for adding items (json, or json plus any other metadata needed to send the request) to the queue, and then the queue worker needs an SDK method(s) to aggregate and send off the items. This logic could actually be used for all Sentry events so it's possible to send them separately from the request process (which you often want to keep free for handling requests), and aggregated together for efficiency. This is basically the app bringing its own agent, I guess.
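The split described here can be sketched roughly as follows, again in Python purely for illustration. Nothing in this sketch exists in the Sentry SDKs: `DeferredTransport`, `drain_and_send`, and their parameters are hypothetical names, and a plain in-memory `deque` stands in for whatever queue the app or framework actually uses.

```python
import json
from collections import deque


class DeferredTransport:
    """Hypothetical SDK-side piece: hands serialized items to the app's queue."""

    def __init__(self, enqueue):
        # `enqueue` is the callback supplied by the app; it pushes one
        # serialized item onto the app's own queue.
        self._enqueue = enqueue

    def capture(self, item):
        # Serialize in the request process, send nothing over the network.
        self._enqueue(json.dumps(item))


def drain_and_send(queue, send_batch, max_items=100):
    """Hypothetical worker-side piece: send up to `max_items` queued items as one batch.

    Returns the number of items that were handed to `send_batch`.
    """
    batch = []
    while queue and len(batch) < max_items:
        batch.append(json.loads(queue.popleft()))
    if batch:
        send_batch(batch)
    return len(batch)


# The request process only enqueues ...
app_queue = deque()
transport = DeferredTransport(app_queue.append)
transport.capture({"type": "session", "status": "exited"})
transport.capture({"type": "session", "status": "errored"})

# ... and the queue worker later sends everything in one go.
sent_batches = []
drain_and_send(app_queue, sent_batches.append)
```

In a real integration the callback and the worker entry point would have to be wired up by the framework plugin, and the worker would aggregate the items before sending them.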

antonpirker commented 2 years ago

Sounds like an idea for how to do this. Could the SDK discover the queue worker by itself and hook into it, so the SDK can use the queue worker without the user needing to set anything up by hand? And could you estimate how many PHP projects have this queue worker? Is it something you set up right when you do your first "hello world", or is it something you only have when you have millions of users and a team of >5 programmers working on a project?

mfb commented 2 years ago

I don't think there is any commonality in how frameworks set up queues, and the SDK is pretty abstract, so I think this would have to happen at the level of integration plugins/libraries that have more awareness of the particular app/framework they are running in.

mfb commented 2 years ago

But if the SDK made it possible, then that integration could wire it up.

Jean85 commented 2 years ago

And could you estimate how many PHP projects have this queue worker? Is it something you set up right when you do your first "hello world", or is it something you only have when you have millions of users and a team of >5 programmers working on a project?

I would say that it's something that happens a lot with bigger dev teams and apps. Having background workers is becoming more common thanks to libraries like Symfony Messenger, but it's still something that you put in your app long after the launch of your app, and it's set up manually.

Auto-discovery is probably partially doable in the Symfony integration, but you would still inject workload into the users' queues, and that could be troublesome; I would strongly object to having that in opt-out mode.

HazAT commented 2 years ago

I think that to make this applicable/accessible in most cases, we might need to go down the route that @smeubank mentioned briefly: an Agent (Relay). While we already support Relay acting as a kind of agent, there is still a lot of room for improvement to make the experience more seamless. For example, we could do something like Scout APM and download the agent in the background on first-time use and then run it on the side https://github.com/scoutapp/scout-apm-php/blob/eaf275883dd2640ea2ad9ed6e568314554e334f0/src/CoreAgent/Downloader.php#L100

I am not saying this is the way, I am just not sure if adding support for Sessions only for Laravel and only if you run background queues/workers makes sense.

ste93cry commented 2 years ago

For what it's worth, as a user I would never ever want something to be downloaded in the background on my behalf and run without me knowing about it. I would prefer to have a real agent (as a PHP extension or as an external dependency), even if it means that out of the box I have one more manual step to do to set it up.

Jean85 commented 2 years ago

I agree with @ste93cry; and in fact any other service that I tried that works with performance & monitoring (Blackfire.io, NewRelic, DataDog) goes with an extension (that eventually spawns a background process) or with a clear dedicated agent to be deployed.

mfb commented 2 years ago

Yeah, I think this would have to just be something the SDK could provide infrastructure for, not fully automatic functionality. I maintain the Sentry integration for Drupal, and if it was possible to add Sentry events to Drupal's queue subsystem, I'd definitely have to provide various opt-in configurations around that, e.g. to make sure someone who didn't understand what was going on didn't flood their mission-critical queue with unexpected stuff :)

A PHP extension definitely makes sense to make performance tracing instrumentation easier; if it existed, it could be leveraged for other functionality as well, such as this.

antonpirker commented 2 years ago

Thanks everyone for the input. Really amazing!

tl;dr: The agent is probably the best way to go.

I will close this issue now and we will start discussions about the agent approach in Sentry. When we have any news, you will be the first to know! Thanks again!

github-actions[bot] commented 2 years ago

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!

cleptric commented 2 years ago

Closing this for now, we might revisit this in the future.