[1.x] Allow thresholds of slow queries, jobs and requests to be customised by regex pattern

laravel / pulse

Laravel Pulse is a real-time application performance monitoring tool and dashboard for your Laravel application.

https://pulse.laravel.com

MIT License

1.43k stars 165 forks source link

[1.x] Allow thresholds of slow queries, jobs and requests to be customised by regex pattern #340

Closed matheus-carvalho closed 4 months ago

matheus-carvalho commented 6 months ago

I'm facing a problem with slow jobs and requests I've already know is slower than others, so it would be useful I could customise the thresholds individually, using regex patterns.

I think it would be useful to anyone who have resources known to be slow, like report jobs, huge-data endpoints or something else, and want to monitor them with a higher threshold, not just ignore them.

I guess this changes will not break any existing feature because we still with a 'default' threshold, which will be used until some individual customisation.

matheus-carvalho commented 6 months ago

Many checks are broken but I can't understand the reason, even with a fresh clone of 1.x branch it was breaking locally

jessarcher commented 6 months ago

Hi @matheus-carvalho,

Thanks for the PR!

I've fixed the failing test. The Livewire payload in the test was incomplete as it was previously inconsequential, but with these changes, the payload needs to be parsed to determine the path for the threshold.

I think this could be a useful feature, but at the moment, it seems like a breaking change for existing users. To move forward and consider this feature, it would need to support the existing config file format where the threshold is a single value rather than an associate array. I'm also not sure whether this new format should be the default in the config file or just something we document.

matheus-carvalho commented 6 months ago

Hi @jessarcher,

Thank you for fixing the tests!

About the problem with the config file, what do you think about creating a previous check? We could test if the value is a array to continue this way, in case of negative we could fallback to the current way, just using the single value.

For this change I would need to change the two methods threshold in both Concerns HasThreshold and Thresholds, and a little adjustment at usage.blade.

If you think it's viable I could submit the changes.

jessarcher commented 6 months ago

@matheus-carvalho Sounds reasonable.

Our other concern is introducing a concept that ties cards to individual recorders with the recorder method. Cards can leverage data from multiple recorders (e.g., the Usage card), so I think the threshold method might need to accept the recorder class or threshold config as an argument.

matheus-carvalho commented 6 months ago

Thank you for your attention @jessarcher.

Alright, this would really be a problem, I'll implement the changes to fix these points and submit the commit.

Please let me know if there are something else worrying you.

jessarcher commented 6 months ago

Thanks @matheus-carvalho!

I haven't run it locally yet to check out the UI additions, but the only other thing I'm wondering about is whether the "Top 10 users experiencing slow endpoints" view in the Usage card should have something to indicate when there are path-specific thresholds in play. I'm imagining a scenario where a user complains about something being "slow", but it doesn't show up on that card even though it's over the default threshold because it was under the path-specific threshold. The person viewing the dashboard may not be aware of the additional configuration.

It's also up to Taylor whether we want to take this feature on, but it looks useful to me!

matheus-carvalho commented 6 months ago

Hello @jessarcher,

I have pushed some improvements to take care of the pointed concerns.

threshold methods now accepts a recorder argument.
check if the threshold config is an array to maintain compatibility.
add a information icon in the actions area of usage card to tell about customised thresholds.

Here are some images from my local tests so you can take a look at UI additions.

Information icon Slow requests Slow outgoing requests Slow jobs Slow queries

matheus-carvalho commented 6 months ago

Hello,

excuse me but I just want to know if there are something else I could do for this PR.

timacdonald commented 6 months ago

We'll review this one shortly and see if we can move it along.

timacdonald commented 4 months ago

Some UI shots here.

I've decided to keep this pretty simple and only change the UI based on the threshold config type. Array = per item threshold reporting. Integer = Global type threshold.

We could have done fancy stuff checking if there was more than one threshold value for the all the listed items - but I didn't like the idea that the UI would change if an entry came into the table that had a different threshold while you were watching it.

Slow requests

'threshold' => 1000,

'threshold' => [
    'default' => 1000,
    '#pattern#' => 10,
],

Application usage card

'threshold' => 1000,

'threshold' => [
    'default' => 1000,
    '#pattern#' => 10,
],

Slow queries

'threshold' => 1000,

'threshold' => [
    'default' => 1000,
    '#pattern#' => 10,
],

Outgoing requests

'threshold' => 1000,

'threshold' => [
    'default' => 1000,
    '#pattern#' => 10,
],

Slow Jobs

'threshold' => 1000,

'threshold' => [
    'default' => 1000,
    '#pattern#' => 10,
],