bensheldon / good_job

Multithreaded, Postgres-based, Active Job backend for Ruby on Rails.
https://goodjob-demo.herokuapp.com/
MIT License
2.62k stars, 194 forks

Improve Dashboard Charts (feedback welcome) #438

Open bensheldon opened 2 years ago

bensheldon commented 2 years ago

I would love feedback on these charts:

Other charts?

I'd like to make these charts load asynchronously (SJR) so that the dashboard doesn't load too slowly.

aried3r commented 2 years ago

> * Have a chart that is simply number of jobs performed successfully, and errored

That would be awesome! When we faced configuration problems, the dashboard showed a chart of enqueued jobs but not of executed ones, so at first everything seemed to be in order. Of course we wouldn't rely on just the dashboard for this (see also #403), but in this particular case it would have helped.
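To make the succeeded/errored chart concrete, here is a minimal sketch of the bucketing it would need. The `JobRecord` struct and its `finished_at` and `error` fields are assumptions for illustration only, not GoodJob's actual schema.

```ruby
# Hypothetical, simplified job record; GoodJob's real schema may differ.
JobRecord = Struct.new(:finished_at, :error)

# Count succeeded and errored jobs per hourly (UTC) bucket.
# Jobs without a finished_at are still queued or running and are skipped.
def outcome_counts_by_hour(jobs)
  counts = Hash.new { |hash, hour| hash[hour] = { succeeded: 0, errored: 0 } }
  jobs.each do |job|
    next if job.finished_at.nil?

    t = job.finished_at.getutc # getutc returns a copy; utc would mutate
    hour = Time.utc(t.year, t.month, t.day, t.hour)
    outcome = job.error.nil? ? :succeeded : :errored
    counts[hour][outcome] += 1
  end
  counts.sort_by { |hour, _| hour }.to_h
end
```

Plotting both series next to the enqueued count would surface the situation described above, where jobs are enqueued but never executed.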

> * Have some kind of min/max/avg (wish postgres made medians easier) [banded-line chart](https://github.com/gionkunz/chartist-js/issues/283#issuecomment-96365199) of job duration

We render some very basic charts internally and have had success with chartkick, which makes lazy loading easy by accepting an endpoint rather than data (chartkick.js also works without the Ruby helpers), together with active_median. Since Chartist seemed unmaintained, we use either Chartkick or Chart.js directly. I'm not saying the charting library should be switched, but it's something to keep in mind. Chart.js does support area charts.

As for medians, if you look at the Postgres implementation in active_median, it uses the 50th percentile, which I believe is the median and is something PostgreSQL offers natively.
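For reference, PostgreSQL exposes this as the ordered-set aggregate `percentile_cont(0.5) WITHIN GROUP (ORDER BY ...)`. The plain-Ruby sketch below mirrors its linear-interpolation semantics so the calculation is easy to follow; the function name and signature are chosen for illustration and are not part of active_median or GoodJob.

```ruby
# Continuous percentile with linear interpolation between neighbours,
# following the same definition as PostgreSQL's percentile_cont.
# Returns nil for an empty list; nil values are ignored.
def percentile_cont(values, fraction)
  unless (0.0..1.0).cover?(fraction)
    raise ArgumentError, "fraction must be within 0..1"
  end

  sorted = values.compact.sort
  return nil if sorted.empty?

  position = fraction * (sorted.length - 1) # zero-based fractional index
  lower = position.floor
  upper = position.ceil
  sorted[lower] + (sorted[upper] - sorted[lower]) * (position - lower)
end

median = percentile_cont([1, 2, 3, 4], 0.5) # halfway between 2 and 3
```

The same function with `0.95` gives a P95, which is why a percentile-based approach covers more than just medians.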

bensheldon commented 2 years ago

@aried3r thanks for sharing that story about an incident. It's helpful to know these things have consequences.

I'll check out Chart.js as an alternative to Chartist. I think I can do the minimum of what is needed without chartkick, to keep dependencies as few as possible.

I think the feature I want is "Line Datasets", for building a time-series box-and-whisker-like plot: https://www.chartjs.org/docs/3.2.0/samples/area/line-datasets.html
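As a rough sketch of how such a banded chart could be fed from the Ruby side, the helper below builds a Chart.js config hash in which the "Max" line fills down to the "Min" line via the relative fill target `"-1"`. The helper name and the shape of `buckets` are assumptions for illustration; colors and axis options are left out.

```ruby
require "json"

# Build a Chart.js line-chart config where the area between the min and
# max series is shaded and the average is drawn as a plain line.
# `buckets` is a hypothetical array of hashes: { label:, min:, avg:, max: }.
def banded_duration_chart(buckets)
  {
    type: "line",
    data: {
      labels: buckets.map { |b| b[:label] },
      datasets: [
        { label: "Min", data: buckets.map { |b| b[:min] }, fill: false },
        # "-1" is a relative dataset index: fill to the previous dataset (Min).
        { label: "Max", data: buckets.map { |b| b[:max] }, fill: "-1" },
        { label: "Avg", data: buckets.map { |b| b[:avg] }, fill: false },
      ],
    },
  }
end

# Serialize for embedding in a page or returning from an endpoint.
chart_json = JSON.generate(
  banded_duration_chart([{ label: "10:00", min: 0.2, avg: 1.1, max: 4.8 }])
)
```

Returning this JSON from a separate endpoint would also fit the goal of loading charts asynchronously.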

And thanks for pointing to active_median. That doesn't look too bad, though probably all this stuff (all the charts, not just medians) will be row-scanning the good_jobs table and won't be the most performant.

bensheldon commented 1 year ago

Continuing to think and work on this. For context, Heroku has these time bucket options:

(Screenshot, 2022-10-29: Heroku's time bucket options)

I really would like to have a cumulative flow diagram.
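A cumulative flow diagram stacks, for each point in time, how many jobs are in each state. A minimal sketch of the underlying calculation follows; the `FlowJob` struct and its three timestamp fields are assumptions for illustration, and timestamps may be `Time` objects or any comparable values.

```ruby
# Hypothetical, simplified job record; nil means "has not happened yet".
FlowJob = Struct.new(:enqueued_at, :started_at, :finished_at)

# For each bucket edge, count jobs that have reached each stage by then
# and derive the stacked bands of a cumulative flow diagram.
def cumulative_flow(jobs, bucket_edges)
  bucket_edges.map do |edge|
    enqueued = jobs.count { |j| j.enqueued_at && j.enqueued_at <= edge }
    started  = jobs.count { |j| j.started_at && j.started_at <= edge }
    finished = jobs.count { |j| j.finished_at && j.finished_at <= edge }
    {
      at: edge,
      queued: enqueued - started,  # waiting to be picked up
      running: started - finished, # in progress
      finished: finished,          # done (succeeded or errored)
    }
  end
end
```

This version rescans every job for every bucket, which is fine for a sketch but would likely be pushed into SQL for a real dashboard.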

Some thoughts on what to show:

I dunno quite how to offer both a (Min, Avg, P95, Max) breakdown and Queue-based (or Job-based) breakdowns.
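One possible way to support both views is to compute the statistics per queue once and let the UI pick either one queue (showing all four statistics) or one statistic (showing all queues). The sketch below assumes hypothetical rows of the form `{ queue:, duration: }` and uses the nearest-rank definition for P95; neither is taken from GoodJob itself.

```ruby
# Group durations by queue and summarize each group.
# `rows` is a hypothetical array of hashes: { queue:, duration: }.
def duration_stats_by_queue(rows)
  rows.group_by { |row| row[:queue] }.transform_values do |group|
    durations = group.map { |row| row[:duration] }.sort
    # Nearest-rank P95: rank = ceil(0.95 * n), done in integer arithmetic
    # to avoid floating-point rounding surprises.
    p95_index = (95 * durations.length + 99) / 100 - 1
    {
      min: durations.first,
      avg: durations.sum.to_f / durations.length,
      p95: durations[p95_index],
      max: durations.last,
    }
  end
end

# Pivot the same result into "one statistic across all queues".
def series_for_statistic(stats_by_queue, statistic)
  stats_by_queue.transform_values { |stats| stats[statistic] }
end
```

The same grouping key could be swapped for the job class to get the Job-based breakdown.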

sandstrom commented 1 year ago

Thanks for GoodJob, it's an awesome tool!

In the spirit of keeping the scope of this project down though, I wouldn't build a very advanced dashboard for it.

Instead, just expose the relevant metrics (queue length, etc) and let other tools provide the dashboard. These other tools could be Grafana/Prometheus, Cronitor.io (we're about to integrate them with GoodJob right now), BetterUptime, etc.

Better if this project stays focused on the ruby/jobs side, ideally with only a simple HTML-based dashboard (or only a JSON API with stats). Projects that spread themselves too thin tend to run into problems when the 1-2 maintainers enter a period in their life [family] with less time to allocate to an open-source project.

On that topic, try to get more project team members onboard. Perhaps some of the existing contributors would be willing to help out (https://github.com/bensheldon/good_job/graphs/contributors).

jonahgeorge commented 1 year ago

Some additional inspiration: https://twitter.com/sorentwo/status/1666492674684116994?cxt=HHwWhICz9cKQyaAuAAAA