Open dmacvicar opened 6 years ago
What about https://metrics.opensuse.org/ ? :-)
That would be perfect. We would still need a prometheus instance to gather the metrics. We can use metrics.opensuse.org to display them.
We would still need a prometheus instance to gather the metrics.
If we only want Rails middleware stats, there is also influxdb-rails.
For the other data we can send things out to rabbit.opensuse.org, consume them with telegraf, and write to influxdb (which makes it possible for others to use this data from scripts or whatnot). Or we can send things out with influxdb-ruby.
Short status update: I did some experiments with the Prometheus Exporter. I was able to export basic ruby metrics and visualize them with grafana. I still want to explore the influx options suggested by @hennevogel .
@hennevogel Do we have an openSUSE instance of InfluxDB already? Or do you mean running one on the same machine? (In that case it would not make a difference to use Prometheus.)
@dmacvicar rabbitmq runs on rabbit.o.o. and metrics.o.o runs telegraf, influx and grafana already.
@hustodemon we could ask @jberry-suse if we can use the InfluxDB in metrics.opensuse.org, or whether we can run prometheus there. https://bitworking.org/news/2017/03/prometheus
I'm sure you can. We (OBS team) will also start to use it soon :-)
I would really prefer to go the Prometheus way (pull), and also because of the internal knowledge we have inside of SUSE (used for SUSE Manager, Storage, Containers)
Prometheus does not bother me. As far as pull, that's how influxdb is getting the data right now.
Presumably we are not talking about major resource usage, as any increases will need to be requested. The plan is to manage the machine via salt, but nothing ever came of previous meetings to achieve that. If folks have interest in converting the configuration, that would be great. Otherwise, if you provide the necessary config I can install it on the machine or potentially grant someone access, but that can get messy with too many chefs in the kitchen.
The tooling for pulling data and the grafana dashboard and data source definitions are provided via a package on OBS, which would be ideal for software-o-o to do as well. That way all that is outside of proper versioning (somewhere) is the firewall config, grafana/influxdb config, and list of packages installed on the machine.
For an example of the package layout that uses grafana provisioning see https://github.com/openSUSE/openSUSE-release-tools/blob/820d1030e54f5c9bbfe9aeb69ca5b3b44a838aaa/dist/package/openSUSE-release-tools.spec#L445-L461.
Hi @jberry-suse, I wrote a simple salt state that installs prometheus on a machine and makes sure it's running. I don't have much experience with packaging, but IIUC you'd prefer creating some kind of pseudopackage which contains some prometheus config and which makes sure prometheus is installed (via Requires). Is that right?
Salt is fine. The packaging is for configs/scripts coming out of this repo, but if it is something very minor, like pointing at your endpoint, that could also be done via salt.
status update: I pinged the openSUSE Heroes about creating a new machine for us, let's see how this turns out.
Different from metrics.o.o?
Based on emails, I thought we were going to add it to the openSUSE salt master and thus have it installed on metrics.o.o.
If you need ssh access to debug and get things working I can provide if you let me know what pubkey to use.
I see. We don't really care where Prometheus is going to be installed; metrics.o.o would also be fine. I'll update my ticket, then.
Right now there are two use-cases where we need some kind of monitoring and metric tracking:
Therefore, I suggest we look into enabling the application to be scraped by Prometheus, which is a popular solution nowadays and is easily integrated with Grafana or other dashboards.
This means enabling a /metrics endpoint in the application. Initially we could use one of our internal prometheus installations. https://prometheus.io/docs/prometheus/latest/getting_started/
For rails apps, enabling it could be as simple as using the Rack middleware:
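As a rough illustration of what such a middleware does, here is a dependency-free sketch. The prometheus-client gem ships `Prometheus::Middleware::Collector` and `Prometheus::Middleware::Exporter` for this purpose; the code below is not that gem's implementation. The class name `MetricsMiddleware` and the metric name `http_requests_total` are assumptions made up for this example.

```ruby
# Hypothetical stand-in for a metrics middleware (illustration only).
# It counts responses per method/status and serves them on /metrics.
class MetricsMiddleware
  def initialize(app, path: '/metrics')
    @app = app
    @path = path
    @counts = Hash.new(0) # [method, status] => number of responses
    @lock = Mutex.new     # protects @counts when the server is threaded
  end

  def call(env)
    return render_metrics if env['PATH_INFO'] == @path

    status, headers, body = @app.call(env)
    @lock.synchronize { @counts[[env['REQUEST_METHOD'], status.to_s]] += 1 }
    [status, headers, body]
  end

  private

  # Render the counters in the Prometheus text exposition format.
  def render_metrics
    lines = ['# TYPE http_requests_total counter']
    @lock.synchronize do
      @counts.each do |(method, status), count|
        lines << %(http_requests_total{method="#{method}",code="#{status}"} #{count})
      end
    end
    [200, { 'content-type' => 'text/plain; version=0.0.4' }, [lines.join("\n") + "\n"]]
  end
end

# Any Rack-style app works: an object that responds to call(env).
hello = ->(_env) { [200, { 'content-type' => 'text/plain' }, ['hello']] }
app = MetricsMiddleware.new(hello)

2.times { app.call('REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/') }
_status, _headers, body = app.call('REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/metrics')
puts body.first
```

In a real Rails app one would more likely add the gem's two middlewares to `config.ru` with `use` rather than maintain something like this by hand.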
However, there are some showstoppers when using puma or other multi-process servers that need to be investigated: not all client implementations store the metrics correctly in these situations, and there may be alternative solutions for these cases.
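To make the multi-process concern concrete, here is a small sketch of the underlying problem. It does not show how any particular client library behaves; `simulate_workers` is a hypothetical helper, and the local `count` merely stands in for an in-memory metric store. It relies on `fork`, so it assumes a Unix-like system.

```ruby
# Illustration only: every forked worker gets its own copy of in-memory
# state, so increments made in one worker are invisible to the others
# and to the parent process.
def simulate_workers(workers: 2, requests_each: 5)
  count = 0 # stands in for an in-memory metric store

  children = Array.new(workers) do
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      requests_each.times { count += 1 } # changes only this worker's copy
      writer.puts count                  # report the worker's private total
      writer.close
    end
    writer.close
    [pid, reader]
  end

  per_worker = children.map do |pid, reader|
    total = reader.read.to_i
    reader.close
    Process.wait(pid)
    total
  end

  { parent: count, per_worker: per_worker }
end

result = simulate_workers
puts "parent sees:       #{result[:parent]}"
puts "per-worker totals: #{result[:per_worker].inspect}"
puts "true total:        #{result[:per_worker].sum}"
```

A scrape of /metrics is answered by whichever worker happens to receive the request, so without some shared store or aggregation step it would report only that worker's share of the traffic.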