Closed lmendes86 closed 2 months ago
@jirenius will this not be merged? Would be cool to have some basic metrics
Sorry for the radio silence. The resgate project has been on ice for a while, even though the gateway has been actively used in other projects led by me. I am now working on a new release with some further improvements, bug fixes, and updated dependencies. I wish to include this PR in the release, but due to other changes made, I will merge it into a side branch and handle the conflicts there.
Also, thanks for this PR inspiring other improvements!
Testing it, I see one issue with the number of dependencies that come with the prometheus package. The compiled file size increased by about 60% (6MB), and the larger dependency tree makes the server more vulnerable to abuse through its dependency chain.
One option would be to use a dependency-free package (e.g. github.com/bsm/openmetrics) to expose the desired metrics. The ones I would add would probably be:
- process_start_time_seconds
- go_memstats_* (using runtime.MemStats)
- go_info
  - version
- resgate_info - Gauge (set to 1)
  - version=<resgate version>,
  - protocol=
- resgate_ws_current_connections - Gauge
- resgate_ws_connections_total - Counter
- resgate_ws_subscriptions - Gauge
  - type=direct
  - type=indirect
- resgate_ws_requests_total - Counter
  - method=get
  - method=subscribe
  - method=call
  - method=auth
- resgate_cached_resources - Gauge
- resgate_http_requests_total - Counter
  - method=POST
  - method=GET
I think I'll skip:
- resgate_nats_connected - Since resgate stops if the connection is closed. So it will always be 1 when successfully scraping.
- resgate_subscriptions - (or rather, per resource subscription labels) Some solutions may have many thousand different subscriptions, causing the metrics response to be huge. Possibly have it as an opt-in thing through configuration.

It's nice to hear that this is being taken care of!
Those metrics look good! We are using resgate_subscriptions, but I understand that it could lead to many metrics if there are a lot of subscription topics. For us, it is quite insightful to have, so it could be useful to keep it as an opt-in if you think that is a possibility. Here, I leave an example of a Grafana visualization of our current implementation.
Thanks in advance for the work!
Ah, that is nice!
For the grouping of resource IDs, Resgate would need some sort of knowledge of patterns. In your branch, you've solved it by detecting {id} and {uuid} parts. But I will try to see if I can come up with a more generic way to solve it. One way would be to provide resgate with resource patterns to track metrics through configuration:
{
  "metrics": {
    "resourcePatterns": [
      "availability.client.*",
      "availability.client.*.user.*",
      "availability.client.*.user.*.device",
      "availability.client.*.user.*.device.*",
      "dashboard.client.*",
      "dashboard.queue.*",
      "usertoken"
    ]
  }
}
It would require you to manually update resgate's configuration with the resource patterns. So, it might work for some use cases.
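As a rough sketch of how such patterns could group resource IDs into a bounded set of metric labels: the code below assumes that resource IDs are dot-separated and that `*` matches exactly one token, which is an assumption made for this example rather than confirmed resgate behavior. The function names `matchPattern` and `groupFor` and the sample resource IDs are hypothetical, and query parts of resource IDs are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// matchPattern reports whether the resource ID rid matches pattern.
// Assumption: both are dot-separated, and a "*" token matches exactly one token.
func matchPattern(pattern, rid string) bool {
	pt := strings.Split(pattern, ".")
	rt := strings.Split(rid, ".")
	if len(pt) != len(rt) {
		return false
	}
	for i, p := range pt {
		if p != "*" && p != rt[i] {
			return false
		}
	}
	return true
}

// groupFor returns the first configured pattern that matches rid, so that the
// pattern (not the full resource ID) can serve as the metric label.
// The boolean is false when no pattern matches and the resource is untracked.
func groupFor(patterns []string, rid string) (string, bool) {
	for _, p := range patterns {
		if matchPattern(p, rid) {
			return p, true
		}
	}
	return "", false
}

func main() {
	patterns := []string{
		"availability.client.*",
		"availability.client.*.user.*",
		"usertoken",
	}
	// Hypothetical resource IDs, used only to exercise the matcher.
	rids := []string{
		"availability.client.42",
		"availability.client.42.user.7",
		"usertoken",
		"other.thing",
	}
	for _, rid := range rids {
		group, ok := groupFor(patterns, rid)
		fmt.Printf("rid=%s group=%q tracked=%v\n", rid, group, ok)
	}
}
```

Labeling by pattern keeps the number of time series equal to the number of configured patterns, regardless of how many distinct resources are subscribed.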
Anyway. While I failed to merge your PR into develop, because I chose to solve it differently and with a different package, it was still a great inspiration in many ways! Big thanks for it!
We recently started running Resgate on our platform; thank you for this repo, by the way. However, we still needed basic metrics from it, so we decided to add an optional Prometheus exporter to Resgate to gain insight into how it is operating. We also upgraded to the latest Golang version, but if required, we can remove those changes from this pull request. I hope this helps!