TheChrisTech closed this issue 2 years ago
Weird, I'm not seeing this with my nodes. If you can share the output of docker logs storj-exporter, that might help.
It may be a platform issue, as I only test on amd64. What hardware are you running on?
Yup, same here. All 84 nodes are no longer logging to Prometheus: 81x amd64 image and 3x ARMv7. Previous versions worked fine, but since 2.x the whole export is crippled. Dashboard used => https://grafana.com/grafana/dashboards/13896
Please roll back the 2.x release and start this 2.x release in beta.
docker logs storj-exporter => No logging available...
Output of CURL from Storj-Exporter 2.x -crippled- release; https://pastebin.com/6LssrtPf
Output of CURL from Storj-Exporter previously -working- release; https://pastebin.com/VUWGxr6f
Can you try if this one works https://github.com/anclrii/Storj-Exporter-dashboard/blob/master/Storj-Exporter-Boom-Table.json?
There were some deprecated metrics removed in favour of labels in v2 and https://grafana.com/grafana/dashboards/13896 is probably still using the old ones. We can ask @kevinkk525 to update Storj-Exporter-Boom-Table-combined.
Sadly can't update due to illness. Feel free to change
@kevinkk525 hope you get better soon!
I'll look to update the combined dashboard when I get some time. Created https://github.com/anclrii/Storj-Exporter-dashboard/issues/22.
"Can you try if this one works https://github.com/anclrii/Storj-Exporter-dashboard/blob/master/Storj-Exporter-Boom-Table.json?"
Nope, that one is completely broken... => Templating [node] Error updating options: e.replace is not a function
@blaatblaat What version of Grafana are you using? It works fine on v8.5.5.
I'm running Grafana v8.5.5.
Looking through Prometheus, it seems that a bunch of metrics my dashboards relied upon were deprecated in 2.0, such as lastPinged_info, the conversion of storj_nodeID_info to storj_node_info, and some other enhancements. I had alerts set up on some of these metrics, and they paged out as soon as the Docker container updated.
I understand development can make product changes like this, but it would be nice to know what exactly changed (or have some sort of migration documentation mapping each old metric to its replacement).
For now, I've gotten my dashboard up and working (and confirmed that the .json file provided in 2.x does work too), but I'm now searching for a solution for the removal of lastPinged_info. Essentially, I chart out how many nodes have checked in, and if the number drops below [My Node Count] for 5 minutes, send me a text message. Any ideas?
@TheChrisTech yeah, I'm not sure how to communicate deprecating metrics better. I kept them around for a while, until the next major release.
The change is that metrics like storj_[type]_info are now under a single storj_node_info with type being a label.
Also, removal of the lastPinged metric was specifically requested in https://github.com/anclrii/Storj-Exporter/issues/56.
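For anyone rewriting old dashboard or alert queries, the renaming described above can be roughly sketched in Python. This is only an illustration, not part of Storj-Exporter: the helper name migrate_query is hypothetical, it assumes every old metric follows the storj_[type]_info pattern, and it does not handle label values that contain a closing brace.

```python
import re

# Matches storj_<type>_info plus an optional {label selector}.
# Assumption: <type> contains only letters and digits (e.g. "nodeID").
_OLD_INFO = re.compile(r"\bstorj_([A-Za-z0-9]+)_info\b(?:\{([^}]*)\})?")


def migrate_query(query: str) -> str:
    """Rewrite v1-style storj_<type>_info selectors to storj_node_info{type="<type>"}."""

    def _rewrite(match: re.Match) -> str:
        metric_type = match.group(1)
        if metric_type == "node":
            # Already the v2 metric name, leave it untouched.
            return match.group(0)
        labels = f'type="{metric_type}"'
        existing = (match.group(2) or "").strip()
        if existing:
            # Keep any label matchers the old query already had.
            labels = f"{labels},{existing}"
        return f"storj_node_info{{{labels}}}"

    return _OLD_INFO.sub(_rewrite, query)


if __name__ == "__main__":
    print(migrate_query("count(storj_nodeID_info)"))
```

Metrics that were removed outright, such as lastPinged, have no v2 equivalent, so a rewrite like this cannot restore them.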
"how many nodes have checked in"
I'm not sure what this means, but it sounds like you could just use something like count(storj_node_info{type="nodeID"}) to get a count of live nodes. Did you rely on the values in lastPinged? If you need the values I can bring it back.
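If the check has to run outside Grafana, the same query can be sent to the Prometheus HTTP API (/api/v1/query) and the result compared against the expected node count. The following is a minimal sketch under assumptions: the function names, the localhost URL and the sample response are illustrative only, and the "for 5 minutes" part of the alert is not covered here.

```python
import json
from urllib.parse import urlencode

# Query suggested in this thread for counting live nodes.
LIVE_NODES_QUERY = 'count(storj_node_info{type="nodeID"})'


def build_query_url(prometheus_base_url: str, query: str = LIVE_NODES_QUERY) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urlencode({"query": query})
    return prometheus_base_url.rstrip("/") + "/api/v1/query?" + params


def live_node_count(response_body: str) -> int:
    """Extract the node count from an instant-query JSON response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed: " + str(payload.get("error")))
    result = payload["data"]["result"]
    if not result:
        # count() over no matching series yields an empty vector, not 0.
        return 0
    _timestamp, value = result[0]["value"]
    return int(float(value))


def missing_nodes(response_body: str, expected_nodes: int) -> int:
    """Return how many of the expected nodes are currently not reporting."""
    return max(expected_nodes - live_node_count(response_body), 0)


if __name__ == "__main__":
    # Hypothetical response body, shaped like a Prometheus instant-query result.
    sample = (
        '{"status":"success","data":{"resultType":"vector",'
        '"result":[{"metric":{},"value":[1655000000,"83"]}]}}'
    )
    print(build_query_url("http://localhost:9090"))
    print(missing_nodes(sample, expected_nodes=84))
```

Fetching the URL (for example with urllib.request.urlopen) is left out so the sketch runs without a Prometheus server. Note the empty-result case: when no node reports at all, the query returns nothing rather than 0, so an alert needs to treat "no data" as a failure too.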
Sir!
We're back on track! Seems indeed that the Grafana update solved most issues. Reworking the dashboard right now, but it seems that the Storj-Exporter is working again. Thx for pointing out to update Grafana (Y)
@blaatblaat I'm glad it works.
If you are updating the https://github.com/anclrii/Storj-Exporter-dashboard/blob/527f9946e625dad9ba0864e90b5a05a2f296145b/alternatives/dashboard_exporter_combined.json dashboard, it would be great if you could raise a PR.
@anclrii - Thanks for the query count(storj_node_info{type="nodeID"})... worked beautifully.
I'll close this issue out since it seems we're all good here. Appreciate the assistance!
My docker automatically updated to 2.0.0 last night, and resulted in no data being sent to Prometheus. Downgrading to 1.0.13 resolves the issue. @anclrii - Please let me know if there's information I can gather for you to help troubleshoot this. I have 2 nodes that are still running 2.0.0 in case.