anclrii / Storj-Exporter

Prometheus exporter for monitoring Storj storage nodes
GNU General Public License v3.0

2.0.0 - No Data #64

Closed TheChrisTech closed 2 years ago

TheChrisTech commented 2 years ago

My Docker container automatically updated to 2.0.0 last night, and since then no data has been sent to Prometheus. Downgrading to 1.0.13 resolves the issue. @anclrii - Please let me know if there's information I can gather to help you troubleshoot this. I have 2 nodes that are still running 2.0.0 in case that helps.

anclrii commented 2 years ago

Weird, I'm not seeing this with my nodes. If you can share the output of docker logs storj-exporter, that might help. It may be a platform issue, since I only test on amd64. What hardware are you running on?

ghost commented 2 years ago

Yup, same here. All 84 nodes have stopped reporting to Prometheus: 81 on the amd64 image and 3 on ARMv7. Previous versions worked well, but since 2.x the whole export is crippled. Dashboard in use => https://grafana.com/grafana/dashboards/13896

Please roll back the 2.x release and publish it as a beta first.

docker logs storj-exporter => No logging available...

Output of curl from the Storj-Exporter 2.x -crippled- release: https://pastebin.com/6LssrtPf

Output of curl from the previous -working- Storj-Exporter release: https://pastebin.com/VUWGxr6f

anclrii commented 2 years ago

Can you check whether this one works? https://github.com/anclrii/Storj-Exporter-dashboard/blob/master/Storj-Exporter-Boom-Table.json

There were some deprecated metrics removed in favour of labels in v2 and https://grafana.com/grafana/dashboards/13896 is probably still using the old ones. We can ask @kevinkk525 to update Storj-Exporter-Boom-Table-combined.

kevinkk525 commented 2 years ago

Sadly, I can't update it due to illness. Feel free to change it.

anclrii commented 2 years ago

@kevinkk525 hope you get better soon!

I'll look to update the combined dashboard when I get some time. Created https://github.com/anclrii/Storj-Exporter-dashboard/issues/22.

ghost commented 2 years ago

Can you check whether this one works? https://github.com/anclrii/Storj-Exporter-dashboard/blob/master/Storj-Exporter-Boom-Table.json

There were some deprecated metrics removed in favour of labels in v2 and https://grafana.com/grafana/dashboards/13896 is probably still using the old ones. We can ask @kevinkk525 to update Storj-Exporter-Boom-Table-combined.

Nope, that one is completely broken... => Templating [node] Error updating options: e.replace is not a function

anclrii commented 2 years ago

@blaatblaat What version of Grafana are you using? It works fine on v8.5.5.

TheChrisTech commented 2 years ago

I'm running Grafana v8.5.5. Looking through Prometheus, it seems that a bunch of metrics my dashboards relied on were deprecated in 2.0, including lastPinged_info and the conversion of storj_nodeID_info to storj_node_info, along with some other enhancements. I had alerts set up on some of these metrics, and they paged out as soon as the Docker image updated.

I understand development can make product changes like this, but it would be nice to know what exactly changed (or have some sort of migration documentation for variable to variable).

For now, I've gotten my dashboard up and working (and confirmed that the .json file provided in 2.x works too), but I'm now searching for a replacement for the removed lastPinged_info. Essentially, I chart how many nodes have checked in, and if the number stays below [My Node Count] for 5 minutes, I get a text message. Any ideas?

anclrii commented 2 years ago

@TheChrisTech Yeah, I'm not sure how to communicate deprecating metrics better. I kept the deprecated metrics around for a while, until the next major release.

The change is that metrics like storj_[type]_info are now under a single storj_node_info with type being a label.
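For anyone migrating dashboards or alerts by hand, the renaming rule above could be sketched roughly as follows. This is only an illustration: the helper name to_v2_selector is hypothetical, and it assumes old metric names follow the storj_[type]_info pattern with an alphanumeric type; only storj_nodeID_info is confirmed in this thread.

```python
import re

# Assumption: v1 info metrics look like storj_<type>_info, where <type> is
# alphanumeric (e.g. storj_nodeID_info). Other names are left untouched.
_OLD_INFO_METRIC = re.compile(r"^storj_(?P<type>[A-Za-z0-9]+)_info$")


def to_v2_selector(old_metric: str) -> str:
    """Translate a v1-style info metric name into a v2 label selector.

    Hypothetical helper for illustration; it is not part of Storj-Exporter.
    """
    match = _OLD_INFO_METRIC.match(old_metric)
    # storj_node_info is already the v2 name, so leave it alone.
    if match is None or match.group("type") == "node":
        return old_metric
    return 'storj_node_info{type="%s"}' % match.group("type")


if __name__ == "__main__":
    print(to_v2_selector("storj_nodeID_info"))
```

Names that do not match the assumed pattern are returned unchanged, so the helper can be run over every expression name in a dashboard without touching unrelated metrics.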

Also, the removal of the lastPinged metric was specifically requested in https://github.com/anclrii/Storj-Exporter/issues/56.

how many nodes have checked in

I'm not sure what this means, but it sounds like you could just use something like count(storj_node_info{type="nodeID"}) to get the count of live nodes. Did you rely on the values in lastPinged? If you need the values, I can bring it back.
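If it helps to check the same number from a script, here is a minimal sketch that runs the query above through the Prometheus HTTP API instant-query endpoint (/api/v1/query). The function names and the http://localhost:9090 address are assumptions for illustration, not something taken from this setup.

```python
import json
import urllib.parse
import urllib.request

# The query suggested above: one series per live node.
QUERY = 'count(storj_node_info{type="nodeID"})'


def build_query_url(prometheus_url: str, expr: str = QUERY) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": expr})
    return prometheus_url.rstrip("/") + "/api/v1/query?" + params


def parse_count(payload: dict) -> int:
    """Extract the node count from an instant-query JSON response."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % (payload,))
    result = payload["data"]["result"]
    # count() over zero matching series returns an empty vector, not 0.
    if not result:
        return 0
    # Each sample value is a [timestamp, "string value"] pair.
    return int(float(result[0]["value"][1]))


def live_node_count(prometheus_url: str) -> int:
    """Query Prometheus and return the number of live nodes."""
    url = build_query_url(prometheus_url)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_count(json.load(resp))


if __name__ == "__main__":
    # Assumed address; replace with your own Prometheus server.
    print(live_node_count("http://localhost:9090"))
```

Note that when no nodes report at all, the query returns an empty result rather than 0, so a threshold check needs to handle that case explicitly, as parse_count does here. The "for 5 minutes" part of the alert is left to whatever scheduler or alerting rule calls this.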

ghost commented 2 years ago

Sir!

We're back on track! It seems the Grafana update did indeed solve most of the issues. I'm reworking the dashboard right now, but the Storj-Exporter appears to be working again. Thanks for pointing out that I should update Grafana (Y)

anclrii commented 2 years ago

@blaatblaat I'm glad it works.

If you are updating the https://github.com/anclrii/Storj-Exporter-dashboard/blob/527f9946e625dad9ba0864e90b5a05a2f296145b/alternatives/dashboard_exporter_combined.json dashboard, it would be great if you could raise a PR.

TheChrisTech commented 2 years ago

@anclrii - Thanks for the query count(storj_node_info{type="nodeID"}). It worked beautifully.

I'll close this issue out since it seems we're all good here. Appreciate the assistance!