anclrii / Storj-Exporter

Prometheus exporter for monitoring Storj storage nodes
GNU General Public License v3.0

Suspended node - ValueError for storj_sat_summary value #31

Closed fmoledina closed 4 years ago

fmoledina commented 4 years ago

One of my nodes has been suspended on all satellites (I'm hoping that I've fixed this issue). While it's in this state, I've noticed an error in my storj-exporter logs that prevents Prometheus from picking up the metrics. When a node is functioning normally, storj_sat_summary{type="suspended"} = 0. However, when the node is suspended (and maybe disqualified?), the value is the timestamp at which the suspension occurred, which results in a Python ValueError: could not convert string to float. The full log entry is below:

storj7-exporter    | 2020-06-11T15:26:15.093765748Z ValueError: ("could not convert string to float: '2020-06-10T09:42:09.230317Z'", Metric(storj_sat_summary, Storj satellite summary metrics, gauge, , [Sample(name='storj_sat_summary', labels={'type': 'storageSummary', 'satellite': '118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW', 'url': 'satellite.stefan-benten.de:7777'}, value=10733808116681.848, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'bandwidthSummary', 'satellite': '118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW', 'url': 'satellite.stefan-benten.de:7777'}, value=9055527680, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'disqualified', 'satellite': '118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW', 'url': 'satellite.stefan-benten.de:7777'}, value=0, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'suspended', 'satellite': '118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW', 'url': 'satellite.stefan-benten.de:7777'}, value='2020-06-10T09:42:09.230317Z', timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'storageSummary', 'satellite': '1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE', 'url': 'saltlake.tardigrade.io:7777'}, value=310756362212332.56, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'bandwidthSummary', 'satellite': '1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE', 'url': 'saltlake.tardigrade.io:7777'}, value=45506304000, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'disqualified', 'satellite': '1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE', 'url': 'saltlake.tardigrade.io:7777'}, value=0, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'suspended', 'satellite': '1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE', 'url': 'saltlake.tardigrade.io:7777'}, value='2020-06-10T11:04:36.588912Z', 
timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'storageSummary', 'satellite': '121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6', 'url': 'asia-east-1.tardigrade.io:7777'}, value=13511116560573.166, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'bandwidthSummary', 'satellite': '121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6', 'url': 'asia-east-1.tardigrade.io:7777'}, value=36938606080, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'disqualified', 'satellite': '121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6', 'url': 'asia-east-1.tardigrade.io:7777'}, value=0, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'suspended', 'satellite': '121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6', 'url': 'asia-east-1.tardigrade.io:7777'}, value='2020-06-10T13:56:08.978757Z', timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'storageSummary', 'satellite': '12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S', 'url': 'us-central-1.tardigrade.io:7777'}, value=30709809210618.84, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'bandwidthSummary', 'satellite': '12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S', 'url': 'us-central-1.tardigrade.io:7777'}, value=41315643904, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'disqualified', 'satellite': '12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S', 'url': 'us-central-1.tardigrade.io:7777'}, value=0, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'suspended', 'satellite': '12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S', 'url': 'us-central-1.tardigrade.io:7777'}, value='2020-06-10T12:03:00.89431Z', timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'storageSummary', 'satellite': 
'12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs', 'url': 'europe-west-1.tardigrade.io:7777'}, value=21041736952952.41, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'bandwidthSummary', 'satellite': '12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs', 'url': 'europe-west-1.tardigrade.io:7777'}, value=170009743104, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'disqualified', 'satellite': '12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs', 'url': 'europe-west-1.tardigrade.io:7777'}, value=0, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'suspended', 'satellite': '12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs', 'url': 'europe-west-1.tardigrade.io:7777'}, value='2020-06-10T13:26:31.648521Z', timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'storageSummary', 'satellite': '12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB', 'url': 'europe-north-1.tardigrade.io:7777'}, value=295882554217466.2, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'bandwidthSummary', 'satellite': '12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB', 'url': 'europe-north-1.tardigrade.io:7777'}, value=751341499904, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'disqualified', 'satellite': '12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB', 'url': 'europe-north-1.tardigrade.io:7777'}, value=0, timestamp=None, exemplar=None), Sample(name='storj_sat_summary', labels={'type': 'suspended', 'satellite': '12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB', 'url': 'europe-north-1.tardigrade.io:7777'}, value='2020-06-10T10:47:11.6619Z', timestamp=None, exemplar=None)]))
anclrii commented 4 years ago

Cool, thanks for reporting this. I was wondering what the value would be when it's not 0 and was hoping for 1 :). Prometheus can't take a date as a sample value, so I'll force the value to 1 if the API returns anything other than 0.

I'll update the exporter soon to account for this.
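
For anyone hitting this before the update lands, here is a minimal sketch of the "force to 1" idea described above. It is not the exporter's actual code: the helper name `flag_to_gauge_value` is hypothetical, and it assumes the field arrives as either 0 / None (node in good standing) or a timestamp string (node suspended or disqualified).

```python
def flag_to_gauge_value(raw):
    """Collapse a suspended/disqualified field into a numeric gauge value.

    Hypothetical helper, not part of Storj-Exporter. Assumes the field is
    0 or None when unset and a timestamp string otherwise.
    """
    if raw is None or raw == 0:
        return 0.0
    # Anything else (e.g. an ISO 8601 timestamp string) means the flag is set.
    return 1.0


# Timestamp taken from the log entry in this issue.
suspended_at = "2020-06-10T09:42:09.230317Z"

# Passing the raw string through float() is what raises the ValueError.
try:
    float(suspended_at)
except ValueError as err:
    print("raw value rejected:", err)

print(flag_to_gauge_value(suspended_at))  # 1.0
print(flag_to_gauge_value(0))             # 0.0
print(flag_to_gauge_value(None))          # 0.0
```

This throws away the suspension time itself. If that time is worth keeping, one alternative would be to export it as a Unix epoch number instead of a 0/1 flag, but that changes what the metric means.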