ptarmiganlabs / butler-sos

Butler SenseOps Stats ("Butler-SOS") is a microservice publishing operational Qlik Sense metrics to InfluxDB, Prometheus and New Relic. Add Grafana for great looking dashboards and you get real-time monitoring of what happens inside a Qlik Sense environment.
https://butler-sos.ptarmiganlabs.com
MIT License

Error: uint value for field 'bytes_added' out of range #933

Closed MarioShuuya closed 3 weeks ago

MarioShuuya commented 4 weeks ago

What version of Butler SOS are you using?

10.2.1

What version of Node.js are you using? Not applicable if you use the standalone version of Butler SOS.

v20.17.0

What command did you use to start Butler SOS?

node src/butler-sos.js --configfile /nodeapp/config/prod.yaml

What operating system are you using?

Debian GNU/Linux bullseye

What CPU architecture are you using?

x64

What Qlik Sense versions are you using?

Qlik Sense February 2024 Patch 4 - 14.173.8

Describe the Bug

Hello, we have run into a problem where our Butler SOS deployment randomly fails with the following error. As this is now the second time it has happened, we would like to ask whether this is a bug.

While this was happening, neither the Qlik environment nor the environment that Butler SOS runs on showed any signs of problems, so we cannot explain where the value comes from.

Once Butler SOS is redeployed, it works normally again.

Error: uint value for field 'bytes_added' out of range: -2068853448
    at Fe.uintField (file:///nodeapp/node_modules/@influxdata/influxdb-client/dist/index.mjs:4:2852)
    at postHealthMetricsToInfluxdb (file:///nodeapp/src/lib/post-to-influxdb.js:471:18)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Node.js v20.17.0
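
The rejection seen in the trace can be reproduced in isolation. The snippet below is only a sketch: checkUintField is a hypothetical stand-in written for illustration, not the source of @influxdata/influxdb-client, and it assumes an unsigned field accepts only non-negative numbers up to Number.MAX_SAFE_INTEGER.

```javascript
// Hypothetical stand-in for an unsigned-integer field check.
// Assumption: negative numbers and NaN are rejected with the message seen in the trace.
function checkUintField(name, value) {
  if (
    typeof value !== "number" ||
    Number.isNaN(value) ||
    value < 0 ||
    value > Number.MAX_SAFE_INTEGER
  ) {
    throw new Error(`uint value for field '${name}' out of range: ${value}`);
  }
  return Math.floor(value);
}

// Feed in the value from the bug report and capture the resulting message.
let message = null;
try {
  checkUintField("bytes_added", -2068853448);
} catch (err) {
  message = err.message;
}
console.log(message);
```

Because the error is thrown rather than handled, an uncaught occurrence like this terminates the Node.js process, which matches the observed need to redeploy.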

Attached Logfile: butler-sos-prod-765585cdc6-4ffln.txt

Expected Behavior

When a value does not conform to the defined database column types, a warning should be logged instead of an error being thrown. That way Butler SOS can continue working normally, since the next value is usually a normal one.
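
One way to get the behavior described above is to wrap each field write and downgrade a failure to a warning. This is a minimal sketch, not Butler SOS code: setFieldOrWarn, setUint and the fields object are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: run a field setter and turn a thrown error into a warning,
// so that one bad value does not stop the whole service.
function setFieldOrWarn(setField, name, value, warn = console.warn) {
  try {
    setField(name, value);
    return true;
  } catch (err) {
    warn(`Skipping field '${name}': ${err.message}`);
    return false;
  }
}

// Stand-in for a data point that only accepts non-negative integers.
const fields = {};
function setUint(name, value) {
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`uint value for field '${name}' out of range: ${value}`);
  }
  fields[name] = value;
}

// Collect warnings in an array instead of printing them, to keep the example quiet.
const warnings = [];
const collect = (msg) => warnings.push(msg);

const firstOk = setFieldOrWarn(setUint, "bytes_added", -2068853448, collect);
const secondOk = setFieldOrWarn(setUint, "bytes_added", 1024, collect);
```

With this approach the out-of-range value is dropped and reported, while the next valid value is still stored.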

To Reproduce

No response

mountaindude commented 4 weeks ago

Thanks for reporting this.

I have never seen anything like it, to be honest. Very interesting! I will certainly take a look, and I agree: better checking of the data before writing it to InfluxDB would be good. It could be that bytes_added should be some other data type in InfluxDB.

One question: How much RAM does your Sense server(s) have?

mountaindude commented 4 weeks ago

Another question: Are you using InfluxDB 1 or 2? I suspect 2. If that is indeed the case, I have found a possible cause of the bug you are seeing.

Qlik Sense may return a bytes_added value that is either positive or negative. Butler SOS, however, assumed that Sense always returns a positive number (based on the word "added" in the field name) and therefore wrote the value as an unsigned integer field, which rejects negative numbers. That assumption is most likely incorrect: both positive and negative values should be allowed.

I will make this change. The next version of Butler SOS will allow bytes_added to be either positive or negative for InfluxDB 2.

Butler SOS' code for InfluxDB v1 already supports positive and negative numbers.
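
The difference between the two field types can be shown with InfluxDB line protocol, where a signed integer carries an "i" suffix and an unsigned integer a "u" suffix. The following is a sketch under that assumption only; formatIntegerField is a hypothetical helper, not part of Butler SOS or the InfluxDB client.

```javascript
// Hypothetical helper: format one integer field in InfluxDB line protocol.
// signed = true  -> "name=<value>i" (negative values allowed)
// signed = false -> "name=<value>u" (negative values rejected)
function formatIntegerField(name, value, signed) {
  if (!Number.isSafeInteger(value)) {
    throw new Error(`value for field '${name}' is not a safe integer: ${value}`);
  }
  if (!signed && value < 0) {
    throw new Error(`uint value for field '${name}' out of range: ${value}`);
  }
  return `${name}=${value}${signed ? "i" : "u"}`;
}

// The value from the bug report is accepted once the field is signed.
const signedField = formatIntegerField("bytes_added", -2068853448, true);

// The same value is still rejected when the field is unsigned.
let unsignedError = null;
try {
  formatIntegerField("bytes_added", -2068853448, false);
} catch (err) {
  unsignedError = err.message;
}
```

Treating bytes_added as a signed integer keeps negative deltas in the data instead of discarding them.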

MarioShuuya commented 3 weeks ago

Thank you for the quick and positive response to our bug report. Allowing positive and negative numbers sounds like a good idea, even more so if it is about the change in bytes in the cache.

One question: How much RAM does your Sense server(s) have?

Currently the maximum we have on a single node is 240GB RAM. In case the total is relevant too, that would be 624GB RAM.

Another question: Are you using InfluxDB 1 or 2?

Your suspicion was correct, we use InfluxDB 2.

mountaindude commented 3 weeks ago

A new version (11.0.3) of Butler SOS is building right now and should be done within the hour. It includes a fix for the issue discussed here.