munin-monitoring / munin

Main repository for munin master / node / plugins
http://munin-monitoring.org

Counter graph from munin is scaled by a factor of about 3 #1349

Closed AloisKlingler closed 4 years ago

AloisKlingler commented 4 years ago

Hi,

With munin I am trying to monitor my Fronius smart meter. It records absolute values, so I wrote a plugin with COUNTER as the type.

Now the issue: the graph is scaled by a factor and I do not understand why.

To check whether the issue is located in my munin plugin, I am writing the same values in parallel to a sqlite database, which I am also graphing. There I cannot see the unknown factor.

The multiplier of 12 comes from munin-cron running every 5 minutes (12 times per hour). I think this is the right way? In any case, the sqlite values are also multiplied by 12 for graphing, and they match the expected output.

Attached are the munin plugin and a screenshot of the munin graph compared to the sqlite graph.

Can somebody help?

Thanks!

Alois

munin-graph: image

sqlite-graph: image

munin-plugin: fronius_avg.txt

the rrd-files: munin-rrd.zip

the sqlite-db with the same values in: sqlite-db.zip

the quick&dirty graphing of the sqlite-db: pvstatistics_daily.txt

niclan commented 4 years ago

These two graphs look exactly the same. They show the same thing and would appear to be correct. The way you multiply by 12 is the right way in munin.

I'm not sure I understand what the problem is but here is my best guess:

The readings under the munin graph show the numerical values (as read and then multiplied by 12).

Some of them are smaller than 1 and are scaled to milli according to the SI prefixes; the m means milli, or one thousandth, according to this table: https://en.wikipedia.org/wiki/International_System_of_Units#Prefixes

"479.52m" means 479.52/1000 = 0.47952.

If the numbers grow larger they will be scaled to kilo (k, thousands), mega and so on.
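To illustrate the prefix scaling described above, here is a minimal sketch in Python. It is not the code RRD or munin actually uses; the function name `si_format`, the two-decimal formatting and the list of prefixes are assumptions made for the example.

```python
def si_format(value: float) -> str:
    """Format a number with an SI prefix, similar in spirit to a graph legend.

    Illustrative only: the prefix list and rounding are assumptions,
    not taken from RRD or munin.
    """
    prefixes = [(1e9, "G"), (1e6, "M"), (1e3, "k"), (1.0, ""), (1e-3, "m"), (1e-6, "u")]
    if value == 0:
        return "0.00"
    for factor, prefix in prefixes:
        # Pick the largest prefix that keeps the scaled value at or above 1
        if abs(value) >= factor:
            return f"{value / factor:.2f}{prefix}"
    # Smaller than the smallest prefix listed: fall back to that prefix
    factor, prefix = prefixes[-1]
    return f"{value / factor:.2f}{prefix}"


print(si_format(0.47952))  # 479.52m
print(si_format(3360))     # 3.36k
```

So a legend entry ending in "m" is simply a value below 1, not a sign of a wrong multiplier.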

AloisKlingler commented 4 years ago

Hello @niclan,

thanks for your response. The issue is the scaling on the y-axis. In the munin graph the peak is at 13, whereas I would expect it at 40, compared to the sqlite graph where it is at 4000.

I already calculated a factor: instead of 12 I would need to multiply by 37.5 to get the same results, and I don't know why.

Thank you!

Alois

niclan commented 4 years ago

Basically, over many years I have learnt to trust RRD; its calculations are correct. If the interface you collect data from reports data in an unscaled unit, then there should be no reason to multiply it by 12 or 37.5.

I'm afraid I can't spare the time to understand everything that goes on in the plugin and in your plotting code.

  $invertervalues1 = fetchjsonfromurl("http://192.168.0.249/solar_api/v1/GetPowerFlowRealtimeData.fcgi");
  $invertervalues2 = fetchjsonfromurl("http://192.168.0.249/solar_api/v1/GetMeterRealtimeData.cgi?Scope=Device&DeviceId=0");
  $dailyvalues["e_total"] = $invertervalues1["Body"]["Data"]["Site"]["E_Total"];
  $dailyvalues["e_importtotal"] = $invertervalues2["Body"]["Data"]["EnergyReal_WAC_Plus_Absolute"];
  $dailyvalues["e_exporttotal"] = $invertervalues2["Body"]["Data"]["EnergyReal_WAC_Minus_Absolute"];

If I understand this right, the reading from the CGIs is a cumulative number? If that's right then COUNTER is right, but over time we have learned to prefer DERIVE with a min setting of 0 to avoid counter reset spikes. But this will not help with the scaling.
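The difference between the two types can be sketched as follows. This is a simplified model for illustration, not RRD's actual implementation: the function names are hypothetical, and the COUNTER branch only models a 32-bit wrap-around.

```python
from typing import Optional


def derive_rate(prev: int, curr: int, seconds: float,
                minimum: Optional[float] = 0.0) -> Optional[float]:
    """DERIVE-style rate: plain difference per second.

    With a minimum of 0, a negative rate (e.g. after a counter reset)
    is reported as unknown (None) instead of being graphed.
    """
    rate = (curr - prev) / seconds
    if minimum is not None and rate < minimum:
        return None
    return rate


def counter_rate(prev: int, curr: int, seconds: float, bits: int = 32) -> float:
    """COUNTER-style rate: a decrease is treated as a counter wrap-around.

    Simplified assumption: only a single wrap of a `bits`-wide counter.
    """
    delta = curr - prev
    if delta < 0:
        delta += 2 ** bits
    return delta / seconds


# Normal case, using the e_total readings from the thread (302 s apart)
print(derive_rate(20761270, 20761550, 302))
print(counter_rate(20761270, 20761550, 302))

# Hypothetical reset of the device total to a small value
print(derive_rate(20761550, 100, 300))   # None (unknown)
print(counter_rate(20761550, 100, 300))  # a huge spike
```

In the normal case both give the same per-second rate; they only differ when the reading goes down.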

I'm not familiar with inverters so I'm on very thin ice about how they work or what kind of units they use.

To explore the readings you get I would put the raw numbers into a CSV file and then after a day or two import it in a spreadsheet program and play around with it and plot it there and then re-read the manual for the electronic interface for the inverter. Or something like that.

sumpfralle commented 4 years ago

"It records absolute values so I wrote a plugin with COUNTER as types."

I assume that it emits the amount of energy used up to this moment (total consumption), and not the amount of energy used since the last request (incremental consumption). Correct?

What is the unit of this value? Joule or watt-hours?

I have the feeling that you do not need to apply a factor based on the time. But maybe I misunderstand your input data.

AloisKlingler commented 4 years ago

hello,

sorry for the late reply. @niclan: yes, the number from the inverter's CGI is a cumulative number. Regarding your suggestion with the raw values: I did exactly this, but instead of putting the raw numbers into a CSV I put them into the sqlite database. The plotting is not done with a spreadsheet, but with the sqlite database and chart.js.

@sumpfralle: the data I fetch (with the lines niclan posted above) are:

invertervalues1 ... ["E_Total"]: total energy the inverter has produced (e_total)
invertervalues2 ... ["EnergyReal_WAC_Plus_Absolute"]: total energy imported from the public grid (e_importtotal)
invertervalues2 ... ["EnergyReal_WAC_Minus_Absolute"]: total energy exported to the public grid (e_exporttotal)

The unit for all three values: Wh (Watt-hours)

Two samples from the sqlite database:

sample 1:

date      |e_total |e_importtotal  |e_exporttotal |H_kWh_Used |WW_kWh_Used
     10/07/2020 @ 12:30am UTC (+2h -> 2:30am local time)
1602030618|20757450|8340480        |11878902      |782130     |1099360
1602030917|20757450|8340501        |11878902      |782130     |1099360
diff   299|       0|     21        |       0      |     0     |      0

sample 2:

date      |e_total |e_importtotal  |e_exporttotal |H_kWh_Used |WW_kWh_Used
     10/07/2020 @ 9:35am (UTC) (+2h -> 11:35am local time)
1602063318|20761270|8342986        |11880350      |782130     |1100480
1602063620|20761550|8342986        |11880487      |782130     |1100480
diff   302|     280|      0        |     137      |     0     |      0

sample 1: during 5 minutes, 21 Wh were imported from the public grid
sample 2: during 5 minutes, 280 Wh were generated and 137 Wh were exported to the public grid

Why I am multiplying by 12: the inverter has another interface (also CGI) which is much more "consumer-friendly". There the values are in watts, so you can see that it is currently generating e.g. 2000 watts. That value is much more useful when deciding which devices to start (dishwasher, washing machine, or even the heat pump for heating and warm water). But watts are not nice for statistics: most devices draw power in pulses, or the sun gets covered by a cloud, and as munin collects every 5 minutes, you only get a snapshot of the moment of the request. I want to have the average over 5 minutes in the diagram, therefore I fetch the inverter totals (invertervalues1, e_total) and smart meter totals (invertervalues2, e_importtotal and e_exporttotal) and, to display the same as the 5-minute snapshots, I need to multiply this by 12.

So, for the samples:
sample 1: 21 Wh during 5 minutes, multiplied by 12, is 252 (and this is what the snapshot displays if the consuming devices are not pulsing but have a steady power usage)
sample 2: 280 Wh x 12 = 3360 generated by the inverter, 137 Wh x 12 = 1644 exported to the grid
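The arithmetic for the two samples can be written out as a short Python sketch. The timestamps and Wh totals are taken from the tables above; the function name `average_watts` is made up for the example. It divides by the actual number of seconds between readings (299 and 302) rather than assuming exactly 5 minutes, which is why the results differ slightly from the "times 12" figures.

```python
def average_watts(t0: int, wh0: int, t1: int, wh1: int) -> float:
    """Average power in watts between two cumulative Wh readings.

    Wh per second, times 3600 seconds per hour, gives Wh per hour = W.
    """
    return (wh1 - wh0) * 3600 / (t1 - t0)


# sample 1: e_importtotal
print(round(average_watts(1602030618, 8340480, 1602030917, 8340501), 1))

# sample 2: e_total and e_exporttotal
print(round(average_watts(1602063318, 20761270, 1602063620, 20761550), 1))
print(round(average_watts(1602063318, 11880350, 1602063620, 11880487), 1))

# The "times 12" shortcut assumes exactly 300 seconds between readings
print(21 * 12, 280 * 12, 137 * 12)  # 252 3360 1644
```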

And here is the statistics comparison. I am only referring to the green statistics (e_total):

the munin 5-minute snapshot statistics: the 5 minutes from 11:35am to 11:40am are at about 3600 W produced (see y-axis). image

the munin 5-minute average statistics (the graph multiplier is set to 12, as in the first post): the average for the 5 minutes from 11:35am to 11:40am is at 11 on the y-axis. image

the sqlite 5-minute average statistics (the multiplier is also set to 12 in the PHP code, as in the first post, line 46; I have cut the screenshot on the x-axis to make it smaller): the average for the 5 minutes from 11:35am to 11:40am is at 3360 on the y-axis. image

I do not know why the munin average graph is wrong.

If I use "37.5" as the multiplier for the average graph in munin, it looks "correct": image

I found this factor by comparing with the snapshot graph and trying out multipliers, so it is like rolling dice. I do not know why the multiplier of 12 produces a graph which I do not expect.

Thank you for your help and time, again. :-)

Best regards Alois

sumpfralle commented 4 years ago

"I want to have the average over 5 Minutes in the diagram, therefore I fetch the inverter totals (invertervalues1, e_total) and smartmeter totals (invertervalues2, e_importtotal and e_exporttotal) and - to display the same as the 5-minute-snapshots, I need to calculate this by 12."

I have the feeling you are trying to help munin too much here. From my point of view, you can just feed the cumulative total (as received from the device) directly to munin and it will turn this into a change / time value (watt-hours / second). Isn't this the value you are looking for?

Summary: use GAUGE if you feed current values and use DERIVE if you feed cumulative values into munin. Assuming a timebase (e.g. 5 minutes) should never be necessary.
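As a rough sketch of what this advice could look like in a plugin: the Python script below declares the three cumulative Wh totals as DERIVE fields with a minimum of 0 and lets a cdef convert Wh/s to W (times 3600). It is not the plugin attached to this issue. The graph title, labels and category are assumptions, and `read_totals` is a hypothetical placeholder that returns fixed numbers from the thread instead of querying the inverter.

```python
#!/usr/bin/env python3
"""Sketch of a munin plugin feeding cumulative Wh totals as DERIVE fields."""
import sys

# Field names follow the thread; the labels are made up for illustration.
FIELDS = {
    "e_total": "produced",
    "e_importtotal": "imported from grid",
    "e_exporttotal": "exported to grid",
}


def config_lines():
    """Lines printed when munin calls the plugin with the 'config' argument."""
    lines = [
        "graph_title Fronius average power",  # hypothetical title
        "graph_vlabel W",
        "graph_category sensors",             # assumed category
    ]
    for name, label in FIELDS.items():
        lines += [
            f"{name}.label {label}",
            f"{name}.type DERIVE",         # munin computes change per second
            f"{name}.min 0",               # drop negative rates after a reset
            f"{name}.cdef {name},3600,*",  # Wh/s * 3600 = Wh/h = W
        ]
    return lines


def value_lines(totals_wh):
    """Lines printed on a normal run: the raw cumulative totals, unscaled."""
    return [f"{name}.value {int(totals_wh[name])}" for name in FIELDS]


def read_totals():
    """Hypothetical placeholder for the inverter query (values from sample 2)."""
    return {"e_total": 20761550, "e_importtotal": 8342986, "e_exporttotal": 11880487}


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "config":
        print("\n".join(config_lines()))
    else:
        print("\n".join(value_lines(read_totals())))
```

Note that no 5-minute timebase appears anywhere: the only constant is the unit conversion from seconds to hours.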

Or do I misunderstand something?

AloisKlingler commented 4 years ago

OK, I have changed from COUNTER to GAUGE.

If I remove the multiplier the graph is at ~0.92. image

This fits your watt-hours / second (280 / 300 = 0.93).

Now it makes sense: I now need to multiply this by 3600 (or by 36 to have it scaled by 100) to get watt-hours / hour, which is what I want. :-) image

Thank you. My misunderstanding was about the change/time calculation, which munin already does and which does not need to be done manually. Only the unit conversion needs to be done. :-)
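For reference, the multipliers mentioned in this thread can be compared with a few lines of Python, using the e_total readings from sample 2 (280 Wh in 302 s). The function name `graphed_value` is made up; it only models "per-second rate times multiplier", which is how the graphs above appear to behave.

```python
def graphed_value(wh_delta: float, seconds: float, multiplier: float) -> float:
    """Per-second rate (Wh/s) times the configured multiplier."""
    return wh_delta / seconds * multiplier


for multiplier in (1, 12, 37.5, 3600):
    print(multiplier, round(graphed_value(280, 302, multiplier), 2))
# 1 0.93
# 12 11.13
# 37.5 34.77
# 3600 3337.75
```

This matches the observations: about 0.92 without a multiplier, about 11 with 12, and roughly the expected watts divided by 100 with 37.5, which is close to 36 = 3600 / 100.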

Thanks. :-)

Best regards Alois