tedpearson / electric-usage-downloader

Import smart meter metrics from smarthub into VictoriaMetrics or InfluxDB
MIT License
24 stars 1 forks source link

Data gets duplicated 60x in influxdb? #11

Open doctarrwho opened 1 month ago

doctarrwho commented 1 month ago

After importing the data, I noticed that the raw data included the same values for each minute of the hour. See screenshot: image

The data returned from SmartHub (for me) seems to be hourly, see sample debug output:

DEBUG: Parsed data from poll endpoint:
&{Status:COMPLETE Data:map[ELECTRIC:[{Type:USAGE Series:[{Name:336288847 Data:[{UnixMillis:1719532800000 Value:2.26} {UnixMillis:1719536400000 Value:2.24} {UnixMillis:1719540000000 Value:0.97} 

I modified the local code to change the WriteMetrics function to write one point per hour which fixed the problem for me, and now I have single row for each hour with hourly value.

func WriteMetrics(records []ElectricUsage, config InfluxConfig) error {
    opts := influxdb2.DefaultOptions()
    if config.Insecure {
        opts.SetTLSConfig(&tls.Config{InsecureSkipVerify: true})
    }
    client := influxdb2.NewClientWithOptions(config.Host, config.AuthToken, opts)
    writeApi := client.WriteAPIBlocking(config.Org, config.Database)
    for _, record := range records {
        // Calculate the duration in hours, assuming the record spans an hour
        hours := record.EndTime.Sub(record.StartTime).Hours()
        if hours == 0 {
            hours = 1 // Avoid division by zero, assume at least one hour
        }

        // Create a point for the record
        point := influxdb2.NewPointWithMeasurement("electric").
            SetTime(record.StartTime).
            AddField("watts", float64(record.WattHours)/hours)

        if record.CostInCents != nil {
            point.AddField("cost", float64(*record.CostInCents)/hours)
        }
        if record.MeterName != nil {
            point.AddTag("name", *record.MeterName)
        }

        // Write the point to the database
        err := writeApi.WritePoint(context.Background(), point)
        if err != nil {
            return err
        }
    }
    return nil
}

I'm not sure if my utility provider returns different data vs. yours, but some customization here would be nice to control the point interval.

tedpearson commented 1 month ago

Let me try to explain why the code does what it does here in the WriteMetrics function.

If you have a strong reason why hourly datapoints work better for your use case than minutely datapoints, I could add some sort of config option to toggle this feature.

image
doctarrwho commented 1 month ago

Minutely metric would make sense if that's what the data was for, but in my case, the hourly metric is returned as kWh and is duplicated 60x (once for each minute) so the end result in the graphs is way off.

image
Data:[
  {UnixMillis:1719532800000 Value:2.26} 
  {UnixMillis:1719536400000 Value:2.24}

In this data sample, 2.26 and 2.24 are saved 60x each, so if I were to look at the cumulative usage, it shows 270kWh instead of 4.5kWh for 2 hours.

Ideally, the points written to influxdb should be based on the interval in the data returned by the utility provider, whether it's an hour (my case), or 15 mins (your case), or anything else.

tedpearson commented 2 weeks ago

Yes, the returned data from smarthub is in kilowatt-hour units. However, currently electric-usage-downloader writes a metric in watts ("realtime" usage), not watt-hours or kilowatt-hours ("cumulative" usage). I wrote it that way because my co-op returns 15-minutely data, and I wanted to see the wattage my home was using, not how many watt-hours were used each 15 minute period.

So if I were to allow for customization, it would be in the unit returned. I would want to have a metric for both watt-hours and for watts, with separate metrics or maybe labels.

But technically speaking, since the unit is watts and not watt-hours, the metric is not duplicated, it's just averaged since there is only hourly data to extrapolate from. Since the metric is in watts, the equation to accumulate it in prometheus or victoriametrics would be sum_over_time(watts[$internal])/60 to get kilowatt-hours. I haven't touched influxql in a while so I'm unsure of the equivalent influx query.