teslamate-org / teslamate

A self-hosted data logger for your Tesla 🚘
https://docs.teslamate.org
MIT License
6.04k stars 752 forks

feat: Dashboard refinements and standardization #4367

Closed - swiffer closed 5 days ago

swiffer commented 1 week ago

Hi @JakobLichterfeld - that one became bigger than expected. I hope the effort is worth it and the changes are appreciated.

I tried to ensure all metrics are self-explanatory and naming is consistent across all dashboards. I'll add some comments to make it easier to review now.

fixes #4350 fixes #4268

netlify[bot] commented 1 week ago

Deploy Preview for teslamate ready!

Latest commit: b1eab6178ecb4df5ef2c4c88a55a2d551d4f9c5c
Latest deploy log: https://app.netlify.com/sites/teslamate/deploys/673ae9a64af06c0008fad484
Deploy Preview: https://deploy-preview-4367--teslamate.netlify.app

sdwalker commented 1 week ago

"Number of" or "# of"?

grep "# of" .json battery-health.json: "description": "\"# of Charging cycles\" is estimated by dividing the whole energy added to the battery by the battery capacity when new.\n\n\"Charging Efficiency\" is estimated on the difference between energy used from the charger and energy added to the battery.", battery-health.json: "rawSql": "SELECT\r\n\tCOUNT() AS \"# of Charges\"\r\nFROM\r\n\tcharging_processes\r\nWHERE\r\n\tcar_id = $car_id AND charge_energy_added > 0.01\r\n\t", battery-health.json: "refId": "# of Charges", battery-health.json: "rawSql": "SELECT\n\tfloor(sum(charge_energy_added) / CASE WHEN $custom_kwh_new > 0 THEN $custom_kwh_new ELSE ('$aux'::json ->> 'MaxCapacity')::float END) AS \"# of Charging cycles\"\nFROM charging_processes WHERE car_id = $car_id AND charge_energy_added > 0.01", battery-health.json: "refId": "# of Charging cycles", charging-stats.json: "value": "# of Charges" charging-stats.json: "value": "# of Discharges" locations.json: "rawSql": "SELECT\n\tcity,\n\tcount() as \"# of Addresses\"\nFROM\n\taddresses\nWHERE\n\tcity IS NOT NULL\nGROUP BY\n\t1\nORDER BY\n\t2 DESC\nLIMIT 10;", locations.json: "rawSql": "SELECT\n\tstate,\n\tcount() as \"# of Addresses\"\nFROM\n\taddresses\nWHERE\n\tstate IS NOT NULL\nGROUP BY\n\t1\nORDER BY\n\t2 DESC\nLIMIT 10;", updates.json: "value": "# of Charges" updates.json: "alias": "# of Charges",

$ grep "Number of" *.json
charging-stats.json: "title": "Number of Charges",
drive-stats.json: "title": "Number of Drives",
locations.json: "title": "Number of Addresses",
locations.json: "title": "Number of Cities",
locations.json: "title": "Number of States",
locations.json: "title": "Number of Countries",
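A check like the greps above can also be scripted for future upkeep. A minimal sketch (hypothetical helper, not an existing repo script) that counts the competing count-label styles in dashboard JSON text:

```python
import re

# Hypothetical consistency check mirroring the manual greps:
# count occurrences of the two competing label styles in a JSON string.
def count_styles(text):
    return {
        "# of": len(re.findall(r"# of \w+", text)),
        "Number of": len(re.findall(r"Number of \w+", text)),
    }

sample = '{"title": "Number of Charges", "refId": "# of Charges"}'
print(count_styles(sample))  # {'# of': 1, 'Number of': 1}
```

Running this over each dashboard file would flag any file mixing both styles.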

swiffer commented 1 week ago

"Number of" or "# of"?

grep "# of" .json battery-health.json: "description": ""# of Charging cycles" is estimated by dividing the whole energy added to the battery by the battery capacity when new.\n\n"Charging Efficiency" is estimated on the difference between energy used from the charger and energy added to the battery.", battery-health.json: "rawSql": "SELECT\r\n\tCOUNT() AS "# of Charges"\r\nFROM\r\n\tcharging_processes\r\nWHERE\r\n\tcar_id = $car_id AND charge_energy_added > 0.01\r\n\t", battery-health.json: "refId": "# of Charges", battery-health.json: "rawSql": "SELECT\n\tfloor(sum(charge_energy_added) / CASE WHEN $custom_kwh_new > 0 THEN $custom_kwh_new ELSE ('$aux'::json ->> 'MaxCapacity')::float END) AS "# of Charging cycles"\nFROM charging_processes WHERE car_id = $car_id AND charge_energyadded > 0.01", battery-health.json: "refId": "# of Charging cycles", charging-stats.json: "value": "# of Charges" charging-stats.json: "value": "# of Discharges" locations.json: "rawSql": "SELECT\n\tcity,\n\tcount() as "# of Addresses"\nFROM\n\taddresses\nWHERE\n\tcity IS NOT NULL\nGROUP BY\n\t1\nORDER BY\n\t2 DESC\nLIMIT 10;", locations.json: "rawSql": "SELECT\n\tstate,\n\tcount(_) as "# of Addresses"\nFROM\n\taddresses\nWHERE\n\tstate IS NOT NULL\nGROUP BY\n\t1\nORDER BY\n\t2 DESC\nLIMIT 10;", updates.json: "value": "# of Charges" updates.json: "alias": "# of Charges",

$ grep "Number of" *.json charging-stats.json: "title": "Number of Charges", drive-stats.json: "title": "Number of Drives", locations.json: "title": "Number of Addresses", locations.json: "title": "Number of Cities", locations.json: "title": "Number of States", locations.json: "title": "Number of Countries",

For now both - it depends on the available space in tables and panels. What would you prefer?

JakobLichterfeld commented 1 week ago

Hi @JakobLichterfeld - that one became bigger than expected. I hope the effort is worth it and the changes are appreciated.

I tried to ensure all metrics are self-explanatory and naming is consistent across all dashboards. I'll add some comments to make it easier to review now.

Wow! Thank you so much, I love the quality-of-life improvements and standardization. It's much appreciated! I will look into it step by step over the coming days.

Thanks @sdwalker for always checking miles related stuff as well!

sdwalker commented 1 week ago

Expected gross changes with e69c0cf?

Before: [screenshot]

After: [screenshot]

sdwalker commented 1 week ago

"Number of" or "# of"? grep "# of" .json battery-health.json: "description": ""# of Charging cycles" is estimated by dividing the whole energy added to the battery by the battery capacity when new.\n\n"Charging Efficiency" is estimated on the difference between energy used from the charger and energy added to the battery.", battery-health.json: "rawSql": "SELECT\r\n\tCOUNT() AS "# of Charges"\r\nFROM\r\n\tcharging_processes\r\nWHERE\r\n\tcar_id = $car_id AND charge_energy_added > 0.01\r\n\t", battery-health.json: "refId": "# of Charges", battery-health.json: "rawSql": "SELECT\n\tfloor(sum(charge_energy_added) / CASE WHEN $custom_kwh_new > 0 THEN $custom_kwh_new ELSE ('$aux'::json ->> 'MaxCapacity')::float END) AS "# of Charging cycles"\nFROM charging_processes WHERE car_id = $car_id AND charge_energyadded > 0.01", battery-health.json: "refId": "# of Charging cycles", charging-stats.json: "value": "# of Charges" charging-stats.json: "value": "# of Discharges" locations.json: "rawSql": "SELECT\n\tcity,\n\tcount() as "# of Addresses"\nFROM\n\taddresses\nWHERE\n\tcity IS NOT NULL\nGROUP BY\n\t1\nORDER BY\n\t2 DESC\nLIMIT 10;", locations.json: "rawSql": "SELECT\n\tstate,\n\tcount(_) as "# of Addresses"\nFROM\n\taddresses\nWHERE\n\tstate IS NOT NULL\nGROUP BY\n\t1\nORDER BY\n\t2 DESC\nLIMIT 10;", updates.json: "value": "# of Charges" updates.json: "alias": "# of Charges", $ grep "Number of" *.json charging-stats.json: "title": "Number of Charges", drive-stats.json: "title": "Number of Drives", locations.json: "title": "Number of Addresses", locations.json: "title": "Number of Cities", locations.json: "title": "Number of States", locations.json: "title": "Number of Countries",

For now both - it depends on the available space in tables and panels. What would you prefer?

Number of would be my preference

3rd style:

statistics.json: "options": "# Charges"
statistics.json: "options": "# Drives"
statistics.json: "cnt": "# Drives",
statistics.json: "cnt_charges": "# Charges",

Get rid of them all and just use Charging cycles, Charges, Discharges, Drives?

swiffer commented 1 week ago

Expected gross changes with e69c0cf?

Before: [screenshot]

After: [screenshot]

Yes - I hoped to have changed it for the better. With the data available in my instance, the updated queries show 1:1 the same info as the Efficiency dashboard when looking at "year" periods in Statistics.

While the criteria used previously excluded all data points with a negative range_loss (charging or regenerative braking), the updated approach makes use of all data points in positions except the ones in periods of charging (so now including negative range caused by regenerative braking). Are you driving up/downhill "a lot"? That would explain it.
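The updated inclusion criteria can be sketched over consecutive position samples in a few lines of Python (field names are illustrative; this mirrors the described filter, not the PR's actual query):

```python
# Sketch of the updated gross-consumption filter (illustrative only).
# Each point mirrors a row of the `positions` table: drive_id, range_km, odometer.
def gross_range_loss(points):
    total_loss = 0.0
    total_distance = 0.0
    for prev, cur in zip(points, points[1:]):
        range_loss = prev["range_km"] - cur["range_km"]  # lag(range) - range
        distance = cur["odometer"] - prev["odometer"]    # odometer - lag(odometer)
        # keep points while driving, or idle-but-not-charging points;
        # negative range_loss from regenerative braking is now included
        if cur["drive_id"] is not None or (range_loss >= 0 and distance == 0):
            total_loss += range_loss
            total_distance += distance
    return total_loss, total_distance

points = [
    {"drive_id": None, "range_km": 300.0, "odometer": 1000.0},
    {"drive_id": 7, "range_km": 298.0, "odometer": 1005.0},    # driving
    {"drive_id": 7, "range_km": 298.5, "odometer": 1006.0},    # regen: negative loss, kept
    {"drive_id": None, "range_km": 320.0, "odometer": 1006.0}, # charging: negative loss, excluded
]
print(gross_range_loss(points))  # (1.5, 6.0)
```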

Where are the net/gross screenshots from? I wonder how gross became smaller than net - that might need some data digging. Do you have any non-closed drives / charges, or was a charging process active at the time you took the screenshots?

swiffer commented 1 week ago

"Number of" or "# of"? grep "# of" .json battery-health.json: "description": ""# of Charging cycles" is estimated by dividing the whole energy added to the battery by the battery capacity when new.\n\n"Charging Efficiency" is estimated on the difference between energy used from the charger and energy added to the battery.", battery-health.json: "rawSql": "SELECT\r\n\tCOUNT() AS "# of Charges"\r\nFROM\r\n\tcharging_processes\r\nWHERE\r\n\tcar_id = $car_id AND charge_energy_added > 0.01\r\n\t", battery-health.json: "refId": "# of Charges", battery-health.json: "rawSql": "SELECT\n\tfloor(sum(charge_energy_added) / CASE WHEN $custom_kwh_new > 0 THEN $custom_kwh_new ELSE ('$aux'::json ->> 'MaxCapacity')::float END) AS "# of Charging cycles"\nFROM charging_processes WHERE car_id = $car_id AND charge_energyadded > 0.01", battery-health.json: "refId": "# of Charging cycles", charging-stats.json: "value": "# of Charges" charging-stats.json: "value": "# of Discharges" locations.json: "rawSql": "SELECT\n\tcity,\n\tcount() as "# of Addresses"\nFROM\n\taddresses\nWHERE\n\tcity IS NOT NULL\nGROUP BY\n\t1\nORDER BY\n\t2 DESC\nLIMIT 10;", locations.json: "rawSql": "SELECT\n\tstate,\n\tcount(_) as "# of Addresses"\nFROM\n\taddresses\nWHERE\n\tstate IS NOT NULL\nGROUP BY\n\t1\nORDER BY\n\t2 DESC\nLIMIT 10;", updates.json: "value": "# of Charges" updates.json: "alias": "# of Charges", $ grep "Number of" *.json charging-stats.json: "title": "Number of Charges", drive-stats.json: "title": "Number of Drives", locations.json: "title": "Number of Addresses", locations.json: "title": "Number of Cities", locations.json: "title": "Number of States", locations.json: "title": "Number of Countries",

For now both - it depends on the available space in tables and panels. What would you prefer?

Number of would be my preference

3rd style:

statistics.json: "options": "# Charges"
statistics.json: "options": "# Drives"
statistics.json: "cnt": "# Drives",
statistics.json: "cnt_charges": "# Charges",

Get rid of them all and just use Charging cycles, Charges, Discharges, Drives?

ok with it as well - let's get a 3rd opinion from @JakobLichterfeld / @DrMichael, then I'll update the labels throughout the dashboards! So for the two of you: when we have a count of X, should we name it "Number of X", "# of X", or just "X" (and should we have a short and a long form, or use the short form of the label no matter how much space is available)?

JakobLichterfeld commented 1 week ago

ok with it as well - let's get a 3rd opinion from @JakobLichterfeld / @DrMichael, then I'll update the labels throughout the dashboards! So for the two of you: when we have a count of X, should we name it "Number of X", "# of X", or just "X" (and should we have a short and a long form, or use the short form of the label no matter how much space is available)?

From a UX perspective, I favor option number 1: 'Number of x'. Considering that most of the devices used nowadays are mobile devices, we should also develop mobile first. I can foresee problems with a long description for a small column, and would therefore vote for 'No. of x'.

DrMichael commented 1 week ago

I would definitely prefer a short column header. And actually, just calling it "Drives" instead of "# of Drives" or "No. of Drives" seems sufficient.

sdwalker commented 1 week ago

Doubtful there's a way to set the decimal separator depending on the locale?

[screenshot]

While the criteria used previously excluded all data points with a negative range_loss (charging or regenerative braking), the updated approach makes use of all data points in positions except the ones in periods of charging (so now including negative range caused by regenerative braking). Are you driving up/downhill "a lot"? That would explain it.

Daily commute is a few hundred feet of elevation

[screenshot]

Where are the net/gross screenshots from? I wonder how gross became smaller than net - that might need some data digging. Do you have any non-closed drives / charges, or was a charging process active at the time you took the screenshots?

Values look better now that the charging process isn't active

Trip: [screenshot]

Statistics: [screenshot]

swiffer commented 1 week ago

Doubtful there's a way to set the decimal separator depending on the locale?

[screenshot]

there should be when extracting that into a db variable (select 7.5, to_char(7.5, '99D9')), but for this to work the locale would have had to be set when running initdb at first-run time. I'm fine with changing this to . instead of ,
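Since the database locale is fixed at initdb time, the separator could alternatively be applied application-side. A trivial sketch (hypothetical helper, not TeslaMate code):

```python
# Hypothetical app-side formatting with an explicit decimal separator,
# instead of relying on lc_numeric / to_char in the database.
def format_decimal(value, sep=",", digits=1):
    return f"{value:.{digits}f}".replace(".", sep)

print(format_decimal(7.5))       # 7,5
print(format_decimal(7.5, "."))  # 7.5
```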

While the criteria used previously excluded all data points with a negative range_loss (charging or regenerative braking), the updated approach makes use of all data points in positions except the ones in periods of charging (so now including negative range caused by regenerative braking). Are you driving up/downhill "a lot"? That would explain it.

Daily commute is a few hundred feet of elevation

[screenshot]

looks good to me then - let me see if I can treat open charges better!

Where are the net/gross screenshots from? I wonder how gross became smaller than net - that might need some data digging. Do you have any non-closed drives / charges, or was a charging process active at the time you took the screenshots?

Values look better now that the charging process isn't active

Trip: [screenshot]

The Trip dashboard seems to not have been updated based on this PR - have you made changes to this one? (I renamed the panels.) -> Never mind, I missed updating the description of mi-based values! Will be fixed as well.

Statistics: [screenshot]

swiffer commented 1 week ago

who makes the decision on the count-based values, @JakobLichterfeld? 😄

swiffer commented 1 week ago

@sdwalker I made some improvements to consumption (gross) (2x query execution performance) and stopped considering the active charging session. I'm charging myself currently and the numbers are looking good.

swiffer commented 1 week ago

ok, if I followed the discussions correctly, the only thing still open is the naming discussion. I'm voting for "# of X" but am totally fine with removing it completely or using "Nb." or "No." instead.

If we need more arguments -> https://academia.stackexchange.com/questions/90816/abbreviation-of-number-of-in-a-table-heading 😆 # is the accepted answer ;)

sdwalker commented 1 week ago

@sdwalker I made some improvements to consumption (gross) (2x query execution performance) and stopped considering the active charging session. I'm charging myself currently and the numbers are looking good.

Another expected gross jump? There is an active charging process.

Statistics (Monthly), before: [screenshot]

After (w/ 51dfeb5): [screenshot]

Trip, before: [screenshot]

After (w/ 51dfeb5): [screenshot]

sdwalker commented 1 week ago

The Trip dashboard seems to not have been updated based on this PR - have you made changes to this one? (I renamed the panels.) -> Never mind, I missed updating the description of mi-based values! Will be fixed as well.

Compared to the dashboard-refinements tree, all the differences I'm seeing in my tree are local changes.

swiffer commented 1 week ago

@sdwalker I made some improvements to consumption (gross) (2x query execution performance) and stopped considering the active charging session. I'm charging myself currently and the numbers are looking good.

Another expected gross jump? There is an active charging process.

I changed the query in the two locations so that it's faster - in my instance the difference between the two was 1% max.

How do the numbers compare to the overall Efficiency dashboard now (especially when comparing with Statistics on a yearly level)?

JakobLichterfeld commented 1 week ago

ok, if I followed the discussions correctly, the only thing still open is the naming discussion. I'm voting for "# of X" but am totally fine with removing it completely or using "Nb." or "No." instead.

If we need more arguments -> https://academia.stackexchange.com/questions/90816/abbreviation-of-number-of-in-a-table-heading 😆 # is the accepted answer ;)

Based on my comment

we should also develop mobile first

I'd say we go with "# of X"

swiffer commented 1 week ago

@JakobLichterfeld - changed the count related stuff.

2 things remaining:

with that - ready to merge!

JakobLichterfeld commented 1 week ago

@JakobLichterfeld - changed the count related stuff.

Thank you very much!

  • Could you review in your instance?

Sure, will do. I'm behind schedule but will make sure to test this PR before merging.

JakobLichterfeld commented 1 week ago

Battery Health:

axis labels missing:

this PR: [screenshot]

v1.31.1: [screenshot]

JakobLichterfeld commented 1 week ago

Charging Stats: Missing labels:

this PR: [screenshot]

v1.31.1: [screenshot]

JakobLichterfeld commented 1 week ago

Charging Stats: Top Charging stations get truncated above 1MWh.

This PR: [screenshot]

v1.31.1: [screenshot]

JakobLichterfeld commented 1 week ago

The Statistics dashboard is badly slow in this PR - 400x slower or more, still loading.

JakobLichterfeld commented 1 week ago

What is the Overhead column in Statistics? (we had average cost per km before)

swiffer commented 1 week ago

What is the Overhead column in Statistics? (we had average cost per km before)

cost per km has been replaced with cost per 100 km / 100 mi (like in Charging Stats).

Overhead is 1 - (Consumption (net) / Consumption (gross)) - i.e. the share of energy not used while driving. Agreed, there could be a better name for it.
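Spelled out, the formula gives the share of gross energy that never reaches the wheels. A one-line sketch with made-up numbers:

```python
# Overhead = 1 - net/gross: the share of energy not used while driving.
def overhead(consumption_net, consumption_gross):
    return 1 - consumption_net / consumption_gross

# e.g. 150 Wh/km net vs 200 Wh/km gross (made-up values)
print(overhead(150, 200))  # 0.25
```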

JakobLichterfeld commented 1 week ago

Trip: the last three drives get selected automatically without the need to open a new tab, loving it. Wait, it is the last 7 days - that's typically not a trip imo.

swiffer commented 1 week ago

Trip: the last three drives get selected automatically without the need to open a new tab, loving it. Wait, it is the last 7 days - that's typically not a trip imo.

Agreed - should we set it to the last 24 h?

JakobLichterfeld commented 1 week ago

Trip: values seem a bit more accurate. Mileage differs a bit, and consumption.

swiffer commented 1 week ago

Trip: values seem a bit more accurate. Mileage differs a bit, and consumption.

good to hear, should be more accurate indeed.

swiffer commented 1 week ago

The Statistics dashboard is badly slow in this PR - 400x slower or more, still loading.

can you do an explain analyze for this query:

explain analyze WITH data AS (
    SELECT
        p.car_id,
        p.drive_id,
        p.date as __date,
        date_trunc('month', TIMEZONE('UTC', p.date)) as date,
        lag(p.rated_battery_range_km) OVER (ORDER BY p.date) - p.rated_battery_range_km AS range_loss,
        p.odometer - lag(p.odometer) OVER (ORDER BY p.date) AS distance
    from positions p
    WHERE
      p.car_id = '1' and
      p.ideal_battery_range_km is not null and
      p.date BETWEEN '2014-11-16T09:06:22.706Z' AND '2024-11-16T09:06:22.706Z'
    ORDER BY date
)
SELECT
  EXTRACT(EPOCH FROM date)*1000 AS date_from,
  EXTRACT(EPOCH FROM date + interval '1 month')*1000 AS date_to,
  CASE 'month'
    WHEN 'month' THEN to_char(date, 'YYYY Month')
    WHEN 'year' THEN to_char(date, 'YYYY')
    WHEN 'week' THEN 'week ' || to_char(date, 'WW') || ' starting ' || to_char(date, 'YYYY-MM-DD')
    ELSE to_char(date, 'YYYY-MM-DD')
  END AS display,
  date,
  SUM(range_loss * car.efficiency * 1000) / nullif(convert_km(sum(distance)::numeric, 'km'), 0) as consumption_gross_km
  FROM data
  JOIN cars car on car.id = car_id
  where
    -- driving
    drive_id is not null
    -- not driving, but also not charging
    or range_loss >= 0 and distance = 0
    -- alternative approach to detect periods of not driving, but also not charging
    -- or not exists (select 1 from charging_processes cp where __date >= cp.start_date and (__date <= cp.end_date or (cp.end_date is null and EXTRACT(hour FROM __date - cp.start_date) < 72)))
  GROUP BY date;

JakobLichterfeld commented 1 week ago

Sorry for posting the findings first 🙂

This PR is a massive step towards well-aligned and standardized dashboards. The column naming is consistent and understandable, and the column widths are much better. The look and feel is all of a piece.

Thanks for your great work!

JakobLichterfeld commented 1 week ago
  • I've changed the queries for Ø Consumption (gross) in Statistics and Trip dashboards. In my instance they are nearly 100% accurate to Ø Consumption (gross) in Efficiency dashboard (we cannot use the same query there as it's not working correctly when applying a time filter). Could you review in your instance?

Same on my end, values look ~100% accurate.

JakobLichterfeld commented 1 week ago

Overhead is 1 - (Consumption (net) / Consumption (gross)) - i.e. the share of energy not used while driving. Agreed, there could be a better name for it.

I assumed that, but it was not obvious 😉

JakobLichterfeld commented 1 week ago

Battery Health:

axis labels missing:

fixed

JakobLichterfeld commented 1 week ago

Charging Stats: Missing labels:

fixed

JakobLichterfeld commented 1 week ago

Charging Stats: Top Charging stations get truncated above 1MWh.

fixed: [screenshot]

JakobLichterfeld commented 1 week ago

The Statistics dashboard is badly slow in this PR - 400x slower or more, still loading.

can you do an explain analyze for this query:

                                                                                                             QUERY PLAN                                                                                                             
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=209611.34..224660.28 rows=200 width=112) (actual time=66832.775..68996.806 rows=62 loops=1)
   Group Key: data.date
   ->  Nested Loop  (cost=209611.34..219912.98 rows=316020 width=56) (actual time=66342.040..67936.658 rows=303713 loops=1)
         Join Filter: (car.id = data.car_id)
         ->  Subquery Scan on data  (cost=209611.34..215171.67 rows=316020 width=50) (actual time=66306.165..67284.680 rows=303713 loops=1)
               Filter: ((data.drive_id IS NOT NULL) OR ((data.range_loss >= '0'::numeric) AND (data.distance = '0'::double precision)))
               Rows Removed by Filter: 10851
               ->  Sort  (cost=209611.34..210405.67 rows=317733 width=62) (actual time=66163.455..66805.873 rows=314564 loops=1)
                     Sort Key: (date_trunc('month'::text, timezone('UTC'::text, p.date)))
                     Sort Method: external merge  Disk: 15408kB
                     ->  WindowAgg  (cost=0.94..168625.58 rows=317733 width=62) (actual time=1253.728..65080.349 rows=314564 loops=1)
                           ->  Index Scan using "positions_car_id_date__ideal_battery_range_km_IS_NOT_NULL_index" on positions p  (cost=0.42..159887.93 rows=317733 width=28) (actual time=1253.421..59801.740 rows=314564 loops=1)
                                 Index Cond: ((car_id = '1'::smallint) AND (date >= '2014-11-16 09:06:22.706'::timestamp without time zone) AND (date <= '2024-11-16 09:06:22.706'::timestamp without time zone))
         ->  Materialize  (cost=0.00..1.01 rows=1 width=10) (actual time=0.001..0.001 rows=1 loops=303713)
               ->  Seq Scan on cars car  (cost=0.00..1.01 rows=1 width=10) (actual time=35.741..35.741 rows=1 loops=1)
 Planning Time: 35.128 ms
 JIT:
   Functions: 24
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 103.193 ms (Deform 8.144 ms), Inlining 0.000 ms, Optimization 231.209 ms, Emission 1116.938 ms, Total 1451.339 ms
 Execution Time: 69183.871 ms
(21 rows)
swiffer commented 1 week ago

@JakobLichterfeld - thanks, can you try the query below and compare the values returned as well as the execution speed (via explain analyze)?

data is sampled via charging_processes instead of positions (bringing it in line with the Efficiency dashboard).

downside of this approach: the shorter the periods, the more likely there aren't enough charging_processes to get (reliable) data (at least 2 charging_processes are needed within a period).

two options to deal with this:

with gen_date_series as (
    select generate_series(date_trunc('month', '2014-11-16T09:31:16.097Z'::timestamptz), date_trunc('month', '2024-11-16T09:31:16.097Z'::timestamptz) + interval '1 month', interval '1 month') as series_id
)
, date_series as (
    select
        series_id at time zone 'UTC' as series_id,
        -- before joining, get beginning of next series to be able to left join `charging_processes`
        lead(series_id) over (order by series_id asc) at time zone 'UTC' as next_series_id
    from
        gen_date_series
)
, periods_x_charging_processes as (
    select
        ds.*,
        lag(cp.end_ideal_range_km) over (partition by ds.series_id order by cp.start_date) as start_ideal_range_km,
        lag(cp.end_date) over (partition by ds.series_id order by cp.start_date) as start_date,
        lag(p.odometer) over (partition by ds.series_id order by cp.start_date) as start_odometer,
        case
            when lag(cp.end_ideal_range_km) over (partition by ds.series_id order by cp.start_date) is null then null
            else cp.start_ideal_range_km
        end as end_ideal_range_km,
        case
            when lag(cp.end_date) over (partition by ds.series_id order by cp.start_date) is null then null
            else cp.start_date
        end as end_date,
        case
            when lag(p.odometer) over (partition by ds.series_id order by cp.start_date) is null then null
            else p.odometer
        end as end_odometer,
        count(start_date) over (partition by ds.series_id) as charging_processes_count
    from date_series ds
        left join charging_processes cp on ds.series_id <= cp.start_date and ds.next_series_id >= cp.end_date
        left join positions p on cp.position_id = p.id
    where cp.car_id = 1
)
, periods_incl_range_loss_between_charges as (
    select
        series_id,
        next_series_id,
        sum(start_ideal_range_km - end_ideal_range_km) as range_loss,
        max(end_odometer) - min(start_odometer) as distance,
        min(start_date) as start_date, max(end_date) as end_date,
        (sum(start_ideal_range_km - end_ideal_range_km) * cars.efficiency * 1000) / nullif(convert_km((max(end_odometer) - min(start_odometer))::numeric, 'km'), 0) as consumption_gross_km
    from periods_x_charging_processes
        cross join cars
    where (start_ideal_range_km is not null or charging_processes_count = 0) and cars.id = 1
    group by 1, 2, cars.efficiency
    order by 1
)
select * from periods_incl_range_loss_between_charges;

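The lag-based pairing in the query can be illustrated in a few lines of Python: within a period, each gap between one charge's end and the next charge's start contributes a range loss and a distance (field names are hypothetical; note that at least two charges are needed to form a gap, which is the caveat mentioned above):

```python
# Illustrative pairing of consecutive charging processes within one period:
# the range/odometer delta between a charge's end and the next charge's start
# is the gross loss for that gap (field names are hypothetical).
def gaps_between_charges(charges):
    # charges: list of dicts sorted by start_date, all within one period
    gaps = []
    for prev, cur in zip(charges, charges[1:]):
        gaps.append({
            "range_loss": prev["end_range_km"] - cur["start_range_km"],
            "distance": cur["odometer"] - prev["odometer"],
        })
    return gaps  # empty if fewer than 2 charges in the period

charges = [
    {"end_range_km": 320.0, "start_range_km": 250.0, "odometer": 1000.0},
    {"end_range_km": 310.0, "start_range_km": 260.0, "odometer": 1300.0},
    {"end_range_km": 330.0, "start_range_km": 240.0, "odometer": 1650.0},
]
print(gaps_between_charges(charges))
```
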
swiffer commented 1 week ago

@JakobLichterfeld - you can skip my last comment - please try the latest commit, performance should be way better and data should still be accurate!

JakobLichterfeld commented 1 week ago

@JakobLichterfeld - you can skip my last comment - please try the latest commit, performance should be way better and data should still be accurate!

Ty. Will do by tomorrow or Monday

swiffer commented 1 week ago

Ty. Will do by tomorrow or Monday

thanks for spending the time to review the changes in detail. I've used the time to implement a high precision mode in Statistics (off by default) and aligned the query with the one used in the Trip dashboard (high precision mode is enabled by default if the time period is smaller than 10 days).

sdwalker commented 1 week ago

Ty. Will do by tomorrow or Monday

thanks for spending the time to review the changes in detail. I've used the time to implement a high precision mode in Statistics (off by default) and aligned the query with the one used in the Trip dashboard (high precision mode is enabled by default if the time period is smaller than 10 days).

With high precision = no, several months are without any consumption (gross) values. With high precision = yes, the values are present.

Statistics: [screenshot]

swiffer commented 1 week ago

@sdwalker - thanks for reviewing the changes!

How many charges / drives have been recorded in the periods where consumption (gross) values are missing?

sdwalker commented 1 week ago

@sdwalker - thanks for reviewing the changes!

How many charges / drives have been recorded in the periods where consumption (gross) values are missing?

[screenshot]

sdwalker commented 1 week ago

Missing consumption (gross) values are fixed with 25c35ef

JakobLichterfeld commented 6 days ago

Ty. Will do by tomorrow or Monday

thanks for spending the time to review the changes in detail. I've used the time to implement a high precision mode in Statistics (off by default) and aligned the query with the one used in the Trip dashboard (high precision mode is enabled by default if the time period is smaller than 10 days).

Performance of the Statistics dashboard is still very bad - no result after >1 minute on a RPi 3B+.

swiffer commented 6 days ago

Ok, that's odd. Could you run an explain analyze of query 4 in the Statistics dashboard for me?

(High precision mode is disabled in your case?)

JakobLichterfeld commented 6 days ago

Ok, that's odd. Could you run an explain analyze of query 4 in the Statistics dashboard for me?

For sure. Is it cheeky to ask for the direct query? (I was not successful in running the rawSql statement.)

I used (assume date is missing):

explain analyze with gen_date_series as (

  select 
    generate_series(
      date_trunc('$period', to_timestamp(${__from:date:seconds})),
      date_trunc('$period', to_timestamp(${__to:date:seconds})),
      interval '1 $period'
    ) as series_id

),

date_series as (

  select
    greatest(series_id, to_timestamp(${__from:date:seconds})) at time zone 'UTC' as series_id,
    least(lead(series_id) over (order by series_id asc), to_timestamp(${__to:date:seconds})) at time zone 'UTC' as next_series_id
  from gen_date_series
  where series_id >= (select date_trunc('$period', p.\"date\" at time zone 'UTC') from positions p where car_id = $car_id and ideal_battery_range_km is not null order by date asc limit 1)

),

buckets_x_charging_processes as (

  select
    ds.*,
    coalesce(cp.car_id, $car_id) as car_id,
    first_value(cp.start_date) over w as min_cp_start_date,
    last_value(cp.end_date) over w as max_cp_end_date,
    first_value(cp.start_${preferred_range}_range_km) over w as range_before_first_cp,
    first_value(p.odometer) over w as odometer_before_first_cp,
    last_value(cp.end_${preferred_range}_range_km) over w as range_after_last_cp,
    lag(cp.end_${preferred_range}_range_km) over w as start_range_km,
    lag(p.odometer) over w as start_odometer,
    case
      when lag(cp.end_${preferred_range}_range_km) over w is null then null
      else cp.start_${preferred_range}_range_km
    end as end_range_km,
    case
      when lag(p.odometer) over w is null then null
      else p.odometer
    end as end_odometer,
    count(start_date) over w as charging_processes_count
  from date_series ds
    left join charging_processes cp on ds.series_id <= cp.start_date and ds.next_series_id >= cp.end_date
    left join positions p on cp.position_id = p.id
  where cp.car_id = $car_id or cp.car_id is null
  window w as (partition by ds.series_id order by cp.start_date rows between unbounded preceding and unbounded following)

),

buckets_incl_charging_processes as (

  select
    series_id,
    next_series_id,
    car_id,
    coalesce(min_cp_start_date, next_series_id) as start_date,
    coalesce(max_cp_end_date, next_series_id) as end_date,
    range_before_first_cp,
    odometer_before_first_cp,
    range_after_last_cp,
    coalesce(sum(start_range_km - end_range_km), 0) as range_loss,
    coalesce(max(end_odometer) - min(start_odometer), 0) as distance
  from buckets_x_charging_processes
  where start_range_km is not null or charging_processes_count < 2
  group by 1, 2, 3, 4, 5, 6, 7, 8

),

buckets_incl_charging_processes_x_surrounding_data as (

  select
    b_incl_cp.*,
    date_trunc('$period', timezone('UTC', series_id)) as date,
    case
      when coalesce(d.start_date, p.\"date\") = first_value(coalesce(d.start_date, p.\"date\")) over w 
        and coalesce(d.start_date, p.\"date\") >= b_incl_cp.end_date
      then range_after_last_cp
      else coalesce(d.start_${preferred_range}_range_km, p.${preferred_range}_battery_range_km)
    end - coalesce(
      lead(coalesce(d.start_${preferred_range}_range_km, p.${preferred_range}_battery_range_km)) over w,
      case
        when coalesce(d.start_date, p.\"date\") < b_incl_cp.start_date
        then range_before_first_cp
        else null
      end
    ) AS range_loss_b_and_a_cp,
    coalesce(
      lead(coalesce(d.start_km, p.odometer)) over w,
      case
        when coalesce(d.start_date, p.\"date\") < b_incl_cp.start_date
        then odometer_before_first_cp
        else null
      end
    ) - coalesce(d.start_km, p.odometer) AS distance_b_and_a_cp
  from buckets_incl_charging_processes b_incl_cp
    left join drives d on
      0 = $high_precision
      and d.car_id = b_incl_cp.car_id
      and (
        d.start_date >= b_incl_cp.series_id
        and d.end_date < b_incl_cp.start_date
        or d.start_date > b_incl_cp.end_date
        and d.end_date < b_incl_cp.next_series_id
      )
    left join positions p on
      1 = $high_precision
      and p.ideal_battery_range_km is not null
      and p.car_id = b_incl_cp.car_id
      and (
        p.\"date\" >= b_incl_cp.series_id
        and p.\"date\" < b_incl_cp.start_date
        or p.\"date\" > b_incl_cp.end_date
        and p.\"date\" <= b_incl_cp.next_series_id
      )
  window w as (partition by b_incl_cp.series_id, coalesce(d.start_date, p.\"date\") >= b_incl_cp.end_date ORDER BY coalesce(d.start_date, p.\"date\"))

)

select
  EXTRACT(EPOCH FROM date)*1000 AS date_from,
  EXTRACT(EPOCH FROM date + interval '1 $period')*1000 AS date_to,
  CASE '$period'
    WHEN 'month' THEN to_char(date, 'YYYY Month')
    WHEN 'year' THEN to_char(date, 'YYYY')
    WHEN 'week' THEN 'week ' || to_char(date, 'WW') || ' starting ' || to_char(date, 'YYYY-MM-DD')
    ELSE to_char(date, 'YYYY-MM-DD')
  END AS display,
  date,
  ((range_loss + coalesce(sum(range_loss_b_and_a_cp), 0)) * c.efficiency * 1000) / nullif(convert_km((distance + coalesce(sum(distance_b_and_a_cp), 0))::numeric, '$length_unit'), 0) as consumption_gross_$length_unit
from buckets_incl_charging_processes_x_surrounding_data
  inner join cars c on car_id = c.id
  where range_loss_b_and_a_cp >= 0 and distance_b_and_a_cp = 0 or distance_b_and_a_cp > 0 or range_loss_b_and_a_cp is null and distance_b_and_a_cp is null
  group by 1, 2, 3, 4, range_loss, distance, c.efficiency
  order by date desc
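
For intuition, the gen_date_series/date_series CTEs above just build consecutive period buckets. A rough Python equivalent for monthly buckets (illustrative only, not an exact translation of generate_series):

```python
from datetime import date

# Illustrative month bucketing akin to generate_series + date_trunc + lead():
# each bucket pairs a month start with the start of the following month.
def month_buckets(start, end):
    buckets = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        ny, nm = (y + 1, 1) if m == 12 else (y, m + 1)
        buckets.append((date(y, m, 1), date(ny, nm, 1)))
        y, m = ny, nm
    return buckets

print(month_buckets(date(2024, 11, 1), date(2025, 1, 15)))
```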