adobe / helix-run-query

service that executes queries on BigQuery datasets generated by Helix-Logging
Apache License 2.0

feat(pageviews): add a pageview and url forecast for the current time period #1055

Closed trieloff closed 9 months ago

trieloff commented 9 months ago

If the current time period is still in progress, data visualizations often look bad because they show a drop in the current day, week, quarter, etc. This commit adds a simple forecast for the current period, based on a heuristic that gives 50% weight to the current period's data and 50% to the past seven data points, so that a more accurate estimate can be shown.
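For illustration only, the blending idea can be sketched in Python. This is not the query added by this PR. The function name, its parameters, and the linear run-rate extrapolation of the partial period are assumptions made for the sketch, since the description does not spell out how the in-progress period is scaled. It is therefore not expected to reproduce the figures in the output below.

```python
from statistics import mean
from typing import Optional, Sequence


def forecast_current_period(
    observed: float,
    elapsed_fraction: float,
    previous: Sequence[float],
    window: int = 7,
    weight: float = 0.5,
) -> Optional[float]:
    """Blend a projection of the in-progress period with recent history.

    observed:         value counted so far in the current period
    elapsed_fraction: share of the period already elapsed, in (0, 1]
                      (hypothetical input, not named in the PR)
    previous:         values of completed periods, most recent first
    """
    if not 0 < elapsed_fraction <= 1:
        raise ValueError("elapsed_fraction must be in (0, 1]")

    recent = list(previous[:window])
    if not recent:
        # No completed periods to blend with: no forecast.
        return None

    # Assumption: scale the partial period linearly to a full period.
    projected = observed / elapsed_fraction

    # Equal weighting by default: half projection, half recent average.
    return weight * projected + (1 - weight) * mean(recent)


# Halfway through a period with 50 views so far, after periods of 80 and 120:
print(forecast_current_period(50, 0.5, [80, 120]))  # 100.0
```

Returning `None` when there is no history is a choice made for this sketch; a caller could just as well fall back to the observed value.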

./dev.sh rum-pageviews url www.aem.live granularity 7
Waiting on bqjob_r3cefbe447a319e1c_0000018db2325afd_1 ... (0s) Current status: DONE
  year   month   day            time            url   pageviews   pageviews_forecast   url_forecast
 ------ ------- ----- ------------------------ ----- ----------- -------------------- --------------
  2024       2    12   2024-02-12 00:00:00+00    34        8552                10470             37
  2024       2     5   2024-02-05 00:00:00+00    41       11671                11671             41
  2024       1    29   2024-01-29 00:00:00+00    39        9920                 9920             39
  2024       1    22   2024-01-22 00:00:00+00    39        9500                 9500             39
  2024       1    15   2024-01-15 00:00:00+00    42        8081                 8081             42
  2024       1     8   2024-01-08 00:00:00+00    35        8300                 8300             35
  2024       1     1   2024-01-01 00:00:00+00    22        4630                 4630             22
  2023      12    25   2023-12-25 00:00:00+00    10        2210                 2210             10
  2023      12    18   2023-12-18 00:00:00+00    27        5560                 5560             27
  2023      12    11   2023-12-11 00:00:00+00    41        8361                 8361             41
  2023      12     4   2023-12-04 00:00:00+00    38        9170                 9170             38
  2023      11    27   2023-11-27 00:00:00+00    35        7610                 7610             35
  2023      11    20   2023-11-20 00:00:00+00    39       10290                10290             39
  2023      11    13   2023-11-13 00:00:00+00    35       10210                10210             35
  2023      11     6   2023-11-06 00:00:00+00    33        8542                 8542             33
  2023      10    30   2023-10-30 00:00:00+00    36        9190                 9190             36
  2023      10    23   2023-10-23 00:00:00+00    26        7920                 7920             26
  2023      10    16   2023-10-16 00:00:00+00    22        6940                 6940             22
  2023      10     9   2023-10-09 00:00:00+00    15        2710                 2710             15
  2023      10     2   2023-10-02 00:00:00+00     7         540                  540              7

github-actions[bot] commented 9 months ago

This PR will trigger a minor release when merged.

langswei commented 9 months ago

Overall good. My only question is about an edge case -- when we return a single record, should pageviews_forecast return null as it currently does, or should it return the same value as pageviews? I think I'm okay with the former and will approve this PR, but I can also see the case for returning the latter. If null puts us in some weird spot with the visualization later, we can revisit.
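To make the two options concrete, here is a small hypothetical Python sketch of what a consumer of the result could do if the null turns out to be awkward. The helper name is made up for illustration and is not part of this PR.

```python
from typing import Optional


def forecast_or_observed(observed: float, forecast: Optional[float]) -> float:
    """Use the forecast when present, otherwise fall back to the observed value."""
    return observed if forecast is None else forecast


# Single-record result: no forecast available, so the observed value is used.
print(forecast_or_observed(540, None))     # 540
# Regular case: the forecast is passed through unchanged.
print(forecast_or_observed(8552, 10470))   # 10470
```

Keeping the null in the query output and applying a fallback like this on the client side would leave both behaviors available.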

trieloff commented 9 months ago

:tada: This PR is included in version 3.14.0 :tada:

Your semantic-release bot :package::rocket: