openeemeter / caltrack

Shared repository for documentation and testing of CalTRACK methods
http://docs.caltrack.org
Creative Commons Zero v1.0 Universal

CalTRACK Issue: "Month type" or "week type" control to address intercept-only model issues with more frequent payments #130

Closed: carolinemfrancispge closed this issue 1 year ago

carolinemfrancispge commented 5 years ago

Prerequisites

Article reference number in CalTRACK documentation (optional): 3.4.3.1.4

Description

The daily CalTRACK methods seek to find heating and cooling degree day balance points for participating sites. However, it is sometimes not possible to model the relationship between a site's usage and HDDs and CDDs because the site's usage is irregular. In these cases, the site defaults to an intercept-only model.

This can be disappointing, but it's by design: some sites are simply difficult to model, and that's part of the reason for using a portfolio approach. However, it can present difficulties for natural gas savings analysis, particularly when an aggregator is receiving payments throughout the year. An intercept-only model is fitted to both heating season and cooling/shoulder season data, so usage may often fall above the intercept (which is the counterfactual in this case) during heating season and below it during other times of the year. Over a full year everything will likely even out, but during the winter it will look like the site has negative savings. This can be a problem for an aggregator who is paid on a monthly or quarterly basis.

There are various ways one could structure payments to overcome this issue, but I'm curious to know if we could address it within CalTRACK by identifying "heating months" (or maybe "heating weeks") specific to a site, including an indicator variable for heating period (month or week) in the regression, and basically creating two intercept-only models--one for heating and one for non-heating.
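Concretely, the two-intercept idea amounts to a single regression with a heating-period indicator; fitting it is equivalent to fitting one intercept-only model per period. The data and the heating flag below are entirely synthetic and are only meant to illustrate the mechanics:

```python
import numpy as np

# Hypothetical daily gas usage (therms) for one site: higher during an
# assumed 120-day heating period, flat otherwise. In practice the
# heating flag would come from a period-identification step.
rng = np.random.default_rng(0)
is_heating = np.repeat([1, 0], [120, 245])          # 120 heating days, 245 non-heating
usage = np.where(is_heating, 2.5, 0.8) + rng.normal(0, 0.2, 365)

# Design matrix: a base intercept plus a heating-period indicator.
X = np.column_stack([np.ones(365), is_heating])
beta, *_ = np.linalg.lstsq(X, usage, rcond=None)

non_heating_intercept = beta[0]            # counterfactual outside heating season
heating_intercept = beta[0] + beta[1]      # counterfactual during heating season
```

The fitted coefficients recover the two period means, so winter usage is compared against a winter-appropriate counterfactual rather than a single annual average.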

My proposal is to test strategies for identifying heating periods (one option: take the 3rd or 5th percentile of gas usage as a baseload estimate and label periods where usage stays within some percentage of it as "non-heating", but I'm open to other suggestions), then see whether including a heating period indicator in the intercept-only model improves fit.
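A minimal sketch of that percentile screen, with illustrative (not proposed) values for the percentile and the tolerance:

```python
import numpy as np

# Sketch of the proposed heating-period screen: take a low percentile of
# daily gas usage as the "baseload", then call any month whose average
# stays within a chosen tolerance of that baseload a non-heating month.
# The 5th percentile and 25% tolerance are illustrative choices only.
def classify_months(daily_usage, month_labels, pct=5, tolerance=0.25):
    baseload = np.percentile(daily_usage, pct)
    heating = {}
    for m in np.unique(month_labels):
        monthly_mean = daily_usage[month_labels == m].mean()
        # Heating month if the mean exceeds baseload by more than the tolerance.
        heating[m] = monthly_mean > baseload * (1 + tolerance)
    return heating

# Toy example: months 1-3 have heavy use, months 6-8 sit near baseload.
months = np.repeat([1, 2, 3, 6, 7, 8], 30)
usage = np.repeat([3.0, 2.8, 1.9, 0.5, 0.5, 0.6], 30)
flags = classify_months(usage, months)
```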

Note: this issue likely only applies to single family homes, and may mostly apply to temperate areas (e.g., northern California) where even during heating season, customers don't use much gas. I'm curious to hear feedback on whether people think this would be an issue elsewhere.

Proposed test methodology

If we decide to agree to a uniform testing approach, I'll follow that. Otherwise, I think the test methodology proposed in #117 could work well here, too.

Acceptance Criteria

  1. Agreement by the group that this is an issue that is worth addressing
  2. Improvement in model fit as per the test methodology
steevschmidt commented 5 years ago

I agree the daily methods should incorporate some method to model monthly variations in building conditions. Thirty data points per month is sufficient to identify such month-by-month variations. We do monthly analysis of residential buildings and regularly see fluctuations in natural gas use (including changing intercepts and balance point temperatures) throughout the year.

To maintain some consistency in methods, could we use the existing hourly model -- or a dumbed down version of it -- with daily data?

KMonsees-NYSERDA commented 5 years ago

Hi all,

It seems this issue is in line with the work that was performed under Issue #57: Include calendar effects (day-of-week, month-of-year, holidays) in daily model?

It seems the working group at the time was able to demonstrate that using dummy variables for day-of-week and month-of-year in a daily model had the potential to improve out-of-sample predictions. Reductions in normalized mean bias error (NMBE) were also demonstrated when using robust regression with a categorical variable for weekdays vs. weekends. However, no modifications to the CalTRACK methodology were made as a result of this work because the focus at the time was improving annual predictions, not monthly predictions.
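For reference, the month-of-year dummy approach tested under #57 amounts to one-hot encoding the month and adding the columns to the design matrix. A minimal sketch with weather terms omitted and synthetic, purely seasonal data:

```python
import numpy as np

# Minimal sketch of month-of-year dummy variables in a daily model.
# HDD/CDD terms are omitted; the point is only the categorical encoding.
def month_dummies(month_of_year):
    # 11 dummy columns (January is the reference level).
    months = np.asarray(month_of_year)
    return np.column_stack([(months == m).astype(float) for m in range(2, 13)])

months = np.tile(np.arange(1, 13), 30)            # 360 synthetic "days"
seasonal = np.cos((months - 1) * 2 * np.pi / 12)  # winter-peaking usage pattern
X = np.column_stack([np.ones(months.size), month_dummies(months)])
beta, *_ = np.linalg.lstsq(X, seasonal, rcond=None)
pred = X @ beta
```

Because the response here depends only on the month, the twelve-parameter model reproduces it exactly; with real data the dummies instead absorb the seasonal pattern that an intercept-only model misses.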

I propose that we reopen Issue #57 in conjunction with Issue 130 and issues like it (see Issue #127) and reevaluate the findings using data from a range of residential and commercial buildings from California, New York, etc. Now that there is a strong use case for accurate monthly predictions (i.e. increasing the frequency of performance payments in a Pay for Performance context), it makes sense as a starting point for further analysis, especially since the CalTRACK working group has already demonstrated the potential of this approach.

I strongly believe we need to keep Issue 130 and issues like it open, as being able to increase the frequency of performance payments in P4P initiatives is a major priority for NYSERDA (and others) and could impact the success of P4P as a framework for incentivizing energy efficiency. However, I’ll need some help with the data analysis, as well as finding suitable data sets outside of New York.

Looking forward to feedback from everyone!

steevschmidt commented 5 years ago

...there is a strong use case for accurate monthly predictions...

Agreed!

Although I would restate this slightly to "accurate month-by-month savings calculations", so as not to unnecessarily assume we need to limit the solution space to monthly/weekly models, or reporting periods that are only a month long (see #127).

One approach that might be useful is to build a data set of buildings currently assigned to intercept-only models by CalTRACK, then use this data set to test the efficacy of different approaches.

steevschmidt commented 5 years ago

Related: It appears the main conclusion of #57 was a decision not to switch to a robust regression model, which was only one possible approach to address the bigger topic of that issue.

HEA came to the same conclusion in 2012 after evaluating the Theil-Sen robust method on residential energy data. We did, however, choose to employ a LOESS local regression method in 2014 and still use it today with [we believe] good results.
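For readers unfamiliar with LOESS, a bare-bones sketch of tricube-weighted local linear regression follows. This is not HEA's implementation; production versions (e.g. statsmodels' lowess) add robustness iterations to downweight outliers:

```python
import numpy as np

# Bare-bones LOESS: at each point, fit a weighted linear regression to
# the nearest neighbors, with tricube weights that fall to zero at the
# edge of the local bandwidth.
def loess(x, y, frac=0.2):
    n = len(x)
    k = max(2, int(np.ceil(frac * n)))            # neighbors per local fit
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        h = np.sort(d)[k - 1]                     # bandwidth: k-th nearest distance
        w = np.clip(1 - (d / h) ** 3, 0, 1) ** 3  # tricube weights
        X = np.column_stack([np.ones(n), x - x[i]])
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        fitted[i] = beta[0]                       # local fit evaluated at x[i]
    return fitted

x = np.linspace(0, 10, 50)
y = np.sin(x)
smooth = loess(x, y)
```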

KMonsees-NYSERDA commented 5 years ago

Thanks for weighing in @steevschmidt.

As per @jkoliner’s guidance on working group procedures, I’ll clarify my proposal to extend the work completed under Issue #57 as part of Issue #130.

Additional Description: During the initial testing of Issue #57, four models were fit in accordance with the CalTRACK methodology at that time, using data from 100 residential buildings. The model with the best fit statistics (R^2, etc.) was selected.

The models were then modified using a dummy variable for day-of-week, and then for month-of-year. When these models were tested, they produced better results than the model without the dummy variable when predicting consumption less than 365 days out (at 9 days and 180 days out for the two models, respectively). However, there was not a large difference in performance when predicting 365 days out.

Considering use cases like making sub-annual Pay-for-Performance payments could benefit from making predictions sooner than one year (i.e. on a monthly basis), it seems re-testing through a new lens would be an appropriate place to start testing Issue #130.

Expansion of Proposed test methodology

1) Construct an appropriate data set(s) for testing and split each set into two periods for in-sample/out-of-sample testing. Heterogeneous data sets using data from around the country would be preferred, but obviously there are limitations to constructing something like this. We might want to consider testing on a billing and an interval data set as well.

2) Test the following models with each data set:
   a. A model constructed using the current daily CalTRACK specifications
   b. Current CalTRACK specification plus day-of-week dummy variable
   c. Current CalTRACK specification plus month-of-year dummy variable
   d. Current CalTRACK specification with robust regression

3) Depending on the performance of the other models, test the use of a “heating period” variable for each data set as proposed by @carolinemfrancispge:
   a. Current CalTRACK specification plus “heating period” variable (naïve, monthly level; try select traditional heating and non-heating month groupings)
   b. Current CalTRACK specification plus “heating period” variable (weekly level; would need to discuss various ways of defining/identifying a “heating week”)
   c. Additional model variations as determined to be useful by tester(s)

4) Calculate the Normalized Mean Bias Error (NMBE) using each month's observed and predicted values

5) Select model with best performance based on monthly NMBE for working group consideration
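Step 4 depends on the NMBE definition chosen; one common form (sign conventions and normalizations vary across references) computed month by month could look like this:

```python
import numpy as np

# One common NMBE definition, computed per month:
#   NMBE = 100 * sum(observed - predicted) / sum(observed)
def monthly_nmbe(observed, predicted, month_labels):
    out = {}
    for m in np.unique(month_labels):
        mask = month_labels == m
        out[m] = 100 * (observed[mask] - predicted[mask]).sum() / observed[mask].sum()
    return out

# Synthetic two-month example: an intercept-only counterfactual (1.5)
# under-predicts the high-usage month and over-predicts the low one,
# which is exactly the seasonal bias this issue describes.
months = np.repeat([1, 2], 30)
obs = np.concatenate([np.full(30, 2.0), np.full(30, 1.0)])
pred = np.full(60, 1.5)
nmbe = monthly_nmbe(obs, pred, months)
```

The two monthly values here come out equal and opposite in sign, illustrating how errors that cancel annually can still distort monthly payments.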

Acceptance Criteria

Since the primary goal of this work would be to improve monthly estimates with as little overall impact and complexity as possible:

a) Monthly NMBE shifts closer to zero for at least 6 months out of the year
b) Annual NMBE remains constant or moves closer to zero in comparison to the current CalTRACK model specification
c) New model does not appear to be overfitting
d) CV(RMSE) remains constant or decreases in comparison to the current CalTRACK model specification

Very open to thoughts and ideas on how to improve.

steevschmidt commented 5 years ago

One experiment we ran to check the effects of seasonality was to apply the model developed from a 12 month baseline period to each individual month within the same period. Our premise was that a "good" model should have similar CVRMSE values across all 12 months.
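That per-month check could be sketched as follows, taking CV(RMSE) = 100 * RMSE / mean(observed) within each month (synthetic data, intercept-only prediction; not HEA's actual code):

```python
import numpy as np

# Apply one model's predictions to each month of the baseline period and
# compute CV(RMSE) month by month. A large spread across months suggests
# unmodeled seasonality.
def monthly_cvrmse(observed, predicted, month_labels):
    out = {}
    for m in np.unique(month_labels):
        mask = month_labels == m
        rmse = np.sqrt(np.mean((observed[mask] - predicted[mask]) ** 2))
        out[m] = 100 * rmse / observed[mask].mean()
    return out

# Synthetic 12-month baseline: months 1-3 use 4x the baseload, and the
# prediction is the annual mean (an intercept-only counterfactual).
months = np.repeat(np.arange(1, 13), 30)
obs = np.where(np.isin(months, [1, 2, 3]), 4.0, 1.0)
pred = np.full(obs.size, obs.mean())
cv = monthly_cvrmse(obs, pred, months)
```

A model that captured the seasonality would bring the monthly values closer together, which is the "similar CVRMSE across all 12 months" premise described above.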

danrubado commented 4 years ago

What is the consequence of this issue for monthly models? Similar effects occur with monthly baseline-only models and I don't believe there is currently a way to try to make seasonal adjustments in the absence of HDD/CDD variables. Is there a similar solution that could be proposed and tested for monthly methods?

KMonsees-NYSERDA commented 4 years ago

This is definitely still an issue when using billing data. From my understanding, we would be able to approach testing for billing and daily methods in the same way when applying a monthly-level variable (month-of-year dummy variable, monthly "heating period" variable). I imagine this would be a good place to start before diving into more granular solutions since I believe simplicity is still a priority for the working group.

philngo-recurve commented 1 year ago

Closing stale issue in preparation for new working group