@houghb This is a really interesting writeup. I wonder if you could briefly enumerate your proposed use cases for this approach (#2). Are you suggesting that we use it to cull our 1,000 meter dataset? Or are you suggesting a broader application of this technique?
I am proposing that we remove the premises identified above any time we use our models to make predictions. I don't think we should cull the 1000 home dataset, but before reporting any summary or output statistics we should remove these premises from the set of results. We would do the same when generating weather normalized savings estimates before we enter the aggregation steps.
@houghb I would feel more comfortable putting this in as a recommendation under the Aggregation section. In both your proposal above and in our aggregation recommendations, we are providing guidance on how to deal with the effects of outliers or non-standard distributions. However, we shouldn't withhold (i.e., censor) the outputs; rather, we should bring attention to them and suggest good methods for handling them (as above).
Closing this now that a final recommendation has been made on aggregation.
On the call last week we acknowledged that there are significant outliers in the testing dataset, and these outliers are making our output statistics less useful.
We explored some different ways to identify outliers during the model selection process (where we train our models on one year of pre-treatment data, then test the model performance on a second year of pre-treatment data). The outlier detection approach we are proposing can also be used in the final specs to remove outliers from weather normalized savings estimates.
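To make that selection step concrete, here is a toy, runnable sketch; the linear temperature model and all of the data below are invented stand-ins for illustration, not the project's actual model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy stand-in for the selection step: fit a usage model on pre-treatment
# year 1, then evaluate its predictions against pre-treatment year 2.
# (The real models are weather-normalized; a linear fit on average daily
# temperature is used here purely for illustration.)
rng = np.random.default_rng(0)
temp_y1 = rng.normal(15, 8, 365)   # average daily temperature, year 1
temp_y2 = rng.normal(15, 8, 365)   # average daily temperature, year 2
daily_use_y1 = 30 - 0.5 * temp_y1 + rng.normal(0, 2, 365)
daily_use_y2 = 30 - 0.5 * temp_y2 + rng.normal(0, 2, 365)

model = LinearRegression().fit(temp_y1.reshape(-1, 1), daily_use_y1)
predicted_daily_use = model.predict(temp_y2.reshape(-1, 1))

# Fractional "savings" over the second pre-treatment year; since there was
# no treatment, a large value signals a poorly fitting model.
fractional_savings = (predicted_daily_use.sum() - daily_use_y2.sum()) / daily_use_y2.sum()
```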
Here are the different approaches we considered, with some notes:

1. An outlier is a premise whose annual usage changes by more than 30% from the training pre-treatment year to the testing pre-treatment year.
2. An outlier is a premise where the absolute value of the fractional savings is greater than 0.75, where fractional savings is `(predicted_daily_use.sum() - daily_use.sum()) / daily_use.sum()`.
3. An outlier is a premise with a fractional savings value in the top or bottom X percentile of the results:
   - X = 2% drops 4% of premises from the electric results
   - X = 1% drops 2% of premises from the electric results
4. An outlier is a premise with fractional savings more than X standard deviations away from the median.
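For concreteness, here is a minimal sketch of how criteria Nos. 2-4 could be flagged, assuming a hypothetical pandas Series of per-premise fractional savings (No. 1 needs both pre-treatment years of usage, so it is omitted):

```python
import pandas as pd

def flag_outliers(fractional_savings: pd.Series,
                  abs_threshold: float = 0.75,
                  tail: float = 0.02,
                  n_std: float = 3.0) -> pd.DataFrame:
    """Flag each premise under criteria Nos. 2-4 above.

    `fractional_savings` is a hypothetical Series indexed by premise,
    holding (predicted_daily_use.sum() - daily_use.sum()) / daily_use.sum().
    """
    lo, hi = fractional_savings.quantile([tail, 1 - tail])
    deviation = (fractional_savings - fractional_savings.median()).abs()
    return pd.DataFrame({
        # No. 2: |fractional savings| > 0.75
        "abs_threshold": fractional_savings.abs() > abs_threshold,
        # No. 3: top or bottom X percentile of the results
        "percentile": (fractional_savings < lo) | (fractional_savings > hi),
        # No. 4: more than X standard deviations from the median
        "std_from_median": deviation > n_std * fractional_savings.std(),
    })
```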
Recommendation
The two viable approaches to outlier detection we explored (Nos. 2 and 3 above) address different problems:
Approach No. 3 requires a distribution of results to determine what the outliers are. As a result, a premise may be dropped as an outlier when it is part of one subsample of the available premises, yet no longer be an outlier when the analysis is re-run with a larger, smaller, or simply different subset of premises.
In contrast, approach No. 2 can be applied at the premise level: it determines, for each premise, whether our estimate is "good" based on the logic that we should not see more than 75% savings. A premise identified as a poor estimate under this approach will always be discarded, no matter how many additional premises are run.
Essentially, these two approaches serve different purposes: No. 3 identifies true outliers in some distribution, while No. 2 identifies bad estimates. We are recommending No. 2 because it kills two birds with one stone: it removes outliers and also filters out bad models before aggregation.
To make sure I'm clear, I am proposing that we add the following to the analysis spec: _"Remove premises for which the absolute value of the fractional savings is greater than 0.75, where fractional savings is defined as `(total_annual_predicted_use - total_annual_actual_use) / total_annual_actual_use`."_
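A sketch of how that spec language might be applied before aggregation; the DataFrame and its numbers below are illustrative only:

```python
import pandas as pd

# Hypothetical per-premise table; column names mirror the spec language above.
premises = pd.DataFrame({
    "total_annual_predicted_use": [11200.0, 9800.0, 21000.0],
    "total_annual_actual_use":    [10500.0, 10100.0, 9500.0],
})

fractional_savings = (
    premises["total_annual_predicted_use"] - premises["total_annual_actual_use"]
) / premises["total_annual_actual_use"]

# Drop premises with |fractional savings| > 0.75 (the third row here)
# before computing summary statistics or aggregated savings.
kept = premises[fractional_savings.abs() <= 0.75]
```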
"_Output comparison for electric (before and after removing outliers)