buds-lab / building-prediction-benchmarking

An array of open source ML models applied to long-term hourly energy prediction for institutional buildings
http://www.budslab.org/
MIT License

More buildings produce more generalizable modeling frameworks: Benchmarking open-source prediction methods on public building electrical meter data

Published in the journal Machine Learning and Knowledge Extraction in 2019. Citation:

Miller, Clayton. 2019. "More Buildings Make More Generalizable Models—Benchmarking Prediction Methods on Open Electrical Meter Data" Machine Learning and Knowledge Extraction 1, no. 3: 974-993. https://doi.org/10.3390/make1030056

Abstract

Prediction is a common machine learning (ML) technique applied to sub-hourly building energy consumption data. It is valuable for anomaly detection, load profile-based control, energy plant systems control, and measurement and verification procedures. Hundreds of building energy prediction techniques have been developed over the last three decades, yet there is still no consensus on which techniques are the most effective for various building types. In addition, many of the techniques developed are proprietary and unavailable to the general research community. This paper outlines a library of open source regression techniques from the Scikit-Learn Python library and describes the process of applying them to open hourly electrical meter data from 482 non-residential buildings in the Building Data Genome Project.

The results illustrate that there is no one-size-fits-all modeling solution and that various types of temporal behavior are difficult to capture using machine learning. This framework and methodology are designed to serve as a baseline implementation for other building energy data prediction methods developed by commercial providers or the wider research community. The benchmark data set can also be expanded with additional building performance data from a wider representation of buildings around the world. Using a common baseline data set in future prediction research makes techniques comparable and reproducible across the built environment domain.
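
As a rough illustration of the benchmarking idea described in the abstract, the sketch below fits several Scikit-Learn regressors to one hourly meter series and compares their errors on a held-out period. It is not the code from this repository: the synthetic load profile, the two calendar features, the choice of three models, the 75/25 chronological split, and the helper names `make_synthetic_meter` and `benchmark` are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.neighbors import KNeighborsRegressor


def make_synthetic_meter(n_days=120, seed=0):
    """Build a hypothetical hourly load series (not Building Data Genome data)."""
    rng = np.random.default_rng(seed)
    hours = np.arange(n_days * 24)
    hour_of_day = hours % 24
    day_of_week = (hours // 24) % 7
    # Assumed schedule: higher load on weekdays between 08:00 and 18:00.
    occupied = (hour_of_day >= 8) & (hour_of_day < 18) & (day_of_week < 5)
    load = 50.0 + 30.0 * occupied + rng.normal(0.0, 2.0, size=hours.size)
    features = np.column_stack([hour_of_day, day_of_week])
    return features, load


def benchmark(X, y, train_fraction=0.75):
    """Fit each model on the earlier part of the series, score it on the rest."""
    split = int(len(y) * train_fraction)
    X_train, X_test = X[:split], X[split:]
    y_train, y_test = y[:split], y[split:]

    # Illustrative model selection; the paper's own model list may differ.
    models = {
        "LinearRegression": LinearRegression(),
        "KNeighborsRegressor": KNeighborsRegressor(n_neighbors=5),
        "RandomForestRegressor": RandomForestRegressor(
            n_estimators=50, random_state=0
        ),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = mean_absolute_error(y_test, model.predict(X_test))
    return scores


X, y = make_synthetic_meter()
scores = benchmark(X, y)
for name, mae in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: MAE = {mae:.2f}")
```

Running the same loop over many buildings, rather than one series, is what turns this into a benchmark: each model gets one score per building, and the distribution of scores shows which building types a given technique handles poorly.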

Links to folders in this repository for each step in the process: