jamesrobertlloyd / gpss-research

Kernel structure discovery research code - likely to be unstable
MIT License

This is part of the Automatic Statistician project.

Automatic Bayesian Covariance Discovery

This repo contains the source code to run the system described in the paper

Automatic Construction and Natural-Language Description of Nonparametric Regression Models by James Robert Lloyd, David Duvenaud, Roger Grosse, Joshua B. Tenenbaum and Zoubin Ghahramani, appearing in AAAI 2014.

Abstract

This paper presents the beginnings of an automatic statistician, focusing on regression problems. Our system explores an open-ended space of statistical models to discover a good explanation of a data set, and then produces a detailed report with figures and natural-language text. Our approach treats unknown regression functions nonparametrically using Gaussian processes, which has two important consequences. First, Gaussian processes can model functions in terms of high-level properties (e.g. smoothness, trends, periodicity, changepoints). Taken together with the compositional structure of our language of models this allows us to automatically describe functions in simple terms. Second, the use of flexible nonparametric models and a rich language for composing them in an open-ended manner also results in state-of-the-art extrapolation performance evaluated over 13 real time series data sets from various domains.
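As an illustration of the compositional idea in the abstract, here is a minimal sketch of how base kernels that encode high-level properties (smoothness, periodicity, linear trend) can be combined by addition and multiplication. It is not taken from this repository's code: the function names, the hyperparameter values and the particular composite LIN + SE * PER are assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def se_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel: encodes smooth functions."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def periodic_kernel(x1, x2, period=1.0, lengthscale=1.0, variance=1.0):
    """Periodic kernel: encodes functions that repeat with the given period."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / lengthscale ** 2)


def linear_kernel(x1, x2, offset=0.0, variance=1.0):
    """Linear kernel: encodes linear trends."""
    return variance * (x1[:, None] - offset) * (x2[None, :] - offset)


def composite_kernel(x1, x2):
    """LIN + SE * PER: a linear trend plus a locally periodic component.

    Sums and products of valid kernels are themselves valid kernels,
    which is what makes an open-ended language of models possible.
    """
    trend = linear_kernel(x1, x2)
    locally_periodic = se_kernel(x1, x2, lengthscale=5.0) * periodic_kernel(x1, x2, period=1.0)
    return trend + locally_periodic


# Evaluate the composite covariance on a small grid of inputs.
x = np.linspace(0.0, 4.0, 50)
K = composite_kernel(x, x)
```

Each term of such a composite corresponds to one describable property of the function, which is the kind of structure a natural-language report can be built from.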

Feel free to email the authors with any questions:
James Lloyd (jrl44@cam.ac.uk)
David Duvenaud (dduvenaud@seas.harvard.edu)
Roger Grosse (rgrosse@cs.toronto.edu)

Data used in the paper

Related Repo

Source code to run an earlier version of the system, appearing in Structure Discovery in Nonparametric Regression through Compositional Kernel Search by David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum and Zoubin Ghahramani, can be found at github.com/jamesrobertlloyd/gp-structure-search/.