dist-shift

An application that naively keeps learning in production to make a model robust to concept drift.

We work with two types of concept drift:

Sudden Concept Drift: When the model is trained on data that merely approximates the production data, so the production data never really resembles the train/test data.
Incremental Concept Drift: When the function from inputs to outputs drifts away from what the model was trained on. In this case, the production data starts out similar to the train/test data and then changes over time.

We modeled these two cases with two synthetic datasets:

The Sudden Shift Sine dataset

The temporal Sine Wave dataset (gradual drift)
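
To make the two drift types concrete, here is a minimal sketch of how such datasets could be generated. It is not the code used in this project: the function names, the use of a phase shift as the drift mechanism, and all default parameters are assumptions made for illustration.

```python
import math
import random


def _noisy_sine(rng, shift, noise):
    """Draw one (x, y) pair from y = sin(x + shift) plus Gaussian noise."""
    x = rng.uniform(0.0, 2.0 * math.pi)
    return x, math.sin(x + shift) + rng.gauss(0.0, noise)


def sudden_shift_sine(n_train=200, n_prod=200, phase_shift=math.pi / 2,
                      noise=0.05, seed=0):
    """Hypothetical sudden drift: production is shifted from its first point."""
    rng = random.Random(seed)
    train = [_noisy_sine(rng, 0.0, noise) for _ in range(n_train)]
    prod = [_noisy_sine(rng, phase_shift, noise) for _ in range(n_prod)]
    return train, prod


def gradual_drift_sine(n_train=200, n_prod=200, max_shift=math.pi / 2,
                       noise=0.05, seed=0):
    """Hypothetical gradual drift: the shift grows linearly over production."""
    rng = random.Random(seed)
    train = [_noisy_sine(rng, 0.0, noise) for _ in range(n_train)]
    prod = []
    for t in range(n_prod):
        # Shift is 0 at the first production point and max_shift at the last.
        shift = max_shift * t / max(n_prod - 1, 1)
        prod.append(_noisy_sine(rng, shift, noise))
    return train, prod
```

In the sudden case the production data never matches the training concept. In the gradual case it starts out matching and moves away over time.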

Our algorithms

The idea here is to continuously learn in production, whether or not a concept drift is detected.

Online retraining

We propagate the loss for each data point seen in production.

Small Batch Online Retraining

We propagate the loss for a small batch of data seen in production. We hypothesized that this would make us more resilient to noise in the production data.
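
The two variants differ only in how many production points are collected before each update. The sketch below illustrates that with a toy linear model trained by gradient descent on squared error. The model, the `learn_in_production` helper, the learning rate, and the batch size are all hypothetical stand-ins and are not taken from this project.

```python
class LinearModel:
    """Toy model y = w * x + b, used only to illustrate the update loop."""

    def __init__(self, w=0.0, b=0.0):
        self.w = w
        self.b = b

    def predict(self, x):
        return self.w * x + self.b

    def step(self, batch, lr):
        """Take one gradient step on the mean squared error over `batch`."""
        n = len(batch)
        grad_w = sum(2.0 * (self.predict(x) - y) * x for x, y in batch) / n
        grad_b = sum(2.0 * (self.predict(x) - y) for x, y in batch) / n
        self.w -= lr * grad_w
        self.b -= lr * grad_b


def learn_in_production(model, stream, batch_size=1, lr=0.05):
    """Predict on each production point, then update every `batch_size` points.

    batch_size=1 corresponds to online retraining; batch_size>1 corresponds
    to small batch online retraining. Returns the per-point absolute errors.
    """
    errors = []
    buffer = []
    for x, y in stream:
        # Record the error before the label is used for any update.
        errors.append(abs(model.predict(x) - y))
        buffer.append((x, y))
        if len(buffer) == batch_size:
            model.step(buffer, lr)
            buffer = []
    return errors
```

For example, `learn_in_production(model, stream, batch_size=1)` updates after every point, while `batch_size=8` averages the gradient over eight points, which smooths out the effect of any single noisy point. Leftover points that do not fill a final batch are not used for an update in this sketch.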

We used the error rate at the end of the production time series to evaluate each algorithm's performance.
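
One way to compute such a metric is to average the per-point errors over a final window of the production series. This is a sketch under assumptions: the `final_window_error` name and the window size are hypothetical, and the exact definition of "end" used in this project may differ.

```python
def final_window_error(errors, window=50):
    """Mean of the last `window` per-point errors (all of them if fewer)."""
    if not errors:
        raise ValueError("errors must be non-empty")
    tail = errors[-window:]
    return sum(tail) / len(tail)
```

Looking only at the tail rewards an algorithm for having adapted by the end of the series, and ignores errors made early on while it was still catching up.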

A very cool program by Joe Redmond and Tamanna Ananna.