DataDog / piecewise

Auto-archived due to inactivity.
Functions for piecewise regression on time series data
BSD 3-Clause "New" or "Revised" License

Should the predictor variables *really* be unique? #5

Closed Ezibenroc closed 6 years ago

Ezibenroc commented 6 years ago

The code currently has this safety check: https://github.com/DataDog/piecewise/blob/3a15a1c3113cbbecf979bb318f19f2c7fbdc9408/piecewise/regressor.py#L215-L216

This is a problem for me, as I have several occurrences of each predictor value.

I tried to remove this check and run the following code, where each predictor variable is repeated 10 times:

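import numpy
from piecewise.plotter import plot_data_with_regression

nbrep = 10  # number of repetitions of each predictor value
# Predictor: 0..9, repeated nbrep times
t = numpy.concatenate([numpy.arange(10)] * nbrep)
# Response: slope 2 for t < 5, slope -1 for t >= 5, plus Gaussian noise
v = numpy.array((
    [2*i for i in range(5)] +
    [10-i for i in range(5, 10)]) * nbrep
) + numpy.random.normal(0, 1, 10 * nbrep)
plot_data_with_regression(t, v)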

It gives this result, which looks good to me: [image: telechargement]

So, do you think that we could safely remove this safety check, or is there any corner case I didn't think of?

StephenKappel commented 6 years ago

As the code is currently written, I believe different points at the same t could end up modeled by two (or more) different linear segments. From a prediction perspective, this obviously makes no sense. However, it seems possible to prevent this by making the initialization a little smarter, so that all points with the same t start out in the same linear segment. Since handling this case does seem useful, I'll take a pass at changing the initialization and adding a test to cover it. PR to follow...
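To illustrate the idea of a "smarter" initialization, here is a minimal sketch that groups observations by predictor value, so that points sharing the same t always begin in the same segment. The helper name `initial_segments` and its return shape are hypothetical; this is not the piecewise API and does not reproduce the change made in the PR.

```python
from itertools import groupby


def initial_segments(t, v):
    """Group observations by predictor value (hypothetical helper).

    Returns a list of (t_value, [v values]) pairs sorted by t_value,
    so that all points sharing a t fall into one initial segment.
    """
    # Sort by t only; the sort is stable, so the v order within a group is kept.
    pairs = sorted(zip(t, v), key=lambda p: p[0])
    return [
        (t_value, [v_i for _, v_i in group])
        for t_value, group in groupby(pairs, key=lambda p: p[0])
    ]


# Three distinct predictor values, each observed three times.
t = [0, 1, 2] * 3
v = [0.1, 2.0, 3.9, -0.2, 2.1, 4.2, 0.0, 1.8, 4.0]

segments = initial_segments(t, v)
for t_value, values in segments:
    print(t_value, values)
```

Any later merging step would then operate on these groups rather than on individual points, so a boundary between two segments could never fall between two observations with the same t.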

StephenKappel commented 6 years ago

@Ezibenroc - I've created a PR for this: https://github.com/DataDog/piecewise/pull/6

How does that look?

StephenKappel commented 6 years ago

I've merged the PR, so I'm closing this issue. Feel free to follow up if I missed anything.

Ezibenroc commented 6 years ago

It looks good, thank you!