jonas: using ones

mobeets commented 10 years ago

When using a ones column, that column should be excluded from the smoothing. How?
Also, even with the ones column, some of the non-ones weights seem to be really large, as if they were trying to soak up the offset?

I have tried regressing on Y - Y.mean(), but it didn't help.
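
A minimal sketch of one way to keep the ones column out of the penalty, using plain ridge regression in NumPy as a stand-in for the smoothing prior. This is not pyASDRD's API; `ridge_unpenalized_offset` and `ridge_centered` are hypothetical names. The penalty matrix gets a zero on the diagonal entry for the ones column, so the offset is fit freely. For a ridge penalty this is equivalent to centering both X and Y (not only Y) and fitting without an offset:

```python
import numpy as np

def ridge_unpenalized_offset(X, y, lam):
    """Ridge fit with a ones column appended and left out of the penalty.

    Hypothetical helper for illustration; a ridge penalty stands in for
    the smoothing prior discussed in this issue.
    """
    n_samples, n_features = X.shape
    X1 = np.hstack([X, np.ones((n_samples, 1))])  # ones column goes last
    D = np.eye(n_features + 1)
    D[-1, -1] = 0.0  # no penalty on the offset weight
    w = np.linalg.solve(X1.T @ X1 + lam * D, X1.T @ y)
    return w[:-1], w[-1]

def ridge_centered(X, y, lam):
    """Same fit obtained by centering X and y, then recovering the offset."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w  # offset implied by the centered fit
    return w, b
```

Centering only Y leaves the column means of X in play, which may be why regressing on Y - Y.mean() alone did not change much.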

mobeets commented 10 years ago

First off, adding a ones column manually to your data is now unnecessary. Use the fit_intercept=True argument on the Fit() constructor and an offset will be fit.

Secondly, the last weight appears to be large whenever there aren't enough lags used in the design matrix. For example, see below the comparison of fitting with ridge regression using 200 lags vs. 400 lags:

[Figures: fitted weights with 200 lags; fitted weights with 400 lags]

Notice how all the weights up to the 200th lag are nearly identical.

Finally, the total sum of the weights is nearly preserved: with 200 lags the weights sum to 0.835575495168, while with 400 lags they sum to 0.837039243475.

So my guess is that the regressors at these later lags are a) essentially unrelated to the current Y and b) autocorrelated enough that the weights on them effectively act as an offset. (The scores in the two fits above are basically the same.) Maybe.
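
To check this kind of comparison on other data, here is a rough sketch in NumPy. It assumes the design matrix is built by stacking lagged copies of a single input series `x`, and it uses plain ridge regression without an offset. The helpers `lagged_design`, `ridge`, and `compare_lags` are hypothetical names for illustration, not part of pyASDRD:

```python
import numpy as np

def lagged_design(x, n_lags):
    """Build X with X[t, k] = x[t - k] for k = 0..n_lags-1 (zeros before the start)."""
    x = np.asarray(x, dtype=float)
    X = np.zeros((len(x), n_lags))
    for k in range(n_lags):
        X[k:, k] = x[:len(x) - k]
    return X

def ridge(X, y, lam):
    """Closed-form ridge weights, no offset term."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def compare_lags(x, y, n_short, n_long, lam=1.0):
    """Fit with two lag counts and summarize how the weights differ."""
    w_short = ridge(lagged_design(x, n_short), y, lam)
    w_long = ridge(lagged_design(x, n_long), y, lam)
    return {
        "sum_short": w_short.sum(),  # total weight with fewer lags
        "sum_long": w_long.sum(),    # total weight with more lags
        # largest disagreement over the lags both fits share
        "max_diff_shared": np.max(np.abs(w_short - w_long[:n_short])),
        "last_short": w_short[-1],   # the weight that tends to inflate
    }
```

Calling `compare_lags(x, y, 200, 400)` would report the two weight sums, how closely the shared weights agree, and the size of the last weight in the shorter fit.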

mobeets commented 10 years ago

One additional note: the above fits were run with fit_intercept=False, but changing this to fit_intercept=True in the 200 lag case doesn't do much (i.e., you still get inflated weights at the end). The intercept value is only 0.99 or so, while the mean of Y is around 6.
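
One possible reading of that intercept value, assuming a squared-error fit in which the offset $b$ is not penalized (whether fit_intercept=True in pyASDRD works this way is an assumption): the fitted offset is the mean of Y minus whatever the weighted regressor means already account for,

$$
b = \bar{y} - \sum_{k} \bar{x}_k \, w_k ,
$$

where $\bar{x}_k$ is the mean of the $k$-th lagged column and $w_k$ its weight. If every lagged column has roughly the same mean $\bar{x}$, this reduces to

$$
b \approx \bar{y} - \bar{x} \sum_{k} w_k ,
$$

so an intercept well below the mean of Y would be expected whenever the regressors have a non-zero mean and the weights sum to something non-zero.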

mobeets / pyASDRD, issue #30