Closed vkhodygo closed 6 years ago
This is a good recommendation. I think the default should be sorted_data=False
since this will work for both use cases: 1) The data was already sorted. 2) The data was not sorted.
The doc string should mention something about sorted_data, perhaps something like
# # initialize for x,y data if your data is already ordered
# # from x[0] < x[1] < ... < x[n-1] use sorted_data=True
# my_pWLF = PiecewiseLinFit(x, y, sorted_data=True)
The predict()
function could also have a similar key if your x
values where already sorted.
Thanks for the addition!
Ok, I'm going to update the code now.
Upd. I realized that there is no need to repeat this procedure in the predict()
function. We re-order data in the __init()__
and can use self.x_data
where it's necessary.
The x
in def predict(self, x, *args)
is not the same as self.x_data
(Although it can be...)
The purpose of predict() is to evaluate your fitted piecewise linear equations at new x locations. This can be just one point, or a 1D array of points.
In my mind my predict should have it's own sorted_data
key since your original data might be sorted, but the new points are not, and vice versa.
So if you used something like np.linspace()
to get new data points it would make sense to use the key
y_hat = myPWLF.predict(x_hat, sorted=True)
Sorry, it took some time. I've reverted this commit, hope that now everything is OK.
Uploaded version 0.2.6 to PyPI! Thanks!
Hey, I was thinking about possible code improvements and this one looks reasonable. I see no reason in additional sorting when you know a priori that your data is ordered. However, I'm still not sure what default value should be used.