Open yihe-61wu opened 1 year ago
I did some simple test with 2-segment piecewise linear regression (unspecified breakpoints). I could output the breakpoints, which looked correct, and then I obtained yHat[0], (yHat[breakpointidx] - yHat[0]) / breakpointidx, (yHat[-1] - yHat[breakpointidx]) / (len(yHat) - breakpointidx)
, which should be expected to be very close to beta[0], beta[1], beta[2]
. However, I could see the value of beta[-1]
was different from (yHat[-1] - yHat[breakpointidx]) / (len(yHat) - breakpointidx)
even by the sign.
The source code appears all solid to me. I used fitfast
but tested fit
too -- both called fit_with_breaks
, which altered beta
via lstsq
, seemingly a wrapper for scipy.linalg.lstsq
.
I guess the issue is probably not a bug in the code, but in my interpretation. Any comments would be appreciated.
beta[-1]
can be negative! You just haven't had a curve with a steep enough slope yet at the end. The final slope needs to cancel out the accumulation of all previous slopes. Slopes on the right are the accumulation of all previous beta values (except the y-intercept aka. beta[0]).
See
import numpy as np
import matplotlib.pyplot as plt
import pwlf
# generate sin wave data
x = np.linspace(0, 11, num=100)
y = np.sin(x * np.pi / 2)
# add noise to the data
y = np.random.normal(0, 0.05, 100) + y
my_pwlf_1 = pwlf.PiecewiseLinFit(x, y, degree=1)
# fit the data for four line segments
res1 = my_pwlf_1.fitfast(5, pop=50)
# predict for the determined points
xHat = np.linspace(min(x), max(x), num=10000)
yHat1 = my_pwlf_1.predict(xHat)
print('beta', my_pwlf_1.beta)
# plot the results
plt.figure()
plt.plot(x, y, 'o', label='Data')
plt.plot(xHat, yHat1, '--', label='degree=1')
plt.legend()
plt.show()
beta [ 1.06781215 -0.5918096 1.87834207 -2.49662666 2.37475738 -2.42097531]
Thank you for the package. It helps me get piecewise linear fittings easily, which is great for my visualisation.
However, trying to further my analysis one more step, I extracted beta values. Weirdly, the last beta value, i.e., beta[-1], is always positive, while on my plot I can clearly see the last segment of my fitted curves is decreasing. I had a look at the Examples, and apparently under Section 'non-linear standard errors and p-values' the summary table shows the last beta to be positive, while the last segment in the above figure is decreasing.
Given the examples are official and nobody else seems to raise issues on a similar matter, I think it must be me who misunderstood the interpretation of beta values. Could you please help me clarify the issue?
Thank you in advance.