cjekel / piecewise_linear_fit_py

fit piecewise linear data for a specified number of line segments
MIT License
300 stars 60 forks source link

How to fit multiple functions simultaneously #108

Open tanzor87 opened 1 year ago

tanzor87 commented 1 year ago

Hello! Is it possible to fit several Functions together using this library? For example, we have values for x and two functions y(x):

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]) y1 = np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47, 98.36, 112.25, 126.14, 140.03]) y2 = np.array([0.33, 1.0, 1.67, 2.34, 3.0, 3.67, 8.31, 12.94, 17.57, 22.19, 26.82, 31.45, 36.0, 40.71, 45.34])

The breakpoints are the same for y1 and y2, but the slopes are different. The task is to approximate these functions simultaneously so that the break point does not differ for these functions. If the functions y1 and y2 are fitted separately, then there will be a slight difference between the breakpoint for the first function and the breakpoint for the second function, but it is necessary that they be the same. Thank you very much!

cjekel commented 1 year ago

If you happen to know the breakpoints locations for the two curves, then this is pretty easy.

So you want to find breakpoints that minimize the L2 error between (yhat1, y1) and (yhat2, y2)?

The library doesn't support that, but it's possible to hack something up to solve that. I think you will end up init two PiecewiseLinFit classes, one for each dataset, then combining them in a single custom optimization routine.

cjekel commented 1 year ago

So this is one way to do such a fit. It's rather ugly, but I think it works.

image
import numpy as np
from scipy import optimize
import pwlf
import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
             dtype=np.float64)
y1 = np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47, 98.36,
               112.25, 126.14, 140.03])
y2 = np.array([0.33, 1.0, 1.67, 2.34, 3.0, 3.67, 8.31, 12.94, 17.57, 22.19,
               26.82, 31.45, 36.0, 40.71, 45.34])

pwlf1 = pwlf.PiecewiseLinFit(x, y1)
pwlf1.use_custom_opt(2)
pwlf2 = pwlf.PiecewiseLinFit(x, y2)
pwlf2.use_custom_opt(2)

def mse(breakpoint):
    print(breakpoint, breakpoint.shape, breakpoint.ndim)
    ssr1 = pwlf1.fit_with_breaks_opt(breakpoint)
    mse1 = ssr1 / pwlf1.n_data
    ssr2 = pwlf2.fit_with_breaks_opt(breakpoint)
    mse2 = ssr2 / pwlf2.n_data
    return mse1 + mse2

x0 = np.array([x.mean()])
bounds = np.array([[x.min(), x.max()]])
breakpoints = optimize.minimize(mse, x0, bounds=bounds, method='L-BFGS-B')

# fit with the optimal breakpoints
pwlf1.fit_with_breaks_opt(breakpoints['x'])
pwlf2.fit_with_breaks_opt(breakpoints['x'])

xhat = np.linspace(x.min(), x.max(), 100)
yhat1 = pwlf1.predict(xhat)
yhat2 = pwlf2.predict(xhat)

plt.figure()
plt.plot(x, y1, 'ob')
plt.plot(x, y2, 'og')
plt.plot(xhat, yhat1, '-b')
plt.plot(xhat, yhat2, '-g')
plt.show()
tanzor87 commented 1 year ago

Thank you very much for your help! I will try this script.