Setting up a model identification in real life data

dynamicslab / pysindy

A package for the sparse identification of nonlinear dynamical systems from data

https://pysindy.readthedocs.io/en/latest/

Other

1.44k stars 320 forks source link

Setting up a model identification in real life data #491

Closed Esselle7 closed 3 months ago

Esselle7 commented 7 months ago

Hi,

I've tried implementing PySINDy in small examples, and the results are excellent!

Then, I attempted to apply it to a real-life scenario where I have a mechanical component rotating at a certain speed, and the speed level determines the amount of power used. I wanted to try identifying a model that best represents the opposing trends of speed and power based on empirical laboratory observations. The data is correctly obtained from CSVs file and collected into the train_datas list along with the corresponding timestamps in time_datas. Each element of the train_datas list is a matrix containing the power values (first column) and velocity values (second column). Below is an example of the input data I have:

Here's the PySINDy models that should perform identification, but the identification results are poor. I tried to create two models using different function libraries (the generalized library is created as shown in the documentation examples).

Model1

feature_names = ['Power', 'Speed']
STLSQ_optimizer = ps.STLSQ(threshold=0.001)
sr3_optimizer = ps.SR3(threshold=0.001, thresholder="l0")
model = ps.SINDy(feature_names=feature_names, feature_library=generalized_library, optimizer=STLSQ_optimizer)

for x_val, t_val in zip(train_datas, time_datas):
    x_train = np.array(x_val)
    t_train = np.array(t_val)
    model.fit(x_train, t=t_train)

model.print()

Output: (Power)' = 0.000 (Speed)' = 194.693 Power + -0.023 Speed + -49.154 Power^2 + 0.017 Power Speed + -0.014 Power Speed sin(1 Power)

Model 2

feature_names = ['Power', 'Speed']
STLSQ_optimizer = ps.STLSQ(threshold=0.0002)
model = ps.SINDy(feature_names=feature_names,feature_library=poly_library, optimizer=STLSQ_optimizer)
for x_val, t_val in zip(train_datas, time_datas):
  x_train = np.array(x_val)
  t_train = np.array(t_val)
  model.fit(x_train, t=t_train)
model.print()

Output: (Power)' = 0.003 1 + -0.057 Power + 0.102 Power^2 + -0.053 Power^3 + 0.012 Power^4 + -0.001 Power^5 (Speed)' = -59.252 1 + 513.686 Power + -0.088 Speed + -899.275 Power^2 + 0.209 Power Speed + 559.092 Power^3 + -0.117 Power^2 Speed + -48.545 Power^4 + -0.026 Power^3 Speed + -17.134 Power^5 + 0.015 Power^4 Speed

Example of identifications results:

I'm not sure how to proceed and which techniques described in the documentation to implement. Could you guide me on which paths to take?

Feel free to ask if you need further assistance or clarification on any of the points mentioned above!

Jacob-Stevens-Haas commented 7 months ago

Thanks for giving pysindy a try! First question is whether speed and power are evolving independently, i.e. no external control? Second thing is that real-life sensor data often includes weird outliers like the spikes in your input data. Total variation regularization (L1 regularization of derivative) an work well in these cases. Try setting SINDy(differentiation_method=SINDyDerivative(kind="trend_filtered", ...) in building your model. Third, you're using a Generalized library, which mainly serves to assign different libraries to different inputs. Have you tried merely concatenating the libraries instead? Also, it looks like one of your libraries is a polynomial library to 5th degree. This is usually higher than you need. Try setting it to 3rd degree.

nkoudounas commented 6 months ago

This is an interesting case! The large difference in coefficient magnitudes suggests you should normalize your inputs. Power has a maximum value of 5, while Speed reaches 100,000. This scale difference can impact model training.

Jacob-Stevens-Haas commented 6 months ago

To piggy back on what @nkoudounas mentioned, normalization is an optimizer option, disabled by default. To enable it, set normalize_columns=True when you initialize your optimizer.

Esselle7 commented 6 months ago

Thank you @Jacob-Stevens-Haas , thank you @nkoudounas . I'm still implementing the advice you gave me, and I truly thank you very much. In the meantime, let me give you some context details to better understand the signals at hand: The machine from which the data have been collected is a vertical machining center, with a power of 15 kW and a maximum spindle speed of 10,000 rpm. A Siemens controller is mounted on it, which is the element that governs all of its operations. Thanks to the presence of sensors in the machines, the data are automatically collected during the execution of a part-program and stored in the Siemens software Mindsphere with a sampling time of 5 seconds. However, since there is not a signal that identifies when the machine is actively working, it was fundamental to analyze the signal of the spindle power to understand when a piece was being machined. The signals that were exported were:

spindle power, (sum of x-axis power, y-axis power, z-axis power)
actual spindle speed Preprocessing of signals (maybe interesting): the first thing done was that the spindle speed was made positive since the sign of the speed signal only represents if the rotation is clockwise or counterclockwise. Then all the power signals were made non-negative, so all the negative values of power were set to zero because the spindle power is negative only when there are decelerations, which can be ignored. Furthermore, since the value of interest is the total power consumed, a new signal of total power as the sum of the power of the spindle and the power of the three (x, y, z) axes is added.

The total power is always positive and has peaks when the spindle accelerates and goes to a constant speed value, the spindle speed is constant when the operations are performed, and the pressure signal is built as stated above. IMMAGINE

Furthermore, I identify as "events" the moments when the speed remains constant at a certain level. So, in the example in the figure, the identified events would be:

Event 1 in timestamps from 0 to approximately 60
Event 2 in timestamps from approximately 60 to approximately 160
Event 3 in timestamps from 160 to approximately 280 etc.. My ultimate goal would then be to create a model capable of predicting power consumption (thus the power trend) given a speed signal that identifies a sequence of events.

Esselle7 commented 6 months ago

I've implemented the changes to the model, but the performance doesn't improve.

pol_library = ps.PolynomialLibrary(degree=3)
fourier_library = ps.FourierLibrary()
combined_library = pol_library + fourier_library
feature_names = ['Power', 'Speed']

differentiation_method = SINDyDerivative(kind="trend_filtered",  order=0, alpha=1e-2)
sr3_optimizer = ps.SR3(threshold=0.001, thresholder="l1", normalize_columns=True)

model = SINDy(
    feature_names=feature_names,
    differentiation_method=differentiation_method,
    optimizer=sr3_optimizer,
    feature_library=combined_library
    )
# implemented learning rounds for data retrieved in different simulations
for x_val, t_val in zip(train_datas, time_datas):
  x_train = np.array(x_val)
  t_train = np.array(t_val)
  model.fit(x_train, t=t_train)
model.print()

Output model: (Power)' = -11.238 1 + 13.034 Power + 5.691 Power^2 + -2.673 Power^3 + -12.792 sin(1 Power) + 11.215 cos(1 Power) + -0.001 sin(1 Speed) (Speed)' = -34727.991 1 + 57061.497 Power + -0.110 Speed + 16185.974 Power^2 + 0.175 Power Speed + -9766.731 Power^3 + -0.127 Power^2 Speed + -56242.285 sin(1 Power) + 34692.343 cos(1 Power) + 0.464 sin(1 Speed) + -2.683 cos(1 Speed)

And the performance are the followings:

nkoudounas commented 6 months ago

ok i have some questions about your problem. First of all, is Speed affected by Power? Is there some control/hidden component that makes the Power go up when Speed is 0 or Speed go up when Power is 0? Also its clear imo that the dPower/dt is related not only to Speed but also time itself, because of the decreasing values when Speed is constant. Maybe you need a model like dPower/dt = f(Speed(t),Power(t),t) and not only dPower/dt = f(Speed(t),Power(t))

Esselle7 commented 6 months ago

Speed and Power refer to the same mechanical component (the spindle), but the way that they affect each other is something I want to discover. As shown below, there are some case in which Power go up when speed is 0 (highlighted in green), but from my point of view they are only incompatible traces that I should discard (infact situation like that are avoided while training).

What do you mean practically by saying "Maybe you need a model like dPower/dt = f(Speed(t),Power(t),t)"? From a theoretically point of view I agree with you. Do you suggest to include in the PySindy model the directly dependence from the time?

Jacob-Stevens-Haas commented 6 months ago

bottom line up front: I think pysindy isn't the right way to approach this.

sampling time of 5 seconds

If that means there's a single measurement of spindle speed and power consumption ever five seconds, a dynamical systems view is perhaps the wrong approach. The data indicates that changes happen over the course of a single timestep and is relatively constant otherwise, so all of the important derivatives for SINDy are too short for you to measure.

My ultimate goal would then be to create a model capable of predicting power consumption (thus the power trend) given a speed signal that identifies a sequence of events.

If power was fluctuating throughout a constant spindle speed event, pysindy might be appropriate, because pysindy is about differential equations: how some quantity changes as a function of itself. If you're trying to map one variable to another over a period that they remain constant -> just use any regression (SVR, tree, neural net). Or start with the idea that, ceteris paribus, $P\propto v^3$.

spindle power, (sum of x-axis power, y-axis power, z-axis power)

I had assumed the only power was the motor turning the spindle. If there's also power pushing the spindle into the material in two directions, the power equation doesn't work, because ceteris isn't paribus. If that's what you mean by x, y, z power, then it could explain why power is the same at different spindle speeds. I'm guessing that the machine is adjusting to use the same amount of total power regardless of the spindle speed.

there are some case in which Power go up when speed is 0 (highlighted in green),

I'm guessing these are due to the non-negative preproessing; the speed is clockwise in one event, power spikes to change the direction of the spindle and spin counterclockwise in the other direction.

Esselle7 commented 6 months ago

@Jacob-Stevens-Haas Sorry for the delayed response. The 5-second sampling time is not binding for my context in the sense that I can also scale the x-axis in order to have a distance between observations even of a few milliseconds, depending on how much I need to reduce the sampling time (this is because I care about the overall trend, and I need to predict an absolute trend of the power signal). My question would then be: How can I structure a PySindy model that recognizes that the speed signal is indeed the only input (and it should not create a predictive model for this), while the power signal is effectively the variable to be learned in correspondence with the trend of the speed signal? Under the assumption that if I significantly reduce the sampling time, then PySindy is applicable in this context.

Furthermore, I applied a Kalman filter which made the power signal smoother.

Finally, out of mere curiosity, this is what a basic SVM regressor learns. I would like to achieve something similar but much more accurate with PySINDy.

nkoudounas commented 6 months ago

Sorry for the late response.The SVM is learning the function f that maps f(x) = y. The output variable y here is Power and the input is speed (and maybe time I suppose). This is not a time series problem that take into account the time-nature of the problem.This is actually different from the dynamics that PySINDy is trying to learn (dP/dt = f1(P,S) and dS/dt = f2(P,S) ). SVM is not so good because it lacks the power to learn the high frequency, but is very good at low frequency. Maybe try a RNN to model P(t+1) = P(t) +....+ P(t-k)+ S(t) + S(t-k) + t + t-1 + etc etc. The dP/dt model is more general and more difficult imho.

Jacob-Stevens-Haas commented 6 months ago

Part of the problem with a Kalman filter is that it assumes the underlying process is Brownian. The impulsive nature of speed changes suggests using L-1 total variation smoothing (derivative package's kind=trend_filtered). Kalman is smoothing out the bumps, but also spreading out the impulse.

How can I structure a PySindy model that recognizes that the speed signal is indeed the only input (and it should not create a predictive model for this), while the power signal is effectively the variable to be learned in correspondence with the trend of the speed signal

The speed signal should be treated as a control variable, u e.g. model.fit(x=power, t=time, u=speed)

Again, though, if power is constant at steady-state regardless of speed, SINDy isn't a great tool.