Feature request: linear grey box system identification

matthewgcooper commented 4 months ago

I find this package great and easy to use for estimating linear state space models of a certain model order. However, estimating parameters in a linear state space model (grey box sysid) has proven quite difficult compared to other system identification toolboxes.

For example, as an aerospace engineer, a common model we are interested in estimating is the linearized short-period aircraft dynamics of the form:

"""
Create a state space model describing the short period pitch rate dynamics of a fixed-wing 
vehicle given delta elevator angle inputs.

# Arguments:
- `Mα`: Pitch moment stability derivative w.r.t. angle of attack
- `Mq`: Pitch moment stability derivative w.r.t. pitch rate
- `Mδe`: Pitch moment stability derivative w.r.t. delta elevator angle
- `Zα`: Force Z stability derivative w.r.t. angle of attack
- `Zq`: Force Z stability derivative w.r.t. pitch rate
- `Zδe`: Force Z stability derivative w.r.t. delta elevator angle
- `V`: Trim velocity (m/s)
"""
function pitchrate_ss(Mα, Mq, Mδe, Zα, Zq, Zδe, V)
    state_labels = Symbol.(["Pitch Rate (rad/s)"; "Angle of Attack (rad)"])
    A = [
        Mq Mα
        (1+Zq / V) Zα/V
    ]
    B = [Mδe; Zδe / V]
    return ss(A, B, I, 0)
end

The function about creates a linear state space model of the short-period aircraft dynamics. Often we want to estimate the parameters Mα, Mq, Mδe, Zα, Zq, Zδe from data.

The nonlinear_pem seems to be the suggested approach in the documentation to solve this problem. However, the results do not seem accurate and the function takes > 20 mins to run (I was using default parameters). Compare this to the MATLAB solution of the problem which takes the form:

p0 = [-170.0, -13.0, -195.0, -430.0, -20.5, 125.0];
aux = {};
T = 0;
m = idgrey('shortperiod',p0,'c',aux,T);
d = iddata(output, input, dt);

tic
m_est = greyest(d, m);
toc

The parameter estimates using the MATLAB solution above are very accurate and takes ~1.3 seconds to run.

I'm not a system identification expert, so I don't know what magic is happening within MATLABs idgrey and greyest functions, but they are simple to use and always seem to provide accurate parameter estimates.

Adding something similar to ControlSystemIdentification.jl would be great in my opinion as these sort of problems are very common in the aerospace field (where we already have a linear model structure, but the parameters are unknown). I understand this already may be possible with this toolbox, and I just haven't used the correct function or keyword arguments, however, adding an easier interface to solve grey box parameter estimation problems or adding a specific example in the documentation would be very useful I believe.

baggepinnen commented 4 months ago

Hello Matthew!

You're right, estimation of arbitrary linear graybox models would currently have to be done using the nonlinear_pem function. Would you mind sharing the code you used when you suffered convergence problems? Maybe I can spot some issues. For example, nonlinear_pem expects a discrete-time model, while your model is continuous time and would have to be discretized before estimation.

In what order do the parameters in the p0 vector appear? [vec(A); vec(B)]?

Another alternative in your case is to estimate a linear model using the function newpem. This model would be a black-box model, but since you know that C = I, you could solve for the parameters in your model structure from the estimated model.

To do this, you would start by transforming the estimated model Pd to continuous time using the function Pc = d2c(Pd). Then you'd compute the similarity transform matrix T = inv(Pc.C), and apply this to the model Pc using similarity_transform(Pc, T). This should give you the model on the structure you want. From the matrix coefficients, you can now figure out Mq, Ma, Mδe directly, and you have to solve trivial equations for Zq, Zα, Zδe

matthewgcooper commented 4 months ago

Here is my code below which you can run which compares the true model to the nonlinear pem model and a model estimated using the subspace method:

using ControlSystemIdentification,
    ControlSystems, Plots, Waveforms, LinearAlgebra, LeastSquaresOptim, StaticArrays

# Model to identify
function pitchrate_ss(Mα, Mq, Mδe, Zα, Zq, Zδe, V)
    A = [
        Mq Mα
        (1+Zq / V) Zα/V
    ]
    B = [Mδe; Zδe / V]
    return ss(A, B, I, 0)
end

# Time
dt = 0.01
t = 0:dt:10

# True Parameters
Mα_true = -83.17457223750834
Mq_true = -4.0485957994781785
Mδe_true = -80.71377329163104
Zα_true = -115.91677356408941
Zq_true = -0.573560626095269
Zδe_true = 32.13261325199936
V_true = 30.2

# Initial parameter estimates and bounds
p0 = [Mα_true, Mq_true, Mδe_true, Zα_true, Zq_true, Zδe_true] .* 1.3
plb = p0 .* 0.5
pub = p0 .* 1.5
lower = [min(x, y) for (x, y) in zip(plb, pub)]
upper = [max(x, y) for (x, y) in zip(plb, pub)]

# Kalman filter parameters
Q = diagm([0.0001, 0.0001])
R = diagm([0.0001, 0.0001])

# Transistion and measurement functions 
function f(x, u, p, t)
    Mα, Mq, Mδe, Zα, Zq, Zδe = p

    P = pitchrate_ss(Mα, Mq, Mδe, Zα, Zq, Zδe, V_true)
    return x + dt * (P.A * x + P.B * u) # Euler integration
end
h(x, u, p, t) = x

# True model to generate data from
P_true = pitchrate_ss(Mα_true, Mq_true, Mδe_true, Zα_true, Zq_true, Zδe_true, V_true)
u = squarewave1.(1 / 3 * t)'
res_true = lsim(P_true, u, t)

# Store SysID data
d = iddata(res_true)

# Estimate the model using subspace method
Pest_subspace = subspaceid(d, 2, zeroD=true)
res_est_subspace = lsim(Pest_subspace, u, t)

# Estimate the model using nonlinear pem
@time pem_result = ControlSystemIdentification.nonlinear_pem(
    d, f, h, p0, zeros(2), Q, R, 1, optimize_x0=false; lower, upper
)
P_pem = pitchrate_ss(pem_result.p..., V_true)
res_pem = lsim(P_pem, u, t)

plot(res_true, lab="true")
plot!(res_est_subspace, lab="subspace")
plot!(res_pem, lab="pem")

The parameter estimates from the nonlinear pem method used above are much better since I fixed the discrete dynamics function as you suggested. It takes around 9s to run as well which is much faster than before. The subspace method still provides the best model match.

To do this, you would start by transforming the estimated model Pd to continuous time using the function Pc = d2c(Pd). Then you'd compute the similarity transform matrix T = inv(Pc.C), and apply this to the model Pc using similarity_transform(Pc, T). This should give you the model on the structure you want. From the matrix coefficients, you can now figure out Mq, Ma, Mδe directly, and you have to solve trivial equations for Zq, Zα, Zδe

What is the best method to get Pd above? And does the similarity matrix force the C matrices to be the same (identity)? For example, is there anyway I can transform the Pest_subspace model above to have an identity C matrix?

Thanks for the help!

baggepinnen commented 4 months ago

Pd is the model returned from subspaceid . If you perform similarity_transform(Pc, T) where T = inv(Pc.C) you will get C = I

Pd = Pest_subspace
Pc = d2c(Pd)
T = inv(Pc.C)
model = similarity_transform(Pc, T)

julia> model
PredictionStateSpace{Continuous, StateSpace{Continuous, Float64}, Matrix{Float64}, Hermitian{Float64, Matrix{Float64}}, Hermitian{Float64, Matrix{Float64}}, Nothing}
A = 
 -4.0485957994781785  -83.17457223750824
  0.9810079262882434   -3.838303760400314
B = 
 -79.75667105910503
   0.7578871258219457
C = 
 1.0  4.440892098500626e-16
 0.0  0.9999999999999998
D = 
 0.0
 0.0

Continuous-time state-space model

baggepinnen commented 4 months ago

Here's a linear graybox identification method, try it out and see if it works on your data

using ComponentArrays, Optim, Optim.LineSearches, Statistics

function structured_pem(
    d,
    nx;
    focus = :prediction,
    p0,
    constructor,
    h = 1,
    metric::F = abs2,
    regularizer::RE = (p, P) -> 0,
    optimizer = BFGS(
        # alphaguess = LineSearches.InitialStatic(alpha = 0.95),
        linesearch = LineSearches.BackTracking(),
    ),
    store_trace = true,
    show_trace = true,
    show_every = 50,
    iterations = 10000,
    allow_f_increases = false,
    time_limit = 100,
    x_tol = 0,
    f_abstol = 1e-16,
    g_tol = 1e-12,
    f_calls_limit = 0,
    g_calls_limit = 0,
    safe = false,
) where {F, RE}
    T = promote_type(eltype(d.y), eltype(p0))
    nu = d.nu
    ny = d.ny
    pred = focus === :prediction
    pd = predictiondata(d)
    sys0, _ = constructor(p0)
    x0i::Vector{T} = if h == 1 || focus === :simulation
        T.(estimate_x0(sys0, d, min(length(d), 10nx)))
    else
        T.(estimate_x0(prediction_error_filter(PredictionStateSpace(sys0, K, 0, 0); h), pd, min(length(pd), 10nx)))
    end
    function predloss(p)
        sysi, Ki, x0 = constructor(p)
        syso = PredictionStateSpace(sysi, Ki, 0, 0)
        Pyh = ControlSystemsBase.observer_predictor(syso; h)
        yh, _ = lsim(Pyh, pd; x0)
        yh .= metric.(yh .- d.y)
        c1 = sum(yh)
        c1 + regularizer(p, syso)
    end
    function simloss(p)
        syssim, _, x0 = constructor(p)
        y, _ = lsim(syssim, d; x0)
        y .= metric.(y .- d.y)
        sum(y) + regularizer(p, syssim)
    end
    local res, sys_opt, K_opt, x0_opt
    res = Optim.optimize(
        pred ? predloss : simloss,
        p0,
        optimizer,
        Optim.Options(;
            store_trace, show_trace, show_every, iterations, allow_f_increases,
            time_limit, x_tol, f_abstol, g_tol, f_calls_limit, g_calls_limit),
        autodiff = :forward,
    )
    sys_opt, K_opt, x0_opt = constructor(res.minimizer)
    sysp_opt = PredictionStateSpace(
        sys_opt,
        pred ? K_opt : zeros(T, nx, ny),
        zeros(T, nx, nx),
        zeros(T, ny, ny),
        zeros(T, nx, ny),
    )
    pred && !isstable(observer_predictor(sysp_opt)) && @warn("Estimated predictor dynamics A-KC is unstable")
    (; sys=sysp_opt, x0=x0_opt, res)
end

function constructor(p, disc=true)
    Mα, Mq, Mδe, Zα, Zq, Zδe = p.p
    V = 30.2
    A = [
        Mq Mα
        (1+Zq / V) Zα/V
    ]
    B = [Mδe; Zδe / V]
    if disc
        return c2d(ss(A, B, I, 0), dt, :fwdeuler), p.K, p.x0
    else
        return ss(A, B, I, 0), p.K, p.x0
    end
end

nx = 2
focus = :prediction
p0ca = ComponentArray(; p=p0, x0 = zeros(nx), K = focus == :simulation ? [] : 1e-10randn(nx,nx))
res = structured_pem(d, nx; focus, p0 = p0ca, constructor, show_every = 1, iterations=300)
model2 = constructor(res.res.minimizer, false)[1]

p_est = res.res.minimizer.p

matthewgcooper commented 4 months ago

Ah that works great doing the similarity transform thanks. And the structured_pem function you provided seems more straightforward to use. Below is my updated code, where I compare the subspace parameter estimates to the nonlinear pem and structured pem estimates:

using ControlSystemIdentification,
    ControlSystems, Plots, Waveforms, LinearAlgebra, LeastSquaresOptim, DataFrames

include("structured_pem.jl")

# Model to identify
function pitchrate_ss(Mα, Mq, Mδe, Zα, Zq, Zδe, V)
    A = [
        Mq Mα
        (1+Zq / V) Zα/V
    ]
    B = [Mδe; Zδe / V]
    return ss(A, B, I, 0)
end

# Time
dt = 0.02
t = 0:dt:10

# True Parameters
Mα_true = -83.17457223750834
Mq_true = -4.0485957994781785
Mδe_true = -80.71377329163104
Zα_true = -115.91677356408941
Zq_true = -0.573560626095269
Zδe_true = 32.13261325199936
p_true = [Mα_true, Mq_true, Mδe_true, Zα_true, Zq_true, Zδe_true]
V_true = 30.2

# Initial parameter estimates and bounds
p0 = [Mα_true, Mq_true, Mδe_true, Zα_true, Zq_true, Zδe_true] .* 1.3
plb = p0 .* 0.5
pub = p0 .* 1.5
lower = [min(x, y) for (x, y) in zip(plb, pub)]
upper = [max(x, y) for (x, y) in zip(plb, pub)]

# Kalman filter parameters
Q = diagm([0.0001, 0.0001])
R = diagm([0.0001, 0.0001])

# Transistion and measurement functions 
function f(x, u, p, t)
    Mα, Mq, Mδe, Zα, Zq, Zδe = p

    P = pitchrate_ss(Mα, Mq, Mδe, Zα, Zq, Zδe, V_true)
    return x + dt * (P.A * x + P.B * u) # Euler integration
end
h(x, u, p, t) = x

# True model to generate data from
P_true = pitchrate_ss(Mα_true, Mq_true, Mδe_true, Zα_true, Zq_true, Zδe_true, V_true)
u = squarewave1.(1 / 3 * t)'
res_true = lsim(P_true, u, t)

# Store SysID data
d = iddata(res_true)

# Estimate the model using subspace method
P_subspace = d2c(subspaceid(d, 2, zeroD=true))
res_est_subspace = lsim(P_subspace, u, t)
T = inv(P_subspace.C)
P_subspace = similarity_transform(P_subspace, T)
p_subspace = [
    P_subspace.A[1, 2],
    P_subspace.A[1, 1],
    P_subspace.B[1],
    P_subspace.A[2, 2] * V_true,
    (P_subspace.A[2, 1] - 1) * V_true,
    P_subspace.B[2] * V_true,
]

# Estimate the model using nonlinear pem
@time pem_result = ControlSystemIdentification.nonlinear_pem(
    d, f, h, p0, zeros(2), Q, R, 1, optimize_x0=false; lower, upper
)
P_pem = pitchrate_ss(pem_result.p..., V_true)
res_pem = lsim(P_pem, u, t)

# Estimate using new structured_pem
nx = 2
focus = :prediction
p0ca = ComponentArray(; p=p0, x0=zeros(nx), K=focus == :simulation ? [] : 1e-10randn(nx, nx))
@time res = structured_pem(d, nx; focus, p0=p0ca, constructor, show_every=1, iterations=2000)
model2 = constructor(res.res.minimizer, false)[1]
p_structured = res.res.minimizer.p
P_structured = pitchrate_ss(p_structured..., V_true)
res_structured = lsim(P_structured, u, t)

percent_error(p_est) = round.(abs.(p_true - p_est) ./ p_true * 100, digits=2)

df = DataFrame(
    "Parameter" => [:Mα, :Mq, :Mδe, :Zα, :Zq, :Zδe],
    "True" => p_true,
    "Subspace" => p_subspace,
    "PEM" => pem_result.p,
    "Structured PEM" => p_structured,
    "Subspace % error" => percent_error(p_subspace),
    "PEM % error" => percent_error(pem_result.p),
    "Structured PEM % error" => percent_error(p_structured),
)
@show df

plot(res_true, lab="true")
plot!(res_est_subspace, lab="subspace")
plot!(res_pem, lab="pem")
plot!(res_structured, lab="structured")

This outputs a data frame showing the following results:

 Row │ Parameter  True         Subspace     PEM         Structured PEM  Subspace % error  PEM % error  Structured PEM % error
     │ Symbol     Float64      Float64      Float64     Float64         Float64           Float64      Float64
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ Mα          -83.1746     -83.1746     -76.2065        -76.4493               -0.0        -8.38                   -8.09
   2 │ Mq           -4.0486      -4.0486      -4.60378        -4.64051              -0.0       -13.71                  -14.62
   3 │ Mδe         -80.7138     -80.7138     -77.5723        -77.957                -0.0        -3.89                   -3.42
   4 │ Zα         -115.917     -115.917     -130.05         -134.306                -0.0       -12.19                  -15.86
   5 │ Zq           -0.573561    -0.573561    -1.11844        -2.96907              -0.0       -95.0                  -417.66
   6 │ Zδe          32.1326      32.1326      20.8862          8.13491               0.0        35.0                    74.68

The subspace method seems great here (when we have no process and measurement noise). The nonlinear pem and structured pem methods seem to have similar error %'s.

So overall you new structured_pem function seems to work, however, the subspace method for this problem is providing the best results.

matthewgcooper commented 4 months ago

If only the first state (pitch rate) was measured, the subspace method returns the following state space:

A = 
 -0.22565433061015883  10.16010002838932
 -9.390248292996485    -7.661245229268344
B = 
   9.573671871535986
 -11.721456784183372
C = 
 -1.9694472473270044  5.2774098575267026
D = 
 0.0

Continuous-time state-space model

Is it possible to use similarity transform to make the C matrix [1 0]? Thanks for the help so far!

baggepinnen commented 4 months ago

Comparing the methods using this artificial data isn't quite realistic, I'm not surprised subspaceid gets 100% fit here.

Is it possible to use similarity transform to make the C matrix [1 0]? Thanks for the help so far!

Unfortunately, you will not be able to infer all the parameters from a black-box fit unless you have direct access to both state variables, i.e., C = I. I'm not even sure whether or not the structured PEM methods are able to identify the parameters correctly in this case, have you verified that the model with C = [1 0] is identifiable? :

using StructuralIdentifiability
ode = @ODEmodel(
    x1'(t) = a11*x1(t) + a12*x2(t) + b1*u(t),
    x2'(t) = a21*x1(t) + a22*x2(t) + b2*u(t),
    y(t) = x1(t)
)
assess_identifiability(ode)

OrderedCollections.OrderedDict{Any, Symbol} with 8 entries:
  x1(t) => :globally
  x2(t) => :nonidentifiable
  a11   => :nonidentifiable
  a12   => :nonidentifiable
  a21   => :nonidentifiable
  a22   => :nonidentifiable
  b1    => :globally
  b2    => :nonidentifiable

In this case, you cannot identify most of the parameters uniquely. You can still get a model that simulates and predicts well using any of the methods, but you will not, in general, get the "correct" parameter values.

baggepinnen commented 4 months ago

If you add

function regularizer(p, _)
    sum(abs2, p.K)
end
res = structured_pem(d, nx; focus, p0 = p0ca, constructor, regularizer, show_every = 1, iterations=300)

and change :fwdeuler in the call to c2d to :zoh, you might get a better model using structured_pem (only applies when using focus = :prediction).

Using :zoh requires the following code as well

using ChainRules, ForwardDiff
using ChainRules: ChainRulesCore
@ForwardDiff_frule LinearAlgebra.exp!(x1::AbstractMatrix{<:ForwardDiff.Dual}) true

matthewgcooper commented 4 months ago

In this case, you cannot identify most of the parameters uniquely. You can still get a model that simulates and predicts well using any of the methods, but you will not, in general, get the "correct" parameter values.

Yes, that makes sense that all the parameters are not identifiable in this case.

I pulled your latest code and re-ran the structured_pem function and I'm getting great results now in < 8 secs with both states observable. I even added unobserved 2nd order actuator dynamics to the model and added measurement noise and the parameter estimates are all < 1% rtol.

Thanks for all your help, and I hope to see structured_pem in the toolbox soon!

baggepinnen commented 4 months ago

PR adding structured_pem

https://github.com/baggepinnen/ControlSystemIdentification.jl/pull/150

baggepinnen commented 4 months ago

Closed by #150

baggepinnen / ControlSystemIdentification.jl

Feature request: linear grey box system identification #149