JuliaSymbolics / Symbolics.jl

Symbolic programming for the next generation of numerical software
https://docs.sciml.ai/Symbolics/stable/

Detailed control over computational graphs #435

Open baggepinnen opened 2 years ago

baggepinnen commented 2 years ago

Symbolics may be used to construct a computational graph, code for which can be generated by build_function. I have run into a number of hurdles trying to pursue this approach that I will list here, with an example, in the hope that someone can point me to better ways of going about this business, or identify areas of improvement for Symbolics.

The first point is that all symbolic tracing using Symbolics is eager. This can very quickly lead to expressions too large for the compiler and rewriters to handle; the example below highlights this.

One potential remedy to the point above could be to @register functions such that they are not symbolically traced, at least initially. This is made difficult by @register not appearing to create the expected methods for vector-valued arguments (see the commented-out attempts below).

The example below follows https://github.com/casadi/casadi/blob/master/experimental/blocksqp_multiple_shooting.py, building a computational graph for a multiple-shooting optimization problem. No matter how small I make the problem, the call to @time jacfun = build_function(A, w, expression = Val(true)); takes an enormous amount of time to finish. If it does finish, the generated function cannot be called without my machine running out of memory.

using StaticArrays
using Statistics, LinearAlgebra
using ModelingToolkit, Symbolics

function rk4(f::F, l::LT, Ts) where {F, LT}
    function (x, u, t)
        f1, L1 = f(x, u, t), l(x, u, t)
        f2, L2 = f(x + Ts / 2 * f1, u, t + Ts / 2), l(x + Ts / 2 * f1, u, t + Ts / 2)
        f3, L3 = f(x + Ts / 2 * f2, u, t + Ts / 2), l(x + Ts / 2 * f2, u, t + Ts / 2)
        f4, L4 = f(x + Ts * f3, u, t + Ts), l(x + Ts * f3, u, t + Ts)
        x += Ts / 6 * (f1 + 2 * f2 + 2 * f3 + f4)
        L  = Ts / 6 * (L1 + 2 * L2 + 2 * L3 + L4)
        return x, L
    end
end

function dynamics(x, u, _)
    mc, mp, l, g = 1.0, 0.2, 0.5, 9.81
    q  = x[SA[1, 2]]
    qd = x[SA[3, 4]]
    s = sin(q[2])
    c = cos(q[2])
    H = @SMatrix [mc+mp mp*l*c; mp*l*c mp*l^2]
    C = @SMatrix [0 -mp*qd[2]*l*s; 0 0]
    G = @SVector [0, mp * g * l * s]
    B = @SVector [1, 0]
    qdd = -H \ (C * qd + G - B * u[1])
    return [qd; qdd]
end

function loss(x, u, t)
    c = x'Q1*x + u'Q2*u
    # c = dot(x, Q1, x) + dot(u, Q2, u) # this formulation threw an error for me
    return c
end

function final_cost(x)
    x'Q1*x 
end

nu = 1 
nx = 4 
Ts = 0.02 # sample time
N = 2  # Time horizon (set very small to not take too long time generating symbolic functions) realistic N are in the hundreds
x0 = zeros(nx) # Initial state
x0[1] = 3 #  pos
x0[2] = pi*0.5 #  angle
xr = zeros(nx) # reference state

Q1 = diagm(Float64[1, 1, 1, 1]) # cost matrix
Q2 = Ts * diagm(ones(nu))        # cost matrix

#  limits
umin = -10 * ones(nu)
umax = 10 * ones(nu)
xmin = -50 * ones(nx)
xmax = 50 * ones(nx)

# @register dynamics(x::Vector{Num}, u::Vector{Num}, t::Int)::Vector{Num} # I tried this, but it doesn't actually create any new methods
# @register loss(x::Vector{Num}, u::Vector{Num}, t)::Vector{Num}

discrete = rk4(dynamics, loss, Ts) # discretize the loss integral and continuous dynamics

## Build symbolic representation of optimization problem.
w   = [] # variables
w0  = [] # initial guess
lbw = [] # lower bound on w
ubw = [] # upper bound on w
g = []   # equality constraints

L = 0 # loss value
@variables x[1:nx](1) # initial value variable
x = collect(x) # Symbolic arrays throw errors in many places

append!(w, x)
append!(w0, x0)
append!(lbw, x) # Initial state is fixed
append!(ubw, x)

for n = 1:N # for whole time horizon N
    global x, L 
    @variables u[1:nu](n)
    u = collect(u)
    append!(w, u)
    append!(w0, 0) # TODO: add u0
    append!(lbw, umin)
    append!(ubw, umax)
    xp, l = discrete(x, u, x[1])
    L += l

    @variables x[1:4](n+1) # x in next time point
    x = collect(x) 
    append!(w, x)
    append!(w0, zeros(nx)) # TODO: add warmstart
    append!(lbw, xmin)
    append!(ubw, xmax)
    append!(g, xp .- x) # propagated x is x in next time point

    L += final_cost(x)
end

##

@time A = Symbolics.sparsejacobian(g, w); # These calls are reasonably fast, but can take up to a minute for problems of realistic size
@time dw = Symbolics.gradient(L, w)
@time H = Symbolics.sparsehessian(L, w)

@time jacfun = build_function(A, w, expression = Val(true)); # takes forever
# 222.257406 seconds (963.52 M allocations: 40.937 GiB, 15.66% gc time, 0.41% compilation time)
# @time jacfun(w0) # out of memory error after 30 seconds
@time hessfun = build_function(H, w, expression = Val(false))

Lfun = build_function([L], w, expression = Val(false))
lbfun = build_function(lbw, w, expression = Val(false)) # these work fine
ubfun = build_function(ubw, w, expression = Val(false))

res = lbfun[1](w0)
@test res[1:nx] == w0[1:nx] # passes
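For comparison, here is a minimal sketch of what registration looks like for a scalar function, using the @register_symbolic macro from later Symbolics releases (the older spelling was @register). The function and variables here are hypothetical illustrations; the scalar case works, and it is exactly the vector-valued case above that does not:

```julia
using Symbolics

# Hypothetical scalar helper; not part of the problem above.
my_cost(a, b) = a^2 + b^2

# Register it so Symbolics keeps calls as opaque nodes instead of tracing through.
@register_symbolic my_cost(a, b)

@variables x y
ex = my_cost(x, y) + x               # my_cost(x, y) stays an unexpanded call
f = eval(build_function(ex, x, y))
f(1.0, 2.0)                          # 6.0
```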


baggepinnen commented 2 years ago

I've found that you can "register" symbolic functions using @syms in addition to @register. The following tries to use that to generate a computational graph that includes function calls. I'm struggling to get this approach to work nicely as well, mostly, it seems, due to scoping issues:

n = 2
A = randn(n,n)
@variables x[1:n]

foo(x) = A*x # a function to represent symbolically

@syms foo(x::Symbolics.Arr{Num, 1})::Symbolics.Arr{Num, 1} # This fails with invalid redefinition of constant foo
@syms baz(x::Symbolics.Arr{Num, 1})::Symbolics.Arr{Num, 1}
# Problem: one cannot create a function variable with the same name as an existing function in global scope

foo(x) # this produces the desired symbolic multiplication, where `A` is captured by reference
baz(x) # This creates the completely symbolic baz(x[1:2])

# Try to make use of the symbolic call to a function
function symbolic_call(n)
    @variables x[1:n]
    @syms foo(x::Symbolics.Arr{Num, 1})::Symbolics.Arr{Num, 1} # I can now create foo since I'm in a local scope
    foo(x) # return a symbolic call to foo
end

x0 = randn(n)
ex = symbolic_call(n)
fun_genf = build_function(ex, x, expression=Val{false})
fun_genf(x0) # UndefVarError: foo not defined
# Problem: One cannot call the function generated with expression=Val{false} due to scoping issues?

# Generate an expression instead and eval it manually
fun = build_function(ex, x, expression=Val{true})
fun_eval = eval(fun)
@test fun_eval(x0) == A*x0 # Works, nice :)

# Try to provide the hidden argument `expression_module` to solve the scoping issue
fun_genf = build_function(ex, x, expression=Val{false}, expression_module=Main) # UndefVarError: #_RGF_ModTag not defined

## Derivatives

Symbolics.jacobian(foo(x), x) # Only zeros :(
Symbolics.jacobian(ex , x) # MethodError: no method matching jacobian(::SymbolicUtils.Term{Symbolics.Arr{Num, 1}, Nothing}, ::Symbolics.Arr{Num, 1})
# Problem: Does not understand that ex represents an array variable

Problem summary:

Some of the code above, in the form of @test and @test_skip, as well as a fix for the first jacobian case above, is available in
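As an aside, for scalar registered functions Symbolics documents a way to supply the derivative of an opaque call by overloading Symbolics.derivative, which sidesteps the zeros-Jacobian problem at least in the scalar case. A sketch with a hypothetical function g (signature per the Symbolics function-registration docs):

```julia
using Symbolics

# Hypothetical scalar function, registered so it is not traced.
g(a) = 3a^2
@register_symbolic g(a)

# Supply the partial derivative with respect to the first argument.
Symbolics.derivative(::typeof(g), args::NTuple{1, Any}, ::Val{1}) = 6 * args[1]

@variables x
dg = Symbolics.derivative(g(x), x)   # 6x instead of 0
```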

JTaets commented 2 years ago

I think the problem that tracing is too eager should not be solved by correctly using registered functions. Instead, build_function itself should not be so eager.

using Symbolics

function func(x; i)
    y = 0
    for _ in 1:i
        y += sqrt(y + x)
    end
    y
end

@variables x
func(x,i=3) #sqrt(x) + sqrt(x + sqrt(x)) + sqrt(x + sqrt(x) + sqrt(x + sqrt(x)))
y = func(x,i=20) 
f = build_function(y,x) # don't even try to run this. This will crash your computer

The built function should be something like

function(x)
    y1 = sqrt(x)
    y2 = y1 + sqrt(y1 + x)
    y3 = y2 + sqrt(y2 + x)
    ...
    return  y20
end

instead of sqrt(x) + sqrt(x + sqrt(x)) + sqrt(x + sqrt(x) + sqrt(x + sqrt(x))) + ... because the latter does way too many calculations.
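To see why, a hand-staged version for i = 3 (written by hand here, not generated) computes the same value while evaluating each partial sum only once:

```julia
# Fully unrolled expression for i = 3, as produced by tracing:
func_unrolled(x) = sqrt(x) + sqrt(x + sqrt(x)) + sqrt(x + sqrt(x) + sqrt(x + sqrt(x)))

# Staged form: each partial sum is bound once and reused.
function func_staged(x)
    y1 = sqrt(x)             # first partial sum
    y2 = y1 + sqrt(y1 + x)   # reuses y1 twice
    y3 = y2 + sqrt(y2 + x)   # reuses y2 twice
    return y3
end

func_unrolled(2.0) ≈ func_staged(2.0)  # true
```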

Things like that would also allow you to directly trace ODE solves with a constant timestep, etc.

I'm pretty sure CasADi does this by emitting each operation on a separate line.

function(x)
    z1 = sqrt(x) # This is y1
    z2 = z1 + x
    z3 = sqrt(z2)
    z4 = z1 + z3 # This is y2
    ...
end

Making the derivative function of this function is also pretty easy (you need an extra line for each line in the function)

baggepinnen commented 2 years ago

Registering functions will still be required, since there may be functions you just don't want to trace at all, either because they are way too large, or because they contain non-traceable code (C code etc.). If you set i = 10_000_000 in your example, the compiler would not handle the unrolled code; the only way to get it to compile would be to keep it as a loop.

JTaets commented 2 years ago

I agree! Maybe this issue wasn't the right place to make that remark.

My point was more that registering functions should be avoided as much as possible, and that they shouldn't be a requirement for an optimal control problem like this in the future. But of course, right now Symbolics.jl's build_function isn't at that level of maturity yet, so it is required. Even for my particular case, in the future, something similar to IfElse.ifelse might be possible but then with for-loops, which would generate functions that also contain a for loop.

In the far future, registering functions should imo only be needed when a smart derivative can be supplied without going through all the internal code, as with sin(x), or with C code like you said.

ChrisRackauckas commented 2 years ago

The built function should be something like

That's what the CSE functionality is doing.
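A minimal sketch of that, assuming the cse keyword that recent Symbolics releases expose on build_function:

```julia
using Symbolics

@variables x
y = sqrt(x) + sqrt(x + sqrt(x))   # sqrt(x) occurs twice

# cse = true asks the code generator to bind repeated subexpressions once.
f = eval(build_function(y, x; cse = true))
f(2.0)                            # same value as evaluating y directly
```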

IR and Symbolics have different use cases. Use the right representation for the right purpose and you will be rewarded.