infiniteopt / InfiniteOpt.jl

An intuitive modeling interface for infinite-dimensional optimization problems.
https://infiniteopt.github.io/InfiniteOpt.jl/stable
MIT License
251 stars 17 forks source link

Support Management with Quadratures other than Trapezoid #84

Closed mzagorowska closed 3 years ago

mzagorowska commented 3 years ago

In brief, if I want to use any other quadrature than trapezoid in the hovercraft example, the constraints with derivatives are not considered in the model. The minimal working example contains 4 cases (to comment/uncomment):

The code with all the cases:

using InfiniteOpt, JuMP, Ipopt
using Plots

# Initialize the model
m = InfiniteModel(optimizer_with_attributes(
    Ipopt.Optimizer,
    "print_level" => 3,
))
#########CASE 1####################################
# Set the parameters, variables, and the objective - one integration over the
# whole period, 61 supports for time, default waypoints
xw = [1 4 6 1; 1 3 0 1] # positions waypoints
tw = [0, 25, 50, 60]    # times

@infinite_parameter(m, t in [0, 60],num_supports=61)

@infinite_variable(m, x[1:2](t), start = 1) # position
@infinite_variable(m, v[1:2](t), start = 0) # velocity
@infinite_variable(m, u[1:2](t), start = 0) # thruster input

@expression(m, Integ, ∫(u[1]^2 + u[2]^2, t, eval_method=GaussLegendre))
@objective(m, Min, Integ)
# ########CASE 2#####################################
# # Set the parameters, variables, and the objective - one integration over the
# # whole period, 4 supports for time, default waypoints
# xw = [1 4 6 1; 1 3 0 1] # positions waypoints
# tw = [0, 25, 50, 60]    # times
#
# @infinite_parameter(m, t in [0, 60],num_supports=4)
#
# @infinite_variable(m, x[1:2](t), start = 1) # position
# @infinite_variable(m, v[1:2](t), start = 0) # velocity
# @infinite_variable(m, u[1:2](t), start = 0) # thruster input
#
# @expression(m, Integ, ∫(u[1]^2 + u[2]^2, t, eval_method=GaussLegendre))
# @objective(m, Min, Integ)
#
# ########CASE 3#######################################
# # Set the parameters, variables, and the objective - integration between the
# # default waypoints, 4 supports for time
# xw = [1 4 6 1; 1 3 0 1] # positions waypoints
# tw = [0, 25, 50, 60]    # times
#
# @infinite_parameter(m, t in [0, 60],num_supports=4)
#
# @infinite_variable(m, x[1:2](t), start = 1) # position
# @infinite_variable(m, v[1:2](t), start = 0) # velocity
# @infinite_variable(m, u[1:2](t), start = 0) # thruster input
#
# # Specify the objective between the default waypoints
# @expression(m, Integ[i=1:length(tw)-1], ∫(u[1]^2 + u[2]^2, t, tw[i], tw[i+1], eval_method=GaussLegendre,num_supports=100))
# @objective(m, Min, Integ[1]+Integ[2]+Integ[3])
# ########CASE 4##########################################
# requires running InfiniteOptDefaultOCP.jl first
# Set the parameters, variables, and the objective - one integration, multiple waypoints
# xw = [x_opt[1]';x_opt[2]']
# tw = supports(t)
#
# @infinite_parameter(m, t in [0, 60],num_supports=61)
#
# @infinite_variable(m, x[1:2](t), start = 1) # position
# @infinite_variable(m, v[1:2](t), start = 0) # velocity
# @infinite_variable(m, u[1:2](t), start = 0) # thruster input
#
# @expression(m, Integ, ∫(u[1]^2 + u[2]^2, t, eval_method=GaussLegendre))
# @objective(m, Min, Integ)
#
######################## REST OF THE MODEL#####################################
# Set the initial conditions
@BDconstraint(m, initial_velocity[i = 1:2](t == 0), v[i] == 0)

# Hit all the waypoints
@BDconstraint(m, [i = 1:2, j = eachindex(tw)](t == tw[j]), x[i] == xw[i, j])

# Derivatives
@constraint(m, [i = 1:2], ∂(x[i], t) == v[i])
@constraint(m, [i = 1:2], ∂(v[i], t) == u[i])

# Optimize the model
optimize!(m)

# Get the results
if has_values(m)
    x_opt = value.(x)
end

# # Plot the results
scatter!(xw[1, :], xw[2, :], label = "Waypoints - quadrature")
plot!(x_opt[1], x_opt[2], label = "Trajectory - quadrature",ls=:dashdot,lw=4,lc = :blue)

Desktop:

I am attaching a document with all the plots I got ( InfiniteOpt_quadratures.pdf). I am not getting any errors when running the cases, but let me know if I should provide more information.

pulsipher commented 3 years ago

Thank you for reaching out and trying InfiniteOpt.jl! The behavior described above can be explained via how the time supports are managed in building the model, and how this behavior differs between using trapezoid rule and Gauss quadrature. In the above example this leads to certain time points not being penalized in the objective which allows large thrust values to occur at those instances leading to a result one would likely not expect. Let me illustrate this with the cases below, notice the added comments on how supports are added and used.

Default Model (using trapezoid rule):

using InfiniteOpt, JuMP, Ipopt

# Waypoints
xw = [1 4 6 1; 1 3 0 1] # positions
tw = [0, 25, 50, 60]    # times

# Initialize the model
m = InfiniteModel(optimizer_with_attributes(Ipopt.Optimizer, "print_level" => 0))

# Set the parameters and variables
@infinite_parameter(m, t in [0, 60], num_supports = 61) # adds 61 supports at [0, 1, ..., 60]
@infinite_variable(m, x[1:2](t), start = 1) 
@infinite_variable(m, v[1:2](t), start = 0)
@infinite_variable(m, u[1:2](t), start = 0)

# Specify the objective
@objective(m, Min, integral(u[1]^2 + u[2]^2, t)) # uses trapezoid rule which will incorporate all supports added before `optimize!`

# Set the constraints (this will incorporate all supports added before `optimize!` is invoked)
@BDconstraint(m, initial_velocity[i = 1:2](t == 0), v[i] == 0) # adds a support 0 to t if not already there (which it is in this case)
@constraint(m, [i = 1:2], deriv(x[i], t) == v[i])
@constraint(m, [i = 1:2], deriv(v[i], t) == u[i])
@BDconstraint(m, [i = 1:2, j = eachindex(tw)](t == tw[j]), x[i] == xw[i, j])

# Optimize the model (will make fresh transcription model formatted as a JuMP model and then solved)
optimize!(m)

# Plot results (not shown here for conciseness)

image Thus, invoking this and plotting obtains the trajectory we expect.

Adding Supports and Using Quadrature (leads to many ignored terms in the objective):

using InfiniteOpt, JuMP, Ipopt

# Waypoints
xw = [1 4 6 1; 1 3 0 1] # positions
tw = [0, 25, 50, 60]    # times

# Initialize the model
m = InfiniteModel(optimizer_with_attributes(Ipopt.Optimizer, "print_level" => 0))

# Set the parameters and variables
@infinite_parameter(m, t in [0, 60], num_supports = 61) # adds 61 supports at [0, 1, ..., 60]
@infinite_variable(m, x[1:2](t), start = 1) 
@infinite_variable(m, v[1:2](t), start = 0)
@infinite_variable(m, u[1:2](t), start = 0)

# Specify the objective using Gauss quadrature 
# Note this will only incorporate the support points it generates and will ignore the 61 supports added above
@objective(m, Min, integral(u[1]^2 + u[2]^2, t, eval_method = Quadrature)) # adds 10 Gauss quadrature nodes to the supports

println(num_supports(t)) # prints 71 (now our model has the 61 supports we added + the 10 Gauss quadrature ones)

# Set the constraints (this will incorporate all 71 supports)
@BDconstraint(m, initial_velocity[i = 1:2](t == 0), v[i] == 0) # would add a support 0 to t if not already there, but this has already been added
@constraint(m, [i = 1:2], deriv(x[i], t) == v[i])
@constraint(m, [i = 1:2], deriv(v[i], t) == u[i])
@BDconstraint(m, [i = 1:2, j = eachindex(tw)](t == tw[j]), x[i] == xw[i, j])

# Optimize the model (will make fresh transcription model formatted as a JuMP model and then solved)
optimize!(m)

# Plot results (not shown here for conciseness)

image Now we obtain a very different result, however it does satisfy all the constraints (including the derivatives). This v shape occurs because we are only penalized to use thrust on the 10 Gauss quadrature points [0.782804144485, 4.04809899933, ..., 59.2171958555] thus the thrust given at these instances is 0 and any amount is given at the other time points without penalty allowing the hovercraft to hit all the needed waypoints. Visually this is shown in the plot below. Notice that all the Gauss time points have 0 thrust and large amounts of thrust are used liberally at the other points. image

Only Use Quadrature Supports:

using InfiniteOpt, JuMP, Ipopt

# Define function to find the closest time point
function findnearest(a, x)
    if isempty(a)
        return
    end
    a = sort(a)
    idx = findfirst(v -> v >= x, a)
    if idx isa Nothing
        return last(a)
    elseif idx == 1
        return a[idx]
    elseif abs(a[idx] - x) <= abs(a[idx - 1] - x)
        return a[idx]
    else
        return a[idx - 1]
    end
end

# Waypoints
xw = [1 4 6 1; 1 3 0 1] # positions
tw = [0, 25, 50, 60]    # times (we will find the closest quadrature points further down)

# Initialize the model
m = InfiniteModel(optimizer_with_attributes(Ipopt.Optimizer, "print_level" => 0))

# Set the parameters and variables
@infinite_parameter(m, t in [0, 60]) # don't add any supports yet
@infinite_variable(m, x[1:2](t), start = 1) 
@infinite_variable(m, v[1:2](t), start = 0)
@infinite_variable(m, u[1:2](t), start = 0)

# Specify the objective using Gauss quadrature and specify we want 61 supports
# Note this will only incorporate the support points it generates and will ignore any others we add elsewhere
@objective(m, Min, integral(u[1]^2 + u[2]^2, t, eval_method = Quadrature, num_supports = 61)) # adds 61 Gauss quadrature nodes to the supports

# Set the constraints (this will incorporate all supports)
@BDconstraint(m, initial_velocity[i = 1:2](t == 0), v[i] == 0) # this adds a support of 0 to t (now we have 62 supports)
@constraint(m, [i = 1:2], deriv(x[i], t) == v[i])
@constraint(m, [i = 1:2], deriv(v[i], t) == u[i])
twc = [findnearest(supports(t), w) for w in tw] # match the waypoint times to the closest support 
@BDconstraint(m, [i = 1:2, j = eachindex(twc)](t == twc[j]), x[i] == xw[i, j])

# Optimize the model (will make fresh transcription model formatted as a JuMP model and then solved)
optimize!(m)

# Plot results (not shown here for conciseness)

image This works as we would intuitively expect since all the supports (except the first) are included in the objective integral. Excluding the first is ok since the initial condition does not correspond to a degree of freedom in the transcripted problem.

Pulling this all together, the takeaway is that quadrature can be used to evaluate integrals, but this should be done with caution knowing that any supports added to the model elsewhere will not be included in the integral approximation since quadrature exacts that the included supports be at particular points that are orthogonal to the polynomial basis. For this reason, we employ trapezoid rule by default since it can and will include all the supports that are added to the model.

I hope this clarifies your issues you were running into. Suggestions and/or contributions to make this behavior more clear in our documentation would be appreciated.

pulsipher commented 3 years ago

After some more thinking, there are 2 potential contributions that could be made to enable measures that use quadrature to consider additional user-specified supports (outside those exacted by the quadrature method):

  1. Add an option for such measures to be broken up into sub-intervals according to the other supports and then apply the appropriate quadrature methods to each sub-interval. (i.e., essentially treat the additional supports as finite elements)
  2. Implement an evaluation method for Gauss quadrature over multiple nodes as described in section 2.3 of https://www.emis.de/journals/ETNA/vol.13.2002/pp119-147.dir/pp119-147.pdf (this is more rigorous, but will require a lot of mathematical work to derive the needed weights and nodes).

Such contributions would automate the consideration of user-defined supports that are used in addition to quadrature nodes.

mzagorowska commented 3 years ago

Hi, thank you, the explanation makes sense. I got so focused on the derivatives that I lost the track of how the integral was calculated!

I agree with your second comment, because I am not exactly happy with the fact that I have to adjust the time points for the waypoints if I want to use the quadrature. I think the expected behaviour would be to keep the waypoints as defined by the user (like I really want the hovercraft to be in the position [4,3] exactly at time 25, not 25.0001 (or whatever the closest value)). I will have a look at that and the best way to implement this behaviour, thanks for the links (also tagging @LucianNita).

For now, though, I think it would be good to add an explanation in the documentation of how the supports are taken into account. I came up with a minimal example to show this behaviour, but didn't have time yet to describe it properly so I could make a pull request. The outline would be along the lines:

# Create a model, with one variable and an infinite parameter with a given number of supports
m = InfiniteModel()
@infinite_parameter(m, t in [0, 10],num_supports = 11)
@infinite_variable(m, u(t))
# Create an objective function with the default trapezoid integration
@objective(m,Min,integral(u^2, t))
# Get the transcribed model
build_optimizer_model!(m)
mBT1 = optimizer_model(m)

And here show how many supports there are, how the variable u is transcribed, and how the objective function of the transcribed model looks like:

supports(t) ###gives: [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
transcription_variable(u)  ###VariableRef[u(support: 1), u(support: 2), u(support: 3), u(support: 4), u(support: 5), u(support: 6), u(support: 7), u(support: 8), u(support: 9), u(support: 10), u(support: 11)]
objective_function(mBT1) ###gives  0.5 u(support: 1)² + u(support: 2)² + u(support: 3)² + u(support: 4)² + u(support: 5)² + u(support: 6)² + u(support: 7)² + u(support: 8)² + u(support: 9)² + u(support: 10)² + 0.5 u(support: 11)²

All the supports are used in both the objective function and the transcribed variable. Then we can readjust the model:

set_objective_function(m,integral(u^2, t,eval_method=Quadrature))
build_optimizer_model!(m)
mBT2 = optimizer_model(m)

and show the same things agains: the number of supports, the transcription of u and the new objective function:

supports(t) ###gives: [0.0, 0.130467357414, 0.674683166555, 1.0, 1.6029521585, 2.0, 2.83302302935, 3.0, 4.0, 4.25562830509, 5.0, 5.74437169491, 6.0, 7.0, 7.16697697065, 8.0, 8.3970478415, 9.0, 9.32531683344, 9.86953264259, 10.0]
transcription_variable(u)  ###VariableRef[u(support: 1), u(support: 2), u(support: 3), u(support: 4), u(support: 5), u(support: 6), u(support: 7), u(support: 8), u(support: 9), u(support: 10), u(support: 11), u(support: 12), u(support: 13), u(support: 14), u(support: 15), u(support: 16), u(support: 17), u(support: 18), u(support: 19), u(support: 20), u(support: 21)]
objective_function(mBT1) ###gives  0.33335672154344104 u(support: 2)² + 0.7472567457529024 u(support: 3)² + 1.0954318125799103 u(support: 5)² + 1.346333596549982 u(support: 7)² + 1.4776211235737642 u(support: 10)² + 1.4776211235737642 u(support: 12)² + 1.346333596549982 u(support: 15)² + 1.0954318125799103 u(support: 17)² + 0.7472567457529024 u(support: 19)² + 0.33335672154344104 u(support: 20)²

Here the integral uses only a subset of supports whereas the infinite variables is transcribed using all of the supports. I also have a plot for that, but I am not exactly happy with it at the moment: image

This will not happen if we let the integral do all the supports and no supports are defined explicitly:

@infinite_parameter(m, t in [0, 10])
@infinite_variable(m, u(t))
@objective(m,Min,integral(u^2, t,eval_method=Quadrature))

Then we get the same supports for the infinite variable and the integral:

supports(t) ###gives: [0.130467357414, 0.674683166555, 1.6029521585, 2.83302302935, 4.25562830509, 5.74437169491, 7.16697697065, 8.3970478415, 9.32531683344, 9.86953264259]
transcription_variable(u)  ###VariableRef[u(support: 1), u(support: 2), u(support: 3), u(support: 4), u(support: 5), u(support: 6), u(support: 7), u(support: 8), u(support: 9), u(support: 10)]
objective_function(mBT1) ###gives  0.33335672154344104 u(support: 1)² + 0.7472567457529024 u(support: 2)² + 1.0954318125799103 u(support: 3)² + 1.346333596549982 u(support: 4)² + 1.4776211235737642 u(support: 5)² + 1.4776211235737642 u(support: 6)² + 1.346333596549982 u(support: 7)² + 1.0954318125799103 u(support: 8)² + 0.7472567457529024 u(support: 9)² + 0.33335672154344104 u(support: 10)²

Therefore, using quadratures other than trapezoid requires a bit of care if there are user-defined supports in the problem. This explanation would take care of the issue that I ran into with the optimal control example - there are user defined supports because of the waypoints, so using a quadrature other than trapezoid is not straightforward.

The explanation is quite a mouthful, so let me know what you think and I can work on including it in the documentation (without the plot, unless I come up with something nicer).

pulsipher commented 3 years ago

I am glad I could clear things up. (I'll edit the title of this issue as a question accordingly for others to find)

With regard to contributing a method(s) to account for user-defined supports, I agree that this is a worthwhile addition. Discussion on this further can be done in #86 where Tillmann has kindly provided the basic mathematical information for us.

With regard to the proposed documentation contribution, I think that would be a great addition! I suggest placing this contribution on the measure page within the "Evaluation Methods" section in its own subsection called "A Note on Support Management." The above draft looks like a good starting point to me. I would also suggest mentioning this behavior is also true when using MC sampling to evaluate an integral. Once you have a contribution ready (or close to ready), kindly make a pull request and we can continue the conversation there.

pulsipher commented 3 years ago

As an update, with #114 we now have a finite element Lobatto quadrature method FEGaussLobatto() that decomposes the integral over finite elements dictated by user-defined supports to avoid the confusion discussed above. This is also the new default for finite intervals when Quadrature() is specified.