Solver yields zero solution

jviereck commented 6 months ago

When running the SolverCSQP in simulation from a fixed starting position, I observe that the resulting cost is the same but the resulting trajectory is different. Also, I noticed that in 1 of 20 runs the solution is zero (the resulting xs / us trajectory is zero beside the initial configuration). See my code below.

Should the results from the solver for same problem and initial configuration be the same? Has someone else observed the zero result? My feeling is there is some memory that is accessed from C++ that was not initialized or is freed already.

Happy to share my setup. What I try to do is optimize a double pendulum to swing upwards and balance.

Let me know if this rings a bell.

Thank you!

Results from runs

Run 1 (zero output)

Note how the output does not include the information about the solver (cost, merit, ...).

time ms: 0.40841102600097656

array([[-0.044, -5.799, -1.667, -5.832,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ]])

Run 2

time ms: 0.6632804870605469
iter     merit        cost       ||gaps||   ||Constraint||  ||(dx,du)||     step   KKT criteria  QP iters 
   1  2.0070e+02   1.6815e+01   1.2940e+01    5.4490e+00    7.28867e+01    0.0312      ----        100

array([[-0.044, -5.799, -1.667, -5.832, -0.118],
       [-0.011, -0.161, -0.915,  2.033, -0.171],
       [-0.033, -0.106, -2.233,  5.494, -0.083],
       [-0.062, -0.034, -2.884,  7.15 ,  0.047],
       [-0.087,  0.027, -2.547,  6.126,  0.076],
       [-0.107,  0.073, -2.01 ,  4.589,  0.045],
       [-0.125,  0.111, -1.734,  3.795,  0.033],
       [-0.14 ,  0.144, -1.57 ,  3.329,  0.093],
       [-0.15 ,  0.16 , -0.947,  1.617,  0.07 ],
       [-0.155,  0.165, -0.517,  0.442,  0.042],
       [-0.158,  0.163, -0.308, -0.122, -0.009]])

Run 3

time ms:iter     merit        cost       ||gaps||   ||Constraint||  ||(dx,du)||     step   KKT criteria  QP iters 
   1  2.0070e+02   1.6815e+01   1.2940e+01    5.4490e+00    2.81114e+02    0.0039      ----        100
 0.6461143493652344

array([[-0.044, -5.799, -1.667, -5.832,  0.044],
       [ 0.003, -0.031,  0.316, -0.843,  0.053],
       [ 0.01 , -0.052,  0.741, -2.042, -0.063],
       [ 0.013, -0.059,  0.258, -0.753, -0.066],
       [ 0.011, -0.053, -0.242,  0.579, -0.097],
       [ 0.001, -0.027, -0.988,  2.592, -0.069],
       [-0.015,  0.013, -1.525,  4.045, -0.   ],
       [-0.03 ,  0.054, -1.538,  4.079,  0.094],
       [-0.038,  0.075, -0.829,  2.149,  0.084],
       [-0.04 ,  0.08 , -0.206,  0.47 ,  0.029],
       [-0.041,  0.08 , -0.02 , -0.003, -0.014]])

Code

rmodel, collision_model, visual_model = pin.buildModelsFromUrdf(
    'modeling/pendulum.urdf', 'modeling/'
)
rdata = rmodel.createData()

state = crocoddyl.StateMultibody(rmodel)
actuation = crocoddyl.ActuationModelFull(state)
nu = actuation.nu

T = 11
dt = 0.01
x0 = np.zeros(4)

running_models = []
constraintModels = []
for t in range(T + 1):
    costModel = crocoddyl.CostModelSum(state, nu)

    stateWeights = np.array([10, 10, 1e-4, 1e-4])
    ctrlWeights = np.array([5e1, 5e10])

    stateResidual = crocoddyl.ResidualModelState(state, np.zeros(4), nu)
    stateActivation = crocoddyl.ActivationModelWeightedQuad(stateWeights**2)
    stateReg = crocoddyl.CostModelResidual(state, stateActivation, stateResidual)

    ctrlResidual = crocoddyl.ResidualModelControl(state, nu)
    ctrlActivation = crocoddyl.ActivationModelWeightedQuad(ctrlWeights**2)
    ctrlReg = crocoddyl.CostModelResidual(state, ctrlActivation, ctrlResidual)

    costModel.addCost("stateReg", stateReg, 1.)
    costModel.addCost("ctrlReg", ctrlReg, 5e-1)      

    constraints = crocoddyl.ConstraintModelManager(state, actuation.nu)

    stateBounds = np.array([np.inf, 0.35, np.inf, np.inf])
    constraints.addConstraint('stateBound', 
      crocoddyl.ConstraintModelResidual(state, stateResidual, -stateBounds, stateBounds))

    constraints.addConstraint('ctrlBound', 
      crocoddyl.ConstraintModelResidual(state, ctrlResidual, np.array([-0.15, -np.inf]), np.array([0.15, np.inf])))

    rmodel, collision_model, visual_model = pin.buildModelsFromUrdf(
    'modeling/pendulum.urdf', 'modeling/'
)
rdata = rmodel.createData()

state = crocoddyl.StateMultibody(rmodel)
actuation = crocoddyl.ActuationModelFull(state)
nu = actuation.nu

T = 11
dt = 0.01
x0 = np.zeros(4)

running_models = []
constraintModels = []
for t in range(T + 1):
    costModel = crocoddyl.CostModelSum(state, nu)

    stateWeights = np.array([10, 10, 1e-4, 1e-4])
    ctrlWeights = np.array([5e1, 5e10])

    stateResidual = crocoddyl.ResidualModelState(state, np.zeros(4), nu)
    stateActivation = crocoddyl.ActivationModelWeightedQuad(stateWeights**2)
    stateReg = crocoddyl.CostModelResidual(state, stateActivation, stateResidual)

    ctrlResidual = crocoddyl.ResidualModelControl(state, nu)
    ctrlActivation = crocoddyl.ActivationModelWeightedQuad(ctrlWeights**2)
    ctrlReg = crocoddyl.CostModelResidual(state, ctrlActivation, ctrlResidual)

    costModel.addCost("stateReg", stateReg, 1.)
    costModel.addCost("ctrlReg", ctrlReg, 5e-1)      

    constraints = crocoddyl.ConstraintModelManager(state, actuation.nu)

    stateBounds = np.array([np.inf, 0.35, np.inf, np.inf])
    constraints.addConstraint('stateBound', 
      crocoddyl.ConstraintModelResidual(state, stateResidual, -stateBounds, stateBounds))

    constraints.addConstraint('ctrlBound', 
      crocoddyl.ConstraintModelResidual(state, ctrlResidual, np.array([-0.15, -np.inf]), np.array([0.15, np.inf])))

    if t == T:
        xTarget = np.array([0.150, 0.150, 0.5, 0.5])
        constraints.addConstraint('endStateBound', crocoddyl.ConstraintModelResidual(state, stateResidual, -xTarget, xTarget))

    dmodel = crocoddyl.DifferentialActionModelFreeFwdDynamics(state, actuation, costModel, constraints)
    model = crocoddyl.IntegratedActionModelEuler(dmodel, dt)

    running_models += [model]

problem = crocoddyl.ShootingProblem(x0, running_models[:-1], running_models[-1])
solver = mim_solvers.SolverCSQP(problem)
solver.termination_tolerance = 1e-4
solver.with_callbacks = True 

xs = [np.array([0, 0, 0, 0])] * (T + 1)
us = [np.zeros(2)] * T

solver.problem.x0 = np.array([-0.044, -5.799, -1.667, -5.832])
tic = time.time()
solver.max_qp_iters = 100
solver.solve(xs, us, 1)

print('time ms:', 1000 * (time.time() - tic))

xs = solver.xs
us = np.array(solver.us)

plotPendulum(xs)
np.hstack([xs[:-1], us[:, :1]])
        xTarget = np.array([0.150, 0.150, 0.5, 0.5])
        constraints.addConstraint('endStateBound', crocoddyl.ConstraintModelResidual(state, stateResidual, -xTarget, xTarget))

    dmodel = crocoddyl.DifferentialActionModelFreeFwdDynamics(state, actuation, costModel, constraints)
    model = crocoddyl.IntegratedActionModelEuler(dmodel, dt)

    running_models += [model]

problem = crocoddyl.ShootingProblem(x0, running_models[:-1], running_models[-1])
solver = mim_solvers.SolverCSQP(problem)
solver.termination_tolerance = 1e-4
solver.with_callbacks = True 

xs = [np.array([0, 0, 0, 0])] * (T + 1)
us = [np.zeros(2)] * T

solver.problem.x0 = np.array([-0.044, -5.799, -1.667, -5.832])
tic = time.time()
solver.max_qp_iters = 100
solver.eps_aps=1e-6
solver.eps_rel=0
solver.solve(xs, us, 1)

print('time ms:', 1000 * (time.time() - tic))

xs = solver.xs
us = np.array(solver.us)

print(np.hstack([xs[:-1], us[:, :1]]))

avadesh02 commented 6 months ago

Hi @jviereck, Could you share a basic setup of the code that can reproduce your bug (the urdf file you are using and the entire example file you posted above)? I can try to investigate the issue.

jviereck commented 5 months ago

Hi @avadesh02 , thank you for your response - sorry, I was on holidays and my reply delayed.

I played around a bit with the setup. I found that the default DDP/BoxDDP solver from crocoddyle also yields different results for the same input. My feeling is therefore that this problem is due to my setup and less due to your solver.

Closing this issue for now as the problem seems to be independent from the mim_solver package.

machines-in-motion / mim_solvers