MIT-REALM / neural_clbf

Toolkit for learning controllers based on robust control Lyapunov barrier functions
BSD 3-Clause "New" or "Revised" License

Implement Cartpole #26

Open dtch1997 opened 1 year ago

dtch1997 commented 1 year ago

Hi there @dawsonc, I've got a mostly functional version of Cartpole that I believe would be a nice addition to this repo, and I would like to request some help with merging it in.

I have implemented Cartpole following these equations of motion, which are also used by the OpenAI gym implementation.

After some algebraic manipulation, I derived the control-affine form of these equations, which is implemented in neural_clbf/systems/cartpole.py. Running that file checks that the closed-loop dynamics computed from the control-affine form match the full dynamics for several randomly sampled states and controls.
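For readers following along, the kind of consistency check described above can be sketched as follows. This is a minimal standalone illustration, not the code in neural_clbf/systems/cartpole.py: the function names `full_dynamics` and `control_affine` are hypothetical, the state ordering `[x, x_dot, theta, theta_dot]` and the parameter values are assumed to follow the Gym CartPole defaults, and the split into f(x) and g(x) relies only on the equations of motion being linear in the applied force.

```python
import numpy as np

# Assumed parameters (Gym CartPole defaults; illustrative only).
GRAVITY = 9.8
MASS_CART = 1.0
MASS_POLE = 0.1
TOTAL_MASS = MASS_CART + MASS_POLE
HALF_LENGTH = 0.5  # half the pole length
POLEMASS_LENGTH = MASS_POLE * HALF_LENGTH


def full_dynamics(state, force):
    """State derivative from the Gym-style equations of motion."""
    _, x_dot, theta, theta_dot = state
    sin, cos = np.sin(theta), np.cos(theta)
    temp = (force + POLEMASS_LENGTH * theta_dot**2 * sin) / TOTAL_MASS
    denom = HALF_LENGTH * (4.0 / 3.0 - MASS_POLE * cos**2 / TOTAL_MASS)
    theta_acc = (GRAVITY * sin - cos * temp) / denom
    x_acc = temp - POLEMASS_LENGTH * theta_acc * cos / TOTAL_MASS
    return np.array([x_dot, x_acc, theta_dot, theta_acc])


def control_affine(state):
    """Return (f, g) such that the state derivative equals f + g * force."""
    _, x_dot, theta, theta_dot = state
    sin, cos = np.sin(theta), np.cos(theta)
    denom = HALF_LENGTH * (4.0 / 3.0 - MASS_POLE * cos**2 / TOTAL_MASS)
    temp0 = POLEMASS_LENGTH * theta_dot**2 * sin / TOTAL_MASS  # force-free part
    # Drift terms (force = 0).
    theta_acc_f = (GRAVITY * sin - cos * temp0) / denom
    x_acc_f = temp0 - POLEMASS_LENGTH * theta_acc_f * cos / TOTAL_MASS
    # Coefficients multiplying the force.
    theta_acc_g = -cos / (TOTAL_MASS * denom)
    x_acc_g = 1.0 / TOTAL_MASS - POLEMASS_LENGTH * theta_acc_g * cos / TOTAL_MASS
    f = np.array([x_dot, x_acc_f, theta_dot, theta_acc_f])
    g = np.array([0.0, x_acc_g, 0.0, theta_acc_g])
    return f, g


def check_consistency(n_samples=100, seed=0):
    """Compare f + g * u against the full dynamics at random states and controls."""
    rng = np.random.default_rng(seed)
    for _ in range(n_samples):
        state = rng.uniform(-1.0, 1.0, size=4)
        force = rng.uniform(-10.0, 10.0)
        f, g = control_affine(state)
        if not np.allclose(f + g * force, full_dynamics(state, force)):
            return False
    return True
```

Calling `check_consistency()` should return `True` if the two forms agree at every sampled point; a mismatch usually points to a sign or factor slip in the algebra.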

I have also implemented an associated training script, but I invariably run into infeasible QPs, as follows:

$ python neural_clbf/training/train_cartpole.py --n-epochs 1
...
    raise SolverError("Solver scs returned status %s" % status)
diffcp.cone_program.SolverError: Solver scs returned status infeasible

I'm struggling to figure out the cause and a bit of advice would be much appreciated. Thanks very much!

dtch1997 commented 1 year ago

Reducing the state limits seems to have fixed it somewhat. Experiment ongoing, viewable at https://wandb.ai/dtch1997/NeuralCLBF/runs/th3qiqh6?workspace=user-dtch1997
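One possible explanation for why shrinking the state limits helps: with a bounded control input, a Lyapunov-style decrease condition of the form LfV + LgV·u + λV ≤ 0 may simply have no admissible solution at states far from the goal. The sketch below computes, in closed form, the smallest relaxation needed to make that single constraint satisfiable under box limits on u. Everything here is an assumption for illustration: the function name `min_clf_slack`, the constraint form, and the default `clf_lambda` are hypothetical and are not taken from the repository's QP formulation.

```python
import numpy as np


def min_clf_slack(V, LfV, LgV, u_min, u_max, clf_lambda=1.0):
    """Smallest r >= 0 such that LfV + LgV @ u + clf_lambda * V <= r
    holds for some u with u_min <= u <= u_max (elementwise).

    The left-hand side is linear in u, so it is minimized by taking the
    lower bound where the coefficient is positive and the upper bound otherwise.
    """
    LgV = np.asarray(LgV, dtype=float)
    u_min = np.asarray(u_min, dtype=float)
    u_max = np.asarray(u_max, dtype=float)
    u_best = np.where(LgV > 0, u_min, u_max)
    best_lhs = LfV + LgV @ u_best + clf_lambda * V
    return max(0.0, float(best_lhs)), u_best


# Hypothetical numbers: a large drift term that the bounded control cannot cancel.
slack, u = min_clf_slack(V=1.0, LfV=5.0, LgV=[1.0], u_min=[-2.0], u_max=[2.0])
```

A strictly positive slack at a sampled state means the unrelaxed constraint cannot be met there with the given control limits, which would be consistent with the solver reporting infeasibility. Evaluating this over the sampled training region is one way to see whether the state limits or the control limits are the bottleneck.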

dawsonc commented 1 year ago

Thanks for your interest in contributing to the project!

I'm happy to support merging this system in, but I'd first want to see some evidence that you can get training working with the new model (for at least one of the types of neural certificate, e.g. neural CBF or CLBF, with bonus points for extra demonstrations). For posterity, would you mind attaching the plots and simulation results showing successful training to this PR when they are available?

PS I get a 404 at that link you provided.

dtch1997 commented 1 year ago

Thanks for getting back to me!

Sorry about the messy state of the PR. I went somewhat overboard trying various changes to improve performance which didn't really pan out.

I'll revert to the working version and fix the WandB link soon.