Add in some small number of samples around zero-mean

I talked to @tkkim-robot about the issues in #80, since he is the first author of the SMPPI paper, and he mentioned in passing the idea of adding some zero-mean samples to our optimization problem. After more research, I think this is a good idea.

Algorithm 1 on page 8 of this paper introduces the idea of having some trajectories of just noise values without adding in the previous optimal control.

This enables the IT-MPC algorithm to reset itself if a disturbance destroys the effectiveness of the previously computed control sequence.
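As a rough illustration of that idea, here is a minimal Python sketch (not the mppic C++ code). It assumes a 1-D control sequence and Gaussian noise; `sample_controls`, `num_zero_mean`, and `sigma` are hypothetical names chosen for this example. Most candidates are noise added to the previous optimal control, while a small number are noise alone.

```python
import random


def sample_controls(prev_optimal, num_samples, num_zero_mean, sigma, rng):
    """Return num_samples candidate control sequences.

    The first (num_samples - num_zero_mean) candidates are the previous
    optimal control plus Gaussian noise. The remaining num_zero_mean
    candidates are pure zero-mean noise, with no warm start.
    """
    if not 0 <= num_zero_mean <= num_samples:
        raise ValueError("num_zero_mean must be between 0 and num_samples")

    num_warm = num_samples - num_zero_mean
    candidates = []
    for i in range(num_samples):
        noise = [rng.gauss(0.0, sigma) for _ in prev_optimal]
        if i < num_warm:
            # Warm-started sample: perturb the previous optimal control.
            candidates.append([u + n for u, n in zip(prev_optimal, noise)])
        else:
            # Zero-mean sample: noise only, centered on "no control".
            candidates.append(noise)
    return candidates
```

For example, `sample_controls(prev, 1000, 50, 0.2, random.Random())` would keep 950 warm-started samples and add 50 samples centered on zero control. How many zero-mean samples is the right number is a tuning question this sketch does not answer.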

Basically, if we get into a bad spot, some zero-mean samples can help softly reset the optimization problem so we continue making progress. We already have a heavier-handed version of this, where we reset the optimization problem when all of the samples are in collision, but that only triggers once every sample is in collision. Zero-mean samples could help in the bad-but-not-terrible case where many samples are in collision but not strictly all.
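To make the difference between the two recoveries concrete, here is a hedged Python sketch. It assumes the standard MPPI softmax weighting over trajectory costs and assumes the hard reset sets the control sequence to zero; `update_controls` and `temperature` are hypothetical names, and the real optimizer may differ in both respects.

```python
import math


def update_controls(candidates, costs, in_collision, temperature):
    """Blend candidate control sequences into one, MPPI-style.

    Returns (controls, hard_reset). The hard reset only fires when every
    sample is in collision. Otherwise low-cost samples, which may be the
    zero-mean ones, dominate the softmax-weighted average.
    """
    horizon = len(candidates[0])
    if all(in_collision):
        # Hard recovery (assumed here to zero the control sequence).
        return [0.0] * horizon, True

    # Subtract the minimum cost for numerical stability before exponentiating.
    min_cost = min(costs)
    weights = [math.exp(-(c - min_cost) / temperature) for c in costs]
    total = sum(weights)
    blended = [
        sum(w * cand[t] for w, cand in zip(weights, candidates)) / total
        for t in range(horizon)
    ]
    return blended, False
```

If the warm-started samples carry a large collision cost and a zero-mean sample is cheap, the blended control is pulled towards the zero-mean sample without the hard reset ever triggering, which is the soft recovery described above.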

Additionally, on occasion I see this problem:

[Image: MPPI Replan]

Here we go down the "wrong" part of the solution space because we were heading at full speed towards one goal and it was changed to a goal in another direction. Noise around the previous optimal trajectory won't make the robot slow down and stop, because it can still reach around the pillar in another solution space to try to follow the path.

I think something like this could remove that behavior, or at least limit it to cases where it truly is better, by providing some options near the robot that follow the path more directly. I am concerned about the feasibility of this kind of thing, though: we'd potentially be going from full speed to 0. We have velocity smoothers below the controllers to help smooth out infeasible requests, but full speed to zero (or even reversing) would be an extreme case and may require several smoothed time steps for the robot to comply with MPPI's request. I suppose that happens right now when we do a hard reset, but those are few and far between and analogous to a critical failure mode.
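To put a rough number on "several smoothed time steps", here is a small Python sketch of an acceleration-limited smoother. This is an assumption about how a velocity smoother behaves in general, not the actual smoother code, and `smooth_step`, `steps_to_comply`, and the numeric values are hypothetical.

```python
def smooth_step(current, target, max_accel, dt):
    """Move current towards target by at most max_accel * dt."""
    max_delta = max_accel * dt
    delta = max(-max_delta, min(max_delta, target - current))
    return current + delta


def steps_to_comply(current, target, max_accel, dt, tol=1e-9):
    """Count smoother steps until the commanded velocity reaches target."""
    if max_accel <= 0 or dt <= 0:
        raise ValueError("max_accel and dt must be positive")
    steps = 0
    while abs(target - current) > tol:
        current = smooth_step(current, target, max_accel, dt)
        steps += 1
    return steps


# Hypothetical numbers: 1.0 m/s full speed, 2.0 m/s^2 limit, 0.125 s period.
print(steps_to_comply(1.0, 0.0, 2.0, 0.125))   # full speed to stop: 4
print(steps_to_comply(1.0, -1.0, 2.0, 0.125))  # full speed to full reverse: 8
```

With these made-up limits a full stop takes 4 control periods and a full reversal takes 8, during which the robot is not yet doing what MPPI asked for.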

This is easy to implement, but I don't want to play with it until after #80 is resolved (one thing at a time). I suspect that, with all of our cost functions, we may need to tune this a bit so that the hand-off between them is smooth in the right situations.

Done, and high-level testing shows that it works pretty well without having to trigger the full optimizer recovery: https://github.com/artofnothingness/mppic/tree/zero_mean

I think more characterization on hardware could be beneficial before merging, but this seems like a good idea. It's not strictly required, but it definitely doesn't hurt. It's hard to tell how much "cleaner" this soft recovery is than our hard recovery. I'd like to see this happening in person on the hardware to make sure it is actually a better outcome (though I suspect it is). We should keep the hard recovery in place either way for minor system-level failures that can be recovered from.

Opening a draft PR for visibility: #97


Add in some small number of samples around zero-mean #96