`pick_ik` is 10x slower than `bio_ik`

I finally did some benchmarking thanks to @Robotawi's help, and unfortunately the results are not encouraging.

It turns out that pick_ik's use of MoveIt's `RobotState::updateLinkTransforms()` + `RobotState::getGlobalLinkTransform()` for forward kinematics is 10x slower than the caching mechanism applied in bio_ik, and that accounts for basically all of the slowdown in the benchmark.

https://github.com/ros-planning/moveit2/blob/main/moveit_core/robot_state/src/robot_state.cpp#L377-L379

We should fix this by looking at the tricks employed by bio_ik and improving MoveIt's RobotState to contain these capabilities!

One key thing that needs to happen if we enable caching in `RobotState`: we can't just reuse the same `RobotState` instance for all the FK function calls. Each memetic thread and each local gradient descent call should use its own state so that the caching actually pays off. We'll also need to copy over from bio_ik the notion of a "link schedule", where the perturbations we evaluate first happen at the tip of the kinematic chain(s) to minimize link pose recomputes.
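To make the caching idea concrete, here is a minimal sketch on a planar serial chain. It is not bio_ik's or MoveIt's actual implementation: `Pose2D`, `compose`, and `CachedChainFK` are hypothetical names made up for illustration, and the 2D pose stands in for a full 3D link transform. The point is only that changing joint `i` invalidates links `i` and onward, so perturbing near the tip is cheap.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical 2D pose (position + heading); a stand-in for a 3D link transform.
struct Pose2D {
  double x = 0.0;
  double y = 0.0;
  double theta = 0.0;
};

// Pose of a link given its parent's pose, its revolute joint angle, and its length.
inline Pose2D compose(const Pose2D& parent, double joint_angle, double link_length) {
  Pose2D out;
  out.theta = parent.theta + joint_angle;
  out.x = parent.x + link_length * std::cos(out.theta);
  out.y = parent.y + link_length * std::sin(out.theta);
  return out;
}

// Hypothetical cached FK for a serial chain. Each solver thread would own one instance.
class CachedChainFK {
public:
  explicit CachedChainFK(std::vector<double> link_lengths)
    : lengths_(std::move(link_lengths)),
      angles_(lengths_.size(), 0.0),
      poses_(lengths_.size()),
      first_dirty_(0),
      recompute_count_(0) {
    assert(!lengths_.empty());
  }

  // Changing joint i only invalidates link i and everything after it.
  void setJoint(std::size_t i, double angle) {
    assert(i < angles_.size());
    if (angles_[i] == angle) return;  // nothing changed, cache stays valid
    angles_[i] = angle;
    if (i < first_dirty_) first_dirty_ = i;
  }

  // Recompute only the dirty suffix of the chain, then return the tip pose.
  const Pose2D& tipPose() {
    for (std::size_t i = first_dirty_; i < poses_.size(); ++i) {
      const Pose2D parent = (i == 0) ? Pose2D{} : poses_[i - 1];
      poses_[i] = compose(parent, angles_[i], lengths_[i]);
      ++recompute_count_;
    }
    first_dirty_ = poses_.size();
    return poses_.back();
  }

  // Total number of link poses recomputed so far (for illustration only).
  std::size_t recomputeCount() const { return recompute_count_; }

private:
  std::vector<double> lengths_;
  std::vector<double> angles_;
  std::vector<Pose2D> poses_;
  std::size_t first_dirty_;
  std::size_t recompute_count_;
};
```

On a 4-link chain, perturbing the tip joint and querying the tip pose recomputes one link, while perturbing the base joint recomputes all four. That difference is what a tip-first link schedule would exploit, and it only holds if each thread keeps its own instance so another caller's joint values don't keep dirtying the cache.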
