Open Victorlouisdg opened 4 months ago
Some considerations for each paradigm:
For our demo, we had a different vision with controllers that are more low level, but I see merit in both of our proposals (mine below).
State/Station/WorldModel; for return types: ControllerResult) to increase reusability.
I saw controllers as stateless functions, operating on a State, changing the physical state of the world, and returning a ControllerResult that can impact the application flow (e.g., triggering a perception update). However, this may be too restrictive if we want to do some sensing inside a controller. These functions can be composed quite easily, and higher-level controllers are composed of lower-level controllers, typically with a very easy-to-understand code flow (just some function calls). An example is given at the bottom of this comment.
BoundingBox3DType ... An airo-rerun package might be useful?
I think there are several levels of abstraction and we need to decide which we want to address, and how.
import numpy as np
@controller
def grab_capsule(controller_arguments: ControllerArguments) -> ControllerResult:
    """Grab a capsule from the dispenser.

    Args:
        controller_arguments: Controller arguments.

    Returns:
        A ControllerResult.
    """
    # The dispenser's position is hard coded, so we can use hard coded joint configurations.
    # NOTE: eventually we might want to make values such as this configurable as a data file.
    # Planned to.
    q_pregrasp = np.array([2.39238167, -1.32466565, 0.13710148, 4.24871032, -1.52691394, np.pi])
    # Moved to.
    q_grasp = np.array([2.39213324, -1.0258608, 0.4087413, 3.84247033, -1.54018432, np.pi])
    # Each step returns early on failure so the caller sees the first failing result.
    result = plan_to_joint_configuration(controller_arguments, q_pregrasp)
    if not result.success:
        return result
    result = move_freely_to_joint_configuration(controller_arguments, q_grasp)
    if not result.success:
        return result
    gripper_specs = Robotiq2F85.ROBOTIQ_2F85_DEFAULT_SPECS
    result = move_gripper(controller_arguments, 0.005, gripper_specs.min_speed, gripper_specs.min_force)
    if not result.success:
        return result
    return back_up_from_capsule_dispenser(controller_arguments)
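The repeated success checks in the example above could be factored into a small helper. The following is a minimal sketch and not code from the barista repository: the ControllerResult fields, the dict used as the arguments object, run_sequence, and the three stand-in controllers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ControllerResult:
    """Hypothetical result type; only the success flag appears in the example above."""

    success: bool
    message: str = ""


# Assumption: a controller is a stateless callable that takes an arguments object.
Controller = Callable[[dict], ControllerResult]


def run_sequence(args: dict, steps: Sequence[Controller]) -> ControllerResult:
    """Run controllers in order and return the first failing result, if any."""
    result = ControllerResult(success=True)
    for step in steps:
        result = step(args)
        if not result.success:
            return result
    return result


# Stand-in low-level controllers that only record that they were called.
def open_gripper(args: dict) -> ControllerResult:
    args["log"].append("open_gripper")
    return ControllerResult(success=True)


def failing_move(args: dict) -> ControllerResult:
    args["log"].append("failing_move")
    return ControllerResult(success=False, message="no plan found")


def close_gripper(args: dict) -> ControllerResult:
    args["log"].append("close_gripper")
    return ControllerResult(success=True)


def demo() -> tuple[ControllerResult, list[str]]:
    """Compose three controllers; the second one fails, so the third never runs."""
    args = {"log": []}
    result = run_sequence(args, [open_gripper, failing_move, close_gripper])
    return result, args["log"]
```

A higher-level controller would then be a single run_sequence call over its lower-level controllers, at the cost of hiding the explicit control flow that the plain function calls make visible.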
I feel like your proposal is close to what I had in mind for an 🕹️ AgentController. If I understand correctly, that would look something like this:
station = Station()  # contains the hardware
observer = Observer(station)  # also contains YOLO or SAM?
result = None
while result != done:
    observation = observer.get_observation()  # joint configs, images, point cloud, YOLO detections?
    result = grab_capsule(station.arm, observation)
result = None  # reset so the next controller's loop runs
while result != done:
    observation = observer.get_observation()
    result = move_to_coffee_maker(station.arm, observation)
...
The Station and Observer together function like an RL Environment, updating the state and providing observations. The grab_capsule function is basically an Agent, except that it is responsible for both decision-making and action execution (as opposed to an RL agent, which returns predefined actions).
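The Environment/Agent analogy above can be made concrete with stand-in classes. This is a minimal sketch under loud assumptions: the Station here holds a single joint value, the Observer only snapshots it, and a Status enum replaces the undefined done value. None of these are the actual airo-mono or barista classes.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Status(Enum):
    RUNNING = auto()
    DONE = auto()


@dataclass
class Station:
    """Hypothetical stand-in for the hardware: a single joint position."""

    arm_position: float = 0.0


@dataclass(frozen=True)
class Observation:
    """Immutable snapshot handed to the controller."""

    arm_position: float


class Observer:
    """Hypothetical observer that builds observations from the station."""

    def __init__(self, station: Station) -> None:
        self._station = station

    def get_observation(self) -> Observation:
        return Observation(arm_position=self._station.arm_position)


def move_to_target(
    station: Station, observation: Observation, target: float = 1.0, step: float = 0.25
) -> Status:
    """Agent-like controller: decides on an action and also executes it."""
    error = target - observation.arm_position
    if abs(error) < 1e-9:
        return Status.DONE
    # Move at most one step towards the target.
    station.arm_position += max(-step, min(step, error))
    return Status.RUNNING


def run(max_iterations: int = 100) -> tuple[Status, int]:
    """Observe, then let the controller decide and act, until it reports DONE."""
    station = Station()
    observer = Observer(station)
    status = Status.RUNNING
    iterations = 0
    while status is not Status.DONE and iterations < max_iterations:
        observation = observer.get_observation()
        status = move_to_target(station, observation)
        iterations += 1
    return status, iterations
```

The iteration cap is an addition for safety in the sketch; the pseudocode above loops until the controller reports that it is done.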
Yes, this is very similar to what I had in mind.
The SensePlanActController could also work very well in the coffee demo, but:
self._plan = None. IMO, these should be mutable values with immutable contents (e.g., frozen dataclasses).
Here, I'm copying an issue from the barista repository to have all thoughts in one place:
For example, the LeverOpenerController is a monster of a class that violates, among others, the single responsibility principle. It maintains its own planner, performs perception, computes bounding boxes, and moves using motion planning, servoing, and plain MoveJ commands.
I think the following code architecture would be more maintainable, but it could also be too restrictive.
For the centrifuge demo we used a SensePlanActController (SenseThinkActController), which worked well for us.
In case a controller did not need to sense, think, or act, we simply left the specific method empty.
There were some cases where, due to working against a deadline, we started to interleave sensing, thinking, and acting, but this can easily be avoided by being more strict (and code review).
I enjoyed the separation of sense, think, and act, because especially during development, we could check trajectories and poses in simulation (especially with airo-drake) before running.
However, I'm not sure if we should really supply such interfaces. I think it would be better to document different controller styles somewhere in a "recommended practices" document.
Describe the feature you'd like
Airo-mono has enabled code sharing for many low and mid-level tasks (e.g. hardware access and basic operations). Should airo-mono also offer standard high-level abstractions for common robot program paradigms:
SensePlanActController: fits Victor’s use cases very well.
BehaviorTreeController: maybe relevant for mobile robots.
AgentController: RL-like controller with env/obs/reward -> action.
Design goals 📝
StereoRGBDCamera and a UR5e and a crumpled cloth on the table). This fosters reusability across different setups.
Possible implementation
Abstract base classes that define the methods that their children must implement. For example:
🧠 SensePlanActController:
sense(): collect an observation and save it in self._observation
plan(observation): attempt to find a feasible plan and save it in self._plan (uses self._observation if no observation was given)
act(plan): execute a plan
can_act()
loop(): start the sense() -> plan() -> act() loop, which runs until a feasible plan is found or the controller wants to terminate.
visualize_observation()
visualize_plan()
execute(autonomous: bool = False)
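The method list above could be sketched as an abstract base class. This is one possible reading, not an existing airo-mono interface: the max_attempts parameter, the confirm() hook, the way loop() and execute() divide the work, and the ReachController example are all assumptions, and the fallback of plan() to self._observation is left out for brevity. The observation and plan are frozen dataclasses, so the controller only swaps references to immutable contents.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

ObservationT = TypeVar("ObservationT")
PlanT = TypeVar("PlanT")


class SensePlanActController(ABC, Generic[ObservationT, PlanT]):
    def __init__(self) -> None:
        self._observation: Optional[ObservationT] = None
        self._plan: Optional[PlanT] = None

    @abstractmethod
    def sense(self) -> ObservationT:
        """Collect and return an observation."""

    @abstractmethod
    def plan(self, observation: ObservationT) -> Optional[PlanT]:
        """Return a feasible plan, or None if none was found."""

    @abstractmethod
    def act(self, plan: PlanT) -> None:
        """Execute a plan."""

    def can_act(self) -> bool:
        return self._plan is not None

    def visualize_observation(self) -> None:
        """Optional hook; does nothing by default."""

    def visualize_plan(self) -> None:
        """Optional hook; does nothing by default."""

    def confirm(self) -> bool:
        """Assumed hook: ask the operator before acting in non-autonomous mode."""
        return input("Execute plan? [y/N] ").strip().lower() == "y"

    def loop(self, autonomous: bool = False, max_attempts: int = 10) -> bool:
        """Repeat sense -> plan until a plan is feasible, then act once."""
        for _ in range(max_attempts):
            self._observation = self.sense()
            self.visualize_observation()
            self._plan = self.plan(self._observation)
            self.visualize_plan()
            if not self.can_act():
                continue
            if autonomous or self.confirm():
                self.act(self._plan)
                return True
            return False  # the operator declined the plan
        return False

    def execute(self, autonomous: bool = False) -> bool:
        return self.loop(autonomous=autonomous)


@dataclass(frozen=True)
class Observation:
    distance: float  # hypothetical sensor reading


@dataclass(frozen=True)
class Plan:
    waypoints: tuple[float, ...]


class ReachController(SensePlanActController[Observation, Plan]):
    """Toy controller fed with canned readings instead of real sensors."""

    def __init__(self, readings: list[float]) -> None:
        super().__init__()
        self._readings = iter(readings)
        self.executed: list[Plan] = []  # records executed plans for inspection

    def sense(self) -> Observation:
        return Observation(distance=next(self._readings))

    def plan(self, observation: Observation) -> Optional[Plan]:
        if observation.distance > 1.0:
            return None  # target out of reach, so no feasible plan
        return Plan(waypoints=(0.0, observation.distance))

    def act(self, plan: Plan) -> None:
        self.executed.append(plan)
```

Calling ReachController([2.0, 1.5, 0.5]).execute(autonomous=True) senses three times before a feasible plan exists and then acts once. With autonomous=False, the visualization hooks run before confirm() is asked, which matches the idea of checking trajectories and poses before running them.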
This kind of structure maps very naturally to all the controllers I've written for the cloth-competition.