airo-ugent / airo-mono

Python packages for robotic manipulation @ IDLab AI & Robotics Lab - UGent - imec
https://airo.ugent.be
MIT License

Robot Program / Controller Abstractions: Should airo-mono offer high-level abstractions for robot programs? #140

Open Victorlouisdg opened 4 months ago

Victorlouisdg commented 4 months ago

Describe the feature you'd like

Airo-mono has enabled code sharing for many low- and mid-level tasks (e.g. hardware access and basic operations). Should airo-mono also offer standard high-level abstractions for common robot program paradigms?

Design goals 📝

Possible implementation

Abstract base classes that define the methods their children must implement. For example:

🧠 SensePlanActController:

This kind of structure maps very naturally to all the controllers I've written for the cloth-competition.
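As a rough sketch of what such an abstract base class could look like, here is a minimal SensePlanActController. The method names and the `execute()` driver are assumptions for illustration, not existing airo-mono API:

```python
from abc import ABC, abstractmethod
from typing import Any


class SensePlanActController(ABC):
    """Sketch of a possible abstract base class for sense-plan-act controllers.

    Subclasses implement one explicit phase per method; execute() runs the
    phases in order. All names here are illustrative.
    """

    @abstractmethod
    def sense(self) -> Any:
        """Gather observations (e.g. images, joint configurations)."""

    @abstractmethod
    def plan(self, observation: Any) -> Any:
        """Compute a plan (e.g. a trajectory) from the observation."""

    @abstractmethod
    def act(self, plan: Any) -> None:
        """Execute the plan on the hardware."""

    def execute(self) -> None:
        # The base class enforces the phase ordering; subclasses only
        # fill in the phases themselves.
        observation = self.sense()
        plan = self.plan(observation)
        self.act(plan)
```

A concrete controller would then subclass this and implement the three phases, which also makes it easy to check the output of `plan()` in simulation before `act()` touches the hardware.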

Victorlouisdg commented 4 months ago

Some considerations for each paradigm:

m-decoster commented 4 months ago

For our demo, we had a different vision, with controllers that are more low-level, but I see merit in both of our proposals (mine is below).

Main issues with our old architecture

Proposed solutions

I saw controllers as stateless functions that operate on a State, change the physical state of the world, and return a ControllerResult that can affect the application flow (e.g., by triggering a perception update). However, this may be too restrictive if we want to do some sensing inside a controller. These functions compose easily: higher-level controllers are built from lower-level controllers, typically with a very easy-to-understand code flow (just some function calls). An example is given at the bottom of this comment.
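The building blocks this pattern relies on could look roughly like the following. The `ControllerResult` fields and the `@controller` decorator body are assumptions sketched for illustration (here the decorator only tags the function; a real one could add logging or dry-run support):

```python
import functools
from dataclasses import dataclass
from typing import Callable


@dataclass
class ControllerResult:
    """Outcome of a controller call; can drive the application flow."""
    success: bool
    message: str = ""


def controller(func: Callable[..., ControllerResult]) -> Callable[..., ControllerResult]:
    """Hypothetical decorator marking a function as a controller."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs) -> ControllerResult:
        return func(*args, **kwargs)
    wrapper.is_controller = True  # illustrative tag for tooling/introspection
    return wrapper


@controller
def open_gripper(width: float) -> ControllerResult:
    # A leaf controller: would change the physical world, returns a result.
    return ControllerResult(success=width >= 0.0)


@controller
def pick(width: float) -> ControllerResult:
    # A composed controller: plain function calls with early exit on failure.
    result = open_gripper(width)
    if not result.success:
        return result
    return ControllerResult(success=True)
```

The early-return-on-failure pattern is what keeps the composed controllers readable: the happy path reads top to bottom, and any failure bubbles up as a ControllerResult.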

Considerations

Conclusion

I think there are several levels of abstraction and we need to decide which we want to address, and how.

Code example

import numpy as np


@controller
def grab_capsule(controller_arguments: ControllerArguments) -> ControllerResult:
    """Grab a capsule from the dispenser.

    Args:
        controller_arguments: Controller arguments.

    Returns:
        A ControllerResult."""
    # The dispenser's position is hard coded, so we can use hard coded joint configurations.
    # NOTE: eventually we might want to make values such as this configurable as a data file.

    # Configuration to plan to.
    q_pregrasp = np.array([2.39238167, -1.32466565, 0.13710148, 4.24871032, -1.52691394, np.pi])
    # Configuration to move to.
    q_grasp = np.array([2.39213324, -1.0258608, 0.4087413, 3.84247033, -1.54018432, np.pi])

    result = plan_to_joint_configuration(controller_arguments, q_pregrasp)
    if not result.success:
        return result

    result = move_freely_to_joint_configuration(controller_arguments, q_grasp)
    if not result.success:
        return result

    result = move_gripper(
        controller_arguments,
        0.005,
        Robotiq2F85.ROBOTIQ_2F85_DEFAULT_SPECS.min_speed,
        Robotiq2F85.ROBOTIQ_2F85_DEFAULT_SPECS.min_force,
    )
    if not result.success:
        return result

    return back_up_from_capsule_dispenser(controller_arguments)
Victorlouisdg commented 4 months ago

I feel like your proposal is close to what I had in mind for an 🕹️ AgentController. If I understand correctly, that would look something like this:

station = Station()  # contains the hardware
observer = Observer(station)  # also contains YOLO or SAM?

result = None
while result != "done":
    observation = observer.get_observation()  # joint configs, images, point cloud, YOLO detections?
    result = grab_capsule(station.arm, observation)

result = None
while result != "done":
    observation = observer.get_observation()
    result = move_to_coffee_maker(station.arm, observation)

...

The Station and Observer together function like an RL Environment, updating the state and providing observations. The grab_capsule function is basically an Agent, except that it is responsible for both decision-making and action execution (as opposed to an RL agent, which returns actions for the environment to execute).
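To make the loop above concrete, here is a minimal self-contained sketch of the sense-act loop with a dummy Observer and controller. The class and field names are assumptions for illustration only (the real Observer would wrap cameras and detectors, and the controller would command the arm):

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Illustrative observation bundle."""
    joint_configuration: list
    step: int


@dataclass
class Observer:
    """Stands in for the Station + camera + detector wrapper."""
    steps_until_grasped: int = 3  # dummy: pretend grasping takes 3 iterations
    _step: int = field(default=0, init=False)

    def get_observation(self) -> Observation:
        self._step += 1
        return Observation(joint_configuration=[0.0] * 6, step=self._step)


def grab_capsule(observation: Observation, steps_needed: int) -> str:
    # Decides *and* acts on each iteration; reports "done" when finished.
    return "done" if observation.step >= steps_needed else "running"


observer = Observer()
result = None
while result != "done":
    observation = observer.get_observation()
    result = grab_capsule(observation, observer.steps_until_grasped)
```

The loop repeatedly senses and hands the fresh observation to the controller until the controller reports that its subtask is finished, which matches the RL-environment analogy: the Observer provides observations, and the controller closes the loop by acting on the world.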

m-decoster commented 4 months ago

Yes, this is very similar to what I had in mind.

The SensePlanActController could also work very well in the coffee demo, but:


Here, I'm copying an issue from the barista repository to have all thoughts in one place:

Problems with current code

For example, the LeverOpenerController is a monster of a class that violates, among others, the single responsibility principle. It maintains its own planner, performs perception and computes bounding boxes, moves based on motion planning, with servoing, and also with plain MoveJ commands.

Proposed code architecture

I think the following code architecture would be more maintainable, but it could also be too restrictive.

General idea

m-decoster commented 2 weeks ago

For the centrifuge demo we used a SensePlanActController (SenseThinkActController), which worked well for us.

When a controller did not need to sense, think, or act, we simply left the corresponding method empty.

There were some cases where, working against a deadline, we started to interleave sensing, thinking, and acting, but this can easily be avoided by being stricter (and through code review).

I enjoyed the separation of sense, think, and act because, especially during development, we could check trajectories and poses in simulation (especially with airo-drake) before running.

However, I'm not sure if we should really supply such interfaces. I think it would be better to document different controller styles somewhere in a "recommended practices" document.