vikashplus / robohive

A unified framework for robot learning
https://sites.google.com/view/robohive
Apache License 2.0
489 stars, 82 forks

Roadmap? #116

Closed Kaixhin closed 10 months ago

Kaixhin commented 10 months ago

This looks like an amazing framework for robot learning, with careful consideration of many of the things we need to care about. I would like my team to use this, but it seems like there are still parts which are WIP, so it's tricky to commit to without knowing how the development team plans to proceed. Would you be willing to put up a roadmap on the wiki?

For example, right now we use PerAct, which uses voxels (via point clouds) and motion planning, so point cloud and motion planning support would not only allow us to reimplement PerAct in RoboHive, but it would also open up a lot of other robot learning algorithms. In a similar vein, is MuJoCo 3 somewhere on the roadmap (allowing non-convex geometries and deformable objects)?

vikashplus commented 10 months ago

Thanks, @Kaixhin for your interest in RoboHive.

RoboHive is the default repo which has supported most publications from my group. We have been building RoboHive since 2018 and it will keep evolving over time. In some sense it will always remain a WIP, as this is our primary platform for ongoing research. Moving forward, you can expect the platform to mature and diversify further to accommodate evolving robot learning needs.

Kaixhin commented 10 months ago

Thanks a lot for the context. If my team goes forward with RoboHive then I'd definitely love for us to contribute back. I noticed a Slack link on the README where I assume discussion is happening - is that going to be opened up for the public (invite-only perhaps)?

Final question on long-term plans - do you envisage adding LLM planners? Obviously they are of growing interest in the robot learning community, and could be useful for some envs already here, like RelayKitchen. Given your research focus, I assume not, but just wanted to check.

Considering the structure below and the codebase, it seems like "Foundation Models" are treated as observation encoders, that exist as "pre-processors" for tasks, but I see them as more broadly existing on a "post-processor" side, feeding "foundation-model-extracted contextual info" to the agents (use-cases for human agents seem limited but could exist). What are your thoughts?

[Image: the structure diagram referred to above]

vikashplus commented 10 months ago

RoboHive is a collection of environments. The primary focus will remain task/suite collection. These tasks will be compatible with the OpenAI Gym API. Any agent framework that supports this API can easily be used with RoboHive. See here for a collection of Agents. One can follow the examples here to integrate an LLM planner.
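To illustrate the Gym-compatibility point, here is a minimal sketch of an agent loop that depends only on the `reset`/`step` interface. `StandInEnv`, `rollout`, and `proportional_policy` are hypothetical names made up for this example. The toy environment stands in for a RoboHive task so the snippet runs without RoboHive installed. It assumes the classic Gym convention (`reset()` returns an observation, `step()` returns a 4-tuple); newer Gym and Gymnasium releases use a different return signature.

```python
import random


class StandInEnv:
    """Hypothetical toy environment exposing the classic Gym reset/step interface.

    It stands in for a RoboHive task so this sketch runs on its own.
    """

    def __init__(self, horizon=5, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.state = 0.0

    def reset(self):
        # Start each episode from a random scalar state.
        self.t = 0
        self.state = self.rng.uniform(-1.0, 1.0)
        return [self.state]

    def step(self, action):
        # Classic Gym convention (assumed here): (obs, reward, done, info).
        self.t += 1
        self.state += action
        reward = -abs(self.state)
        done = self.t >= self.horizon
        return [self.state], reward, done, {}


def rollout(env, policy):
    """Run one episode with any callable policy(obs) -> action."""
    obs = env.reset()
    done = False
    total_reward = 0.0
    steps = 0
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


def proportional_policy(obs):
    # Push the scalar state halfway back toward zero at every step.
    return -0.5 * obs[0]


total, steps = rollout(StandInEnv(horizon=5), proportional_policy)
print(steps, total)
```

Because `rollout` only touches `reset` and `step`, any environment object with the same interface could be passed in place of the stand-in.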

IIUC, you are envisioning foundation model support for Agents. If this is the case, then the dependency should be a part of the Agents repo, not RoboHive. RoboHive exposes low-level access to task details so that the agent can reason in unison with foundation models.
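As a rough sketch of that separation, the foundation model can live entirely on the agent side and consume whatever task details the environment exposes. Everything below is hypothetical and made up for illustration: `ContextAugmentedAgent`, `stub_context_model`, `stub_policy`, and the `distance` field are not part of RoboHive or AgentHive, and the stub stands in for a real foundation model or LLM planner call.

```python
class ContextAugmentedAgent:
    """Hypothetical agent that queries a context model before acting.

    The environment knows nothing about the context model; the dependency
    sits with the agent.
    """

    def __init__(self, policy, context_model):
        self.policy = policy
        self.context_model = context_model

    def act(self, obs, task_details):
        # Ask the (stubbed) foundation model for contextual info, then act on it.
        context = self.context_model(obs, task_details)
        return self.policy(obs, context)


def stub_context_model(obs, task_details):
    # Stand-in for a foundation model: choose a subgoal from the task details.
    far = task_details.get("distance", 0.0) > 0.1
    return {"subgoal": "reach" if far else "grasp"}


def stub_policy(obs, context):
    # Toy policy: move while reaching, stay still while grasping.
    return 1.0 if context["subgoal"] == "reach" else 0.0


agent = ContextAugmentedAgent(stub_policy, stub_context_model)
action = agent.act(obs=[0.0], task_details={"distance": 0.5})
print(action)
```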

Please let me know if I misunderstood your point.

Kaixhin commented 10 months ago

Thanks for the explanation. I think I was getting confused between "RoboHive" and the "RoboHive ecosystem", which includes AgentHive. In that case, maybe it would be helpful to add a pointer in the README of this repo to say that if people are interested in training agents on RoboHive envs then anything Gym-compatible is fine, but you have also built AgentHive to facilitate this?

vikashplus commented 10 months ago

That's a great suggestion. We will add it right away.