quinnliu closed this issue 3 years ago
Sorry, I'm not sure exactly what is being asked -- are you asking if you can receive image observations from the environment for a policy to generate actions from? If so, we already have that functionality enabled -- please see this guide for more info (slightly outdated in terms of naming specifics, but the high-level infra is consistent)
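For anyone reading along, here is a minimal sketch of that built-in route (the kwargs and observation key follow a recent robosuite release and may be named slightly differently in older versions):

import robosuite as suite

# build an environment that returns camera observations every step
env = suite.make(
    "Lift",
    robots="Panda",
    has_renderer=False,
    has_offscreen_renderer=True,  # required for camera observations
    use_camera_obs=True,
    camera_names="agentview",
    camera_heights=1080,
    camera_widths=1080,
)

obs = env.reset()
rgb = obs["agentview_image"]  # (1080, 1080, 3) uint8 RGB array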
Sorry for the confusion @cremebrule
BTW thanks so much for the help thus far :)
To be more specific: I'm wondering if I can process the RGB rendered image of the 3D environment at every timestep as part of my input to generate the newest action array variable.
For example:
from robosuite.models import MujocoWorldBase
print("Start Print Log -------------------------------------------------------")
world = MujocoWorldBase()
from robosuite.models.robots import Panda
mujoco_robot = Panda()
from robosuite.models.grippers import gripper_factory
gripper = gripper_factory('PandaGripper')
mujoco_robot.add_gripper(gripper)
mujoco_robot.set_base_xpos([0, 0, 0])
world.merge(mujoco_robot)
from robosuite.models.arenas import TableArena
mujoco_arena = TableArena()
mujoco_arena.set_origin([0.8, 0, 0])
world.merge(mujoco_arena)
from robosuite.models.objects import BallObject, BoxObject
import random
for i in range(12):
    print("i = ", i)
    w = 0.01
    h = 0.01
    l = 0.04
    if i % 2 == 0:
        h = 0.04
        l = 0.01
    current_brick = BoxObject(
        name="brick" + str(i),
        size=[w, h, l],
        rgba=[0, 0.5, 0.5, 1]
    ).get_obj()
    # generate scattered random position for bricks
    # '0.8 0 1' is the center @ the table when robot arm base
    # is at 12 o'clock with +x down && +y right
    x = random.uniform(0.75, 0.85)
    y = random.uniform(-0.05, 0.05)
    z = " 1"
    position = str(x) + " " + str(y) + z
    print("position = ", position)
    current_brick.set('pos', position)
    world.worldbody.append(current_brick)
model = world.get_model(mode="mujoco_py")
from mujoco_py import MjSim, MjViewer
sim = MjSim(model)
viewer = MjViewer(sim)
viewer.vopt.geomgroup[0] = 0 # disable visualization of collision mesh
for i in range(10000):
    # get camera_1 output in current time step to process to generate current correct action
    rgb_pov_1 = viewer.get_camera_1()  # 1080x1080 pixel RGB  <======== Anything like this exist?
    # get camera_2 output in current time step
    rgb_pov_2 = viewer.get_camera_2()  # 1080x1080 pixel RGB  <======== Anything like this exist?
    action = visual_cognitive_computer.see_and_generate_action(rgb_pov_1, rgb_pov_2)
    # action = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    # print("action = ", action)
    # example real action = [ 0.39813241  0.90703087  0.97952504 -0.07412093  1.05065751  0.96402113
    #                        -1.4640924   0.57175383]
    sim.data.ctrl[:] = action
    sim.step()
    viewer.render()
NVM: I figured it out -- you can pass arguments to sim.render(...)
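For anyone who lands here later, a minimal sketch of that sim.render(...) approach in mujoco_py (the camera name "frontview" is an assumption -- use whatever <camera> is defined in your model XML; visual_cognitive_computer is the hypothetical policy from the snippet above):

for i in range(10000):
    # offscreen-render a named camera; returns an (H, W, 3) uint8 RGB array
    rgb_pov_1 = sim.render(width=1080, height=1080, camera_name="frontview")
    rgb_pov_1 = rgb_pov_1[::-1]  # mujoco_py images come back vertically flipped
    action = visual_cognitive_computer.see_and_generate_action(rgb_pov_1)
    sim.data.ctrl[:] = action
    sim.step()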
To generate the array of 8 numbers stored in the action variable for the Panda 8-DOF robotic arm to move towards a high-level goal, I'm curious how one can generate a visual camera input inside the environment simulation? In the real world this is required for the Panda arm to move correctly, but I'm confused as to how to do this in the simulation. Thanks!
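In case it helps others: one way to put a camera inside this kind of hand-built simulation is to append a <camera> element to the MJCF worldbody before building the model, then render from it by name. A sketch under that assumption (the name "povcam" and the pose are made up for illustration):

import xml.etree.ElementTree as ET

# a camera 1.5 m above the table center; with the default orientation a
# MuJoCo camera looks along its -z axis, i.e. straight down here
camera = ET.Element("camera", attrib={"name": "povcam", "pos": "0.8 0 1.5"})
world.worldbody.append(camera)

model = world.get_model(mode="mujoco_py")
sim = MjSim(model)
rgb = sim.render(width=1080, height=1080, camera_name="povcam")  # (1080, 1080, 3)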