quinnliu closed this issue 3 years ago
Sorry, I'm not sure exactly what is being asked -- are you asking if you can receive image observations from the environment for a policy to generate actions from? If so, we already have that functionality enabled -- please see this guide for more info (slightly outdated in terms of naming specifics, but the high-level infra is consistent)
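For anyone reading along, here is a minimal sketch of that built-in route (the kwargs and observation key follow a recent robosuite release and may be named slightly differently in older versions):

import robosuite as suite

# build an environment that returns camera observations every step
env = suite.make(
    "Lift",
    robots="Panda",
    has_renderer=False,
    has_offscreen_renderer=True,  # required for camera observations
    use_camera_obs=True,
    camera_names="agentview",
    camera_heights=1080,
    camera_widths=1080,
)

obs = env.reset()
rgb = obs["agentview_image"]  # (1080, 1080, 3) uint8 RGB array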
Sorry for the confusion @cremebrule
BTW thanks so much for the help thus far :)
To be more specific: I'm wondering if I can process the RGB rendered image of the 3D environment at every timestep as part of my input to generate the newest action array variable.
For example:
from robosuite.models import MujocoWorldBase
print("Start Print Log -------------------------------------------------------")
world = MujocoWorldBase()
from robosuite.models.robots import Panda
mujoco_robot = Panda()
from robosuite.models.grippers import gripper_factory
gripper = gripper_factory('PandaGripper')
mujoco_robot.add_gripper(gripper)
mujoco_robot.set_base_xpos([0, 0, 0])
world.merge(mujoco_robot)
from robosuite.models.arenas import TableArena
mujoco_arena = TableArena()
mujoco_arena.set_origin([0.8, 0, 0])
world.merge(mujoco_arena)
from robosuite.models.objects import BallObject, BoxObject
import random
for i in range(12):
    print("i = ", i)
    w = 0.01
    h = 0.01
    l = 0.04
    if i % 2 == 0:
        h = 0.04
        l = 0.01
    current_brick = BoxObject(
        name="brick" + str(i),
        size=[w, h, l],
        rgba=[0, 0.5, 0.5, 1]
    ).get_obj()
    # generate scattered random position for bricks
    # '0.8 0 1' is the center @ the table when robot arm base
    # is at 12 o'clock with +x down && +y right
    x = random.uniform(0.75, 0.85)
    y = random.uniform(-0.05, 0.05)
    z = " 1"
    position = str(x) + " " + str(y) + z
    print("position = ", position)
    current_brick.set('pos', position)
    world.worldbody.append(current_brick)
model = world.get_model(mode="mujoco_py")
from mujoco_py import MjSim, MjViewer
sim = MjSim(model)
viewer = MjViewer(sim)
viewer.vopt.geomgroup[0] = 0 # disable visualization of collision mesh
for i in range(10000):
    # get camera_1 output in current time step to process to generate current correct action
    rgb_pov_1 = viewer.get_camera_1()  # 1080x1080 pixel RGB  <======== Anything like this exist?
    # get camera_2 output in current time step
    rgb_pov_2 = viewer.get_camera_2()  # 1080x1080 pixel RGB  <======== Anything like this exist?
    action = visual_cognitive_computer.see_and_generate_action(rgb_pov_1, rgb_pov_2)
    # action = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    # print("action = ", action)
    # example real action = [ 0.39813241  0.90703087  0.97952504 -0.07412093  1.05065751  0.96402113
    #                        -1.4640924   0.57175383]
    sim.data.ctrl[:] = action
    sim.step()
    viewer.render()
NVM: I figured it out -- you can pass arguments to sim.render(...)
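For anyone who lands here later, a minimal sketch of that sim.render(...) approach in mujoco_py (the camera name "frontview" is an assumption -- use whatever <camera> is defined in your model XML; visual_cognitive_computer is the hypothetical policy from the snippet above):

for i in range(10000):
    # offscreen-render a named camera; returns an (H, W, 3) uint8 RGB array
    rgb_pov_1 = sim.render(width=1080, height=1080, camera_name="frontview")
    rgb_pov_1 = rgb_pov_1[::-1]  # mujoco_py images come back vertically flipped
    action = visual_cognitive_computer.see_and_generate_action(rgb_pov_1)
    sim.data.ctrl[:] = action
    sim.step()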
To generate the array of 8 numbers stored in the action variable for the Panda 8-DOF robotic arm to move towards a high-level goal, I'm curious how one can generate a visual camera input inside the environment simulation? In the real world this is required for the Panda arm to move correctly, but I'm confused as to how to do this in the simulation. Thanks!
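In case it helps others: one way to put a camera inside this kind of hand-built simulation is to append a <camera> element to the MJCF worldbody before building the model, then render from it by name. A sketch under that assumption (the name "povcam" and the pose are made up for illustration):

import xml.etree.ElementTree as ET

# a camera 1.5 m above the table center; with the default orientation a
# MuJoCo camera looks along its -z axis, i.e. straight down here
camera = ET.Element("camera", attrib={"name": "povcam", "pos": "0.8 0 1.5"})
world.worldbody.append(camera)

model = world.get_model(mode="mujoco_py")
sim = MjSim(model)
rgb = sim.render(width=1080, height=1080, camera_name="povcam")  # (1080, 1080, 3)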