Lifelong-Robot-Learning / LIBERO

Benchmarking Knowledge Transfer in Lifelong Robot Learning

Issues with Playback #16

Open dibyaghosh opened 8 months ago

dibyaghosh commented 8 months ago

I'm having a hard time reproducing trajectories in the dataset by playback (I'm interested in generating / logging some additional details). The replays mostly look correct, but in some cases the replayed trajectory "fails" while the original logged trajectory succeeds.

I've been trying to follow scripts/create_dataset.py to replicate exactly how the files should be replayed, but I've found that the replayed states are almost always at least 0.01 away from the logged ones, and that for a nontrivial number of states the divergence is >= 1.

I'm wondering:

1) whether there could be a dependency on the specific MuJoCo version that was used to collect the data,
2) whether an off-by-one shift (or something similar) is needed when replaying, or
3) whether some burn-in null actions need to be taken at the beginning of an episode (to warm up the controller).

    states = f["data/{}/states".format(ep)][()]
    actions = np.array(f["data/{}/actions".format(ep)][()])
    init_idx = 0

    env.reset_from_xml_string(model_xml)
    env.sim.reset()
    env.sim.set_state_from_flattened(states[init_idx])
    env.sim.forward()
    all_obs = []
    all_actions = []
    n_errors = 0
    for j, action in enumerate(actions):
        obs, reward, done, info = env.step(action)
        all_obs.append(obs)
        all_actions.append(action)
        if j < len(actions) - 1:
            # ensure that the actions deterministically lead to the same recorded states
            state_playback = env.sim.get_state().flatten()
            # assert(np.all(np.equal(states[j + 1], state_playback)))
            err = np.linalg.norm(states[j + 1] - state_playback)
            if err > 0.01:
                n_errors += 1
                print("[warning] playback diverged by {:.2f} for ep {} at step {}".format(err, ep, j))
[warning] playback diverged by 0.24 for ep demo_4 at step 0
[warning] playback diverged by 1.29 for ep demo_4 at step 45
[warning] playback diverged by 0.22 for ep demo_4 at step 46
[warning] playback diverged by 1.02 for ep demo_4 at step 47
[warning] playback diverged by 2.04 for ep demo_4 at step 48
Cranial-XIX commented 7 months ago

The physics might be a bit different on different machines. If you want to replay data, you can directly reset to the sim state instead of replaying action sequences.

There is an initial burn-in period of null actions during evaluation; it is just for stabilizing the physics (e.g., sometimes the objects are not perfectly aligned, so a few null actions let everything settle).
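
Roughly, state-based replay looks something like this (a minimal sketch reusing the `f`, `ep`, `env`, and `model_xml` variables from the snippet above; the 7-dim null action assumes the default OSC_POSE controller and is not an exact excerpt from our scripts):

    import numpy as np

    states = f["data/{}/states".format(ep)][()]

    env.reset_from_xml_string(model_xml)
    env.sim.reset()

    # burn-in used at evaluation time: a few null actions so the physics settles
    # (not strictly needed here, since the sim state is overwritten directly below)
    for _ in range(5):
        env.step(np.zeros(7))

    replayed_states = []
    for state in states:
        # jump directly to each recorded sim state instead of stepping actions
        env.sim.set_state_from_flattened(state)
        env.sim.forward()
        replayed_states.append(env.sim.get_state().flatten())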

Friedrich-M commented 2 months ago

I'm wondering how to replay the data given the actions and states from the demo file. I want to align the replayed observations with the pre-generated observations in the demo file.

# ground-truth frames stored in the demo file (flip axis 1, i.e. image height)
for view in views:
    rgb = np.array(demos[demo_k]['obs'][f'{view}_rgb'])
    rgb = rgb[:, ::-1, :, :].copy()
    rgb_demo[view] = rgb

# attempted replay: reset to each recorded state and read off the observation
states = demos[demo_k]['states'][()]
actions = demos[demo_k]['actions'][()]
for index, (state, action) in enumerate(zip(states, actions)):
    obs = env.set_init_state(state)
    for replay_view in camera_names:
        rgb_map = obs[f'{replay_view}_image']
        rgb_replay[replay_view].append(rgb_map)

I tried the code above to match rgb_replay with rgb_demo. Is that correct?

Friedrich-M commented 2 months ago

@Cranial-XIX Could you kindly provide some guidance? Thanks.

Cranial-XIX commented 2 months ago

Please check our https://github.com/Lifelong-Robot-Learning/LIBERO/blob/master/notebooks/quick_walkthrough.ipynb notebook. Basically, you call set_init_state once with the starting state, then simulate the actions, and record the observations along the way.
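
A minimal sketch of that loop, reusing the variable names from your snippet (illustrative rather than the exact notebook code):

    states = demos[demo_k]['states'][()]
    actions = demos[demo_k]['actions'][()]

    obs = env.set_init_state(states[0])      # reset to the demo's starting state only once
    rgb_replay = {view: [] for view in camera_names}

    for action in actions:
        obs, reward, done, info = env.step(action)          # simulate the recorded action
        for view in camera_names:
            rgb_replay[view].append(obs[f'{view}_image'])    # record the regenerated frame

As noted earlier in this thread, the physics can differ slightly across machines, so the replayed frames may drift a little from the stored ones.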

Friedrich-M commented 2 months ago

> Please check our https://github.com/Lifelong-Robot-Learning/LIBERO/blob/master/notebooks/quick_walkthrough.ipynb notebook. Basically, you call set_init_state once with the starting state, then simulate the actions, and record the observations along the way.

Thanks for your help!

Friedrich-M commented 2 months ago

@Cranial-XIX Sorry to bother you again.


I find that the notebook you linked reads RGB observations directly from the pre-generated example demo files. However, I want to replay the simulation itself in order to generate additional observation data (e.g., depth maps) from the LIBERO bddl files, and to align the regenerated RGB data with the original demo observations.
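
Concretely, something like the sketch below is what I have in mind (untested; it assumes OffScreenRenderEnv forwards robosuite's camera_depths option, that depth shows up under f'{view}_depth' in the observation dict, and that bddl_file is a placeholder for the task's .bddl path):

    from libero.libero.envs import OffScreenRenderEnv

    env = OffScreenRenderEnv(
        bddl_file_name=bddl_file,   # placeholder: path to the task's .bddl file
        camera_heights=128,
        camera_widths=128,
        camera_depths=True,         # assumes this robosuite option is passed through
    )
    env.set_init_state(states[0])

    rgb_replay = {view: [] for view in camera_names}
    depth_replay = {view: [] for view in camera_names}
    for action in actions:
        obs, reward, done, info = env.step(action)
        for view in camera_names:
            rgb_replay[view].append(obs[f'{view}_image'])
            depth_replay[view].append(obs[f'{view}_depth'])   # assumed key name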

So I wonder if there is a script to do that. If not, could you provide some specific guidance on replaying? Thanks a lot.

TousakaNagio commented 1 month ago

Hi @Friedrich-M,

I am wondering whether you have solved your problem. I am facing the same issue.