cheryyunl / Make-An-Agent


Issue during evaluation #2

Closed. jayaramreddy10 closed this issue 1 month ago.

jayaramreddy10 commented 1 month ago

Hi,

Congrats on the great work, and thanks for releasing the codebase. I was able to train all three phases (encoder, behaviour embedding, and diffusion model), but I am not able to evaluate those checkpoints. I want to evaluate them on an unseen task (coffee-button).

Command run: `python eval.py` in the `PolicyGenerator` folder. Error:

```
env = metaworld.envs.ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE[env_name]
KeyError: '-v2-goal-observable'
```

Eval configs in `config.yaml` of `PolicyGenerator`:

```yaml
eval:
  env_name: ''
  ckpt_dir: '/home/jayaram/Make-An-Agent/ckpts/model-best.torch'
  encoder_dir: '/home/jayaram/Make-An-Agent/ckpts/autoencoder.ckpt'
  data_dir: '/home/jayaram/Make-An-Agent/Make-An-Agent_dataset/test_data/unseen/processed/coffee-button.pt'
```

Am I providing the inputs correctly? If yes, how do I resolve this issue?

Regards,
Jayaram

cheryyunl commented 1 month ago

Hi, this seems to be a problem caused by inconsistent Metaworld environment names. Which Metaworld version are you using? You could follow the instructions at https://github.com/XuGW-Kevin/DrM to install the older version of Metaworld. I will test this bug today. Thanks!
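For reference, a minimal sketch of what the lookup presumably does (older Metaworld API assumed; the suffix concatenation is inferred from the KeyError above, and `'coffee-button'` is just an example name):

```python
# Sketch (assumed older Metaworld API): eval.py appears to build the dict key
# as env_name + '-v2-goal-observable', so an empty env_name in config.yaml
# would produce KeyError: '-v2-goal-observable'.
from metaworld.envs import ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE

env_name = 'coffee-button'  # must be set in config.yaml, not left as ''
env_cls = ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE[env_name + '-v2-goal-observable']
env = env_cls(seed=0)  # goal-observable env classes accept a seed kwarg
```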

jayaramreddy10 commented 1 month ago

I am able to run the eval script with the above version (for the door-close env) after providing the env name in the config file (sketched below), but there seems to be a small issue.
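For reference, the config change was just filling in the env name; a minimal sketch, assuming `'door-close'` is the matching Metaworld key:

```yaml
eval:
  env_name: 'door-close'
```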

File "/home/jayaram/miniconda3/envs/makeagent/lib/python3.8/site-packages/hydra/core/utils.py", line 160, in run_job ret.return_value = task_function(task_cfg) File "eval.py", line 151, in main workspace.rollout(data, env=env_name) File "eval.py", line 96, in rollout gen_avg_reward_list.sort(reverse=True) TypeError: sort() got an unexpected keyword argument 'reverse'

The code ran fine after changing the sort as below:

```python
gen_avg_reward_list = np.sort(gen_avg_reward_list)[::-1]
gen_avg_success_list = np.sort(gen_avg_success_list)[::-1]
gen_avg_success_time_list = np.sort(gen_avg_success_time_list)[::-1]
```
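In case it helps: the TypeError presumably occurs because these are NumPy arrays rather than Python lists, and `ndarray.sort()` has no `reverse` keyword, unlike `list.sort()`. An equivalent alternative under that assumption (the array values here are dummies):

```python
import numpy as np

gen_avg_reward_list = np.array([30.4, 653.0, 4522.4])  # dummy example values
# sorted() accepts any iterable (including NumPy arrays) and supports
# reverse=True, returning a descending Python list.
gen_avg_reward_list = sorted(gen_avg_reward_list, reverse=True)
```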

Below are the results I see for the door-close task:

```
Avg. Reward: 30.42, Avg. Success: 0.0, Avg Length: 500.0
Before generation, Max Reward: 36.89, Min Episode Length: 500.0
After Generated, Avg. Reward: 653.04, Avg. Success: 0.08, Avg Length: 468.75
After generation, Max Reward: 4522.41, Min Episode Length: 52.0
Generated Top 5 Reward: 4370.91, Top 10 Reward: 3758.18
Generated Top 5 Success Rate: 1.0, Top 10 Success Rate: 0.75
Generated Top 5 Success Time: 500.0, Top 10 Success Time: 500.0
```

I would also like to visualize the top-5/top-10 trajectories. Could you please help me with that?

cheryyunl commented 1 month ago

Thank you for helping us fix the bug! I will fix it now. The results look similar to my records: Generated Top 5 Success Rate: 1.0, Top 10 Success Rate: 0.75. If you want to visualize the trajectories, save the Top 5 parameters and deploy these policy parameters independently in Metaworld; then save state[:3] at each step, which is the 3D position of the agent. Finally, use the saved states to draw the trajectories.
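A minimal sketch of that recipe, purely illustrative: the random `policy` is a placeholder for a policy rebuilt from one saved Top-5 parameter set (not part of the repo), and the old-gym 4-tuple step API is assumed to match the older Metaworld version:

```python
import numpy as np
import matplotlib.pyplot as plt
from metaworld.envs import ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE

# Build the env as in eval.py (older Metaworld API assumed).
env = ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE['door-close-v2-goal-observable'](seed=0)

def policy(obs):
    # Placeholder: replace with the policy deployed from a saved Top-5
    # parameter set; random actions only keep the sketch self-contained.
    return env.action_space.sample()

# Roll out and record state[:3], the agent's 3D position, at each step.
positions = []
obs = env.reset()
for _ in range(500):  # Metaworld episodes cap at 500 steps
    obs, reward, done, info = env.step(policy(obs))
    positions.append(obs[:3])
    if done:
        break

# Draw the 3D trajectory from the recorded positions.
positions = np.array(positions)
ax = plt.figure().add_subplot(projection='3d')
ax.plot(positions[:, 0], positions[:, 1], positions[:, 2])
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
plt.show()
```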

jayaramreddy10 commented 1 month ago

Thanks, I am able to render the top-k checkpoints now.

cheryyunl commented 1 month ago

Solved the deployment problem.