MarkFzp / act-plus-plus

Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
https://mobile-aloha.github.io/
MIT License
2.86k stars · 525 forks

Diffusion Policy #53

Open zxfever opened 3 weeks ago

zxfever commented 3 weeks ago

I trained the model with the --policy_class Diffusion parameter, but this error occurred:

Traceback (most recent call last):
  File "imitate_episodes.py", line 682, in <module>
    main(vars(parser.parse_args()))
  File "imitate_episodes.py", line 187, in main
    best_ckpt_info = train_bc(train_dataloader, val_dataloader, config)
  File "imitate_episodes.py", line 562, in train_bc
    policy = make_policy(policy_class, policy_config)
  File "imitate_episodes.py", line 203, in make_policy
    policy = DiffusionPolicy(policy_config)
  File "/home/lenovo/code_ws_python/act-plus-plus/policy.py", line 72, in __init__
    ema = EMAModel(model=nets, power=self.ema_power)
TypeError: __init__() missing 1 required positional argument: 'parameters'

Can you provide some help?

barsm42 commented 2 weeks ago

Hello. When I changed the line as below,

ema = EMAModel(parameters=nets, power=self.ema_power)

it started to train. However, during inference I get another error; I will ask a question about it later.
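For context on why the keyword change helps: recent releases of the diffusers library renamed the first argument of EMAModel from model to parameters, and it typically expects an iterable of weight tensors (e.g. nets.parameters()), which is why the original model=nets call raises the missing-argument TypeError. What EMAModel maintains is just an exponential moving average of the weights. The sketch below is a minimal, self-contained illustration of that update rule; the warmup decay schedule ((1 + step) / (10 + step)) ** power is an assumption modeled on common EMA implementations, not a verbatim copy of any particular diffusers version.

```python
# Minimal sketch of the exponential-moving-average (EMA) update that an
# EMA wrapper such as diffusers' EMAModel applies to a copy of the weights.
# The decay schedule here is an illustrative assumption.

def ema_decay(step: int, power: float = 0.75) -> float:
    """Warmup decay: small early on (trust fresh weights), approaching 1 later."""
    return ((1 + step) / (10 + step)) ** power

def ema_update(ema_params, params, step, power=0.75):
    """Blend the current parameters into the EMA copy."""
    d = ema_decay(step, power)
    return [d * e + (1.0 - d) * p for e, p in zip(ema_params, params)]

# Toy run: the EMA of a constant parameter converges toward that constant.
ema = [0.0]
for step in range(1000):
    ema = ema_update(ema, [1.0], step)
print(ema[0])  # approaches 1.0 as the EMA tracks the constant parameter
```

The practical takeaway for this issue is only the call-site change (parameters=... instead of model=...); the schedule above just shows why early training steps move the EMA quickly while later steps barely change it.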

QueirosJustin commented 1 week ago

@barsm42 have you found a solution to this? I am searching for one myself, I encountered the same problem

barsm42 commented 6 days ago

> @barsm42 have you found a solution to this? I am searching for one myself, I encountered the same problem

Not yet, unfortunately.

QueirosJustin commented 6 days ago

@barsm42 I found it; it works, but the parameters need to be set differently. Try this:

python3 imitate_episodes.py \
    --task_name aloha_mobile_wipe_wine \
    --ckpt_dir /scr/tonyzhao/train_logs/wipe_wine_diffusion_seed0 \
    --policy_class Diffusion --chunk_size 32 \
    --batch_size 32 --lr 1e-4 --seed 0 \
    --num_steps 100000 --eval_every 100000 --validate_every 10000 --save_every 10000

The problem is that if the policy is evaluated every 500 or 2000 steps (or some other small interval), the diffusion policy hasn't learned enough yet and the simulation crashes with an invalid physics state, so the evaluation interval needs to be larger. I got the best result with the command above; if you use --eval_every 6000 with --onscreen_render, you can see the improvements directly after 6,000 steps instead of 100,000.

barsm42 commented 6 days ago

@QueirosJustin Thank you for your reply.

In my code, I don't have the --eval_every 100000 --validate_every 10000 --save_every 10000 arguments.

imitate_episodes.py raises an error for those arguments.

QueirosJustin commented 6 days ago

Can you post a screenshot? Those arguments are definitely part of the imitate_episodes.py script; look at this section:

def main(args):
    set_seed(1)
    # command line parameters
    is_eval = args['eval']
    ckpt_dir = args['ckpt_dir']
    policy_class = args['policy_class']
    onscreen_render = args['onscreen_render']
    task_name = args['task_name']
    batch_size_train = args['batch_size']
    batch_size_val = args['batch_size']
    num_steps = args['num_steps']
    eval_every = args['eval_every']
    validate_every = args['validate_every']
    save_every = args['save_every']
    resume_ckpt_path = args['resume_ckpt_path']
barsm42 commented 6 days ago

I don't have these four arguments. I searched with Ctrl+F in the script; there are no variables named eval_every or the rest:

eval_every = args['eval_every']
validate_every = args['validate_every']
save_every = args['save_every']
resume_ckpt_path = args['resume_ckpt_path']
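If your checkout of imitate_episodes.py predates these options (which would explain the missing variables), the script's argparse section can be extended along these lines. The argument names mirror the ones used in this thread; the defaults here are illustrative assumptions, not values taken from the repository.

```python
# Hypothetical sketch: adding the step-based options discussed in this thread
# to an older copy of imitate_episodes.py. Defaults are illustrative only.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--num_steps', type=int, default=100000)
parser.add_argument('--eval_every', type=int, default=100000)
parser.add_argument('--validate_every', type=int, default=10000)
parser.add_argument('--save_every', type=int, default=10000)
parser.add_argument('--resume_ckpt_path', type=str, default=None)

# Simulated command line: only --eval_every is overridden.
args = vars(parser.parse_args(['--eval_every', '6000']))
print(args['eval_every'])      # 6000 (overridden on the command line)
print(args['validate_every'])  # 10000 (default)
```

Alternatively, pulling the latest commit of the repository should bring in these arguments, since the main() shown above reads all four from args.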