Open zxfever opened 3 weeks ago
Hello, when I changed the line to the one below,
ema = EMAModel(parameters=nets, power=self.ema_power)
it started to train. However, I get a different error during inference. I will ask about that separately.
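For anyone wondering what the power argument controls: below is a minimal sketch, assuming a warmup schedule of the form decay = 1 - (1 + step / inv_gamma) ** -power, which is a common form in EMA helpers. Whether your installed EMAModel uses exactly this schedule depends on the library version and its options, so treat the function names and default values here as hypothetical illustrations, not the library's API.

```python
def ema_decay(step, power=0.75, inv_gamma=1.0, min_value=0.0, max_value=0.9999):
    """Assumed warmup schedule: 1 - (1 + step / inv_gamma) ** -power, clamped."""
    if step <= 0:
        return 0.0
    value = 1.0 - (1.0 + step / inv_gamma) ** -power
    return max(min_value, min(value, max_value))


def ema_update(avg, new, decay):
    """One EMA step: keep `decay` of the running average, blend in the rest."""
    return decay * avg + (1.0 - decay) * new


if __name__ == "__main__":
    # A larger power makes the decay approach max_value in fewer steps.
    for step in (1, 10, 100, 1000, 10000):
        print(step, round(ema_decay(step, power=0.75), 4))
```

With this form, the averaged weights follow the raw weights closely early in training and change more slowly later on.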
@barsm42 Have you found a solution to this? I am looking for one myself, as I ran into the same problem.
Not yet, unfortunately.
@barsm42 I found it. It works, but the parameters need to be set differently. Try this:
python3 imitate_episodes.py \
    --task_name aloha_mobile_wipe_wine \
    --ckpt_dir /scr/tonyzhao/train_logs/wipe_wine_diffusion_seed0 \
    --policy_class Diffusion --chunk_size 32 \
    --batch_size 32 --lr 1e-4 --seed 0 \
    --num_steps 100000 --eval_every 100000 --validate_every 10000 --save_every 10000
The problem is that if the policy is evaluated every 500, 2000, or some other small number of steps, the diffusion policy is not working well yet and the simulation crashes with an invalid physics state. The evaluation interval therefore needs to be larger. I got the best results with the command above. If you use --eval_every 6000 together with --onscreen_render, you can see the improvement directly after 6000 steps instead of waiting for 100000 steps.
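To make the effect of the interval concrete, here is a minimal sketch that lists the steps at which an evaluation rollout would run. It assumes the training loop uses a check of the form `step > 0 and step % eval_every == 0`; the actual condition in imitate_episodes.py may differ, and the helper name `eval_steps` is made up for illustration.

```python
def eval_steps(num_steps, eval_every):
    """Steps that trigger evaluation under the assumed modulo check."""
    return [
        step
        for step in range(num_steps + 1)
        if step > 0 and step % eval_every == 0
    ]


if __name__ == "__main__":
    for interval in (500, 6000, 100000):
        steps = eval_steps(100000, interval)
        print(f"eval_every={interval}: {len(steps)} evaluations, first at step {steps[0]}")
```

A small interval means the first rollout happens while the policy is barely trained, which is when the invalid physics state showed up for me.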
@QueirosJustin Thank you for your reply.
In my code, I don't have the --eval_every 100000 --validate_every 10000 --save_every 10000 arguments.
imitate_episodes.py gave an error for those arguments.
Can you post a screenshot? Those arguments are definitely part of the imitate_episodes.py script. Look at this section:
def main(args):
    set_seed(1)
    is_eval = args['eval']
    ckpt_dir = args['ckpt_dir']
    policy_class = args['policy_class']
    onscreen_render = args['onscreen_render']
    task_name = args['task_name']
    batch_size_train = args['batch_size']
    batch_size_val = args['batch_size']
    num_steps = args['num_steps']
    eval_every = args['eval_every']
    validate_every = args['validate_every']
    save_every = args['save_every']
    resume_ckpt_path = args['resume_ckpt_path']
I don't have these four arguments. I searched the script with Ctrl+F, and there are no variables named eval_every or any of the others:
eval_every = args['eval_every']
validate_every = args['validate_every']
save_every = args['save_every']
resume_ckpt_path = args['resume_ckpt_path']
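A quick way to confirm which flags a given copy of the script accepts is to run the command line through argparse's parse_known_args and look at what is left over. The sketch below is only an illustration: the parser is a hypothetical stand-in for an older copy of the script, and the flag set it defines is an assumption, not the real argument list.

```python
import argparse


def build_old_parser():
    """Hypothetical parser for a script version without the newer flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--task_name")
    parser.add_argument("--ckpt_dir")
    parser.add_argument("--policy_class")
    parser.add_argument("--chunk_size", type=int)
    parser.add_argument("--batch_size", type=int)
    parser.add_argument("--lr", type=float)
    parser.add_argument("--seed", type=int)
    parser.add_argument("--num_steps", type=int)
    return parser


def unknown_flags(parser, argv):
    """Return the option names in argv that the parser does not define."""
    _, leftover = parser.parse_known_args(argv)
    return [token for token in leftover if token.startswith("--")]


if __name__ == "__main__":
    argv = [
        "--task_name", "aloha_mobile_wipe_wine",
        "--policy_class", "Diffusion",
        "--num_steps", "100000",
        "--eval_every", "100000",
        "--validate_every", "10000",
        "--save_every", "10000",
    ]
    print(unknown_flags(build_old_parser(), argv))
```

If the flags come back as unknown, the local copy of the script most likely predates the version that introduced them.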
I trained the model with the
--policy_class Diffusion
parameter, but an error occurred. Can you provide some help?