irom-princeton / PAC-Imitation

Code for Generalization Guarantees for (Multi-Modal) Imitation Learning
https://arxiv.org/abs/2008.01913
MIT License

Cannot launch the repository, even for the push task #1

Open leader1313 opened 2 years ago

leader1313 commented 2 years ago

When I try to launch the push experiment with python trainPush_bc.py push_pac_easy, I get the following error:

PAC-Imitation/trainPush_bc.py", line 523, in <module>
    config_dic, data_dic, nn_dic, loss_dic, optim_dic = [value for key, value in json_data.items()]
ValueError: not enough values to unpack (expected 5, got 4)

This issue may be caused by missing loss config information in the following JSON file: PAC-Imitation/push_pac_easy.json. Could you update it to match your manuscript?
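
For reference, the unpacking at line 523 fails because the JSON has only four top-level sections where five are expected. A minimal defensive sketch of the load (the key names come from the traceback; the repo's actual config layout may differ):

    import json

    with open("push_pac_easy.json") as f:
        json_data = json.load(f)

    # trainPush_bc.py expects exactly five top-level sections; checking up
    # front gives a clearer error than the bare unpacking at line 523.
    if len(json_data) != 5:
        raise ValueError(
            f"expected 5 config sections, got {len(json_data)}: {list(json_data)}"
        )

    # Note: this relies on the JSON file's key order matching the unpacking
    # order (json.load preserves insertion order).
    config_dic, data_dic, nn_dic, loss_dic, optim_dic = json_data.values()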

allenzren commented 2 years ago

Hi @leader1313, thanks for your interest in the work.

Actually, that JSON file is not used for behavior cloning training but for the fine-tuning in the second stage. I just added a new config file called push_bc_easy.json; could you give that a try? You would still need to generate the boxes, though.
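
(Presumably the new config is launched with the same invocation pattern as before, i.e. python trainPush_bc.py push_bc_easy, assuming trainPush_bc.py resolves the config name the same way it did for push_pac_easy.)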

Originally we did not provide a config for BC training, but you can find the pre-trained weights in the pretrain folder.

leader1313 commented 2 years ago

Greetings @allenzren; thank you for your kind response.

Although you added the JSON file, I still cannot run the script. What exactly does "generate the boxes" mean?

By the way, I am trying to implement a multi-modal imitation learning algorithm similar to your multi-modal BC phase. In your paper, you implement a CVAE using an LSTM for the pushing task; did you also try a version without the LSTM? Is the LSTM critical for time-series tasks (e.g., pushing, navigation)?

allenzren commented 2 years ago

@leader1313 You need to generate the boxes used in the pushing task. To do so, you can call python generateBox.py --obj_folder folder_path. Afterward, you will need to change the obj_folder entry in the JSON file to folder_path.
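
If you would rather script that config change, here is a minimal sketch (placing obj_folder at the top level of the config is an assumption; move the assignment to wherever the entry actually lives in the file):

    import json

    cfg_path = "push_bc_easy.json"
    with open(cfg_path) as f:
        cfg = json.load(f)

    # Point the config at the folder generateBox.py wrote to.
    cfg["obj_folder"] = "folder_path"

    with open(cfg_path, "w") as f:
        json.dump(cfg, f, indent=4)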

In my work, I embed the whole trajectory sequence into a single latent variable, so using an LSTM is a natural choice. It wouldn't make sense without the LSTM, since I would then need to concatenate all images of the trajectory and pass them through the convolutional layers at once.
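
A minimal PyTorch sketch of that idea (layer sizes and names are illustrative, not the repo's actual architecture): a small CNN encodes each frame, an LSTM summarizes the per-frame features, and the final hidden state parameterizes a single latent variable.

    import torch
    import torch.nn as nn

    class TrajEncoder(nn.Module):
        def __init__(self, latent_dim=16, feat_dim=64):
            super().__init__()
            # Per-frame CNN feature extractor.
            self.cnn = nn.Sequential(
                nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, feat_dim),
            )
            # LSTM summarizes the sequence of frame features.
            self.lstm = nn.LSTM(feat_dim, feat_dim, batch_first=True)
            self.mu = nn.Linear(feat_dim, latent_dim)
            self.logvar = nn.Linear(feat_dim, latent_dim)

        def forward(self, frames):  # frames: (B, T, 3, H, W)
            B, T = frames.shape[:2]
            feats = self.cnn(frames.flatten(0, 1)).view(B, T, -1)
            _, (h, _) = self.lstm(feats)  # h: (num_layers, B, feat_dim)
            h = h[-1]  # final hidden state summarizes the whole trajectory
            return self.mu(h), self.logvar(h)  # parameters of one latent z

    # e.g. z_mu, z_logvar = TrajEncoder()(torch.randn(2, 10, 3, 64, 64))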

Alternatively, you could skip frames, stack a few frames from the trajectory, and use convolutional layers without an LSTM. I think that could still work.
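
A sketch of what that could look like (the skip factor and stack size are arbitrary here): pick every k-th frame and merge them along the channel axis so a plain CNN can consume the result.

    import torch

    frames = torch.randn(2, 12, 3, 64, 64)  # (B, T, C, H, W), dummy trajectory
    skip, n_stack = 4, 3

    # Subsample every `skip`-th frame, keep the first `n_stack` of them,
    # then merge time into channels: (B, n_stack * C, H, W).
    picked = frames[:, ::skip][:, :n_stack]
    stacked = picked.flatten(1, 2)  # feed this to a CNN with 9 input channels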