Jasonxu1225 / Uncertainty-aware-Inverse-Constrained-Reinforcement-Learning

[ICLR 2024] ''Uncertainty-aware Constraint Inference in Inverse Constrained Reinforcement Learning'' Official Code
MIT License

Expert data dimension error when updating cflow net #1

Closed Usaywook closed 2 months ago

Usaywook commented 2 months ago

When I run the command `python train_icrl.py ../config/Mujoco/Blocked_HalfCheetah/train_UAICRL_HC-noise-1e-1.yaml -n 5 -s 123`,

I get the following error:

```
Uncertainty-aware-Inverse-Constrained-Reinforcement-Learning/constraint_models/constraint_net/constraint_cflow_net.py", line 496, in train_nn
    expert_data_games = expert_data.unsqueeze(0).view(expert_batch, length, self.input_dims)
RuntimeError: shape '[5, 1000, 24]' is invalid for input of size 13200
```

I think `self.input_dims` is the concatenated state and action dimension, so 24 is correct, and `expert_batch` is the number of threads, so 5 is correct. Therefore, I guess `length` is not fixed at 1000.
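For reference, `view` only succeeds when the element count factors exactly as `expert_batch * length * input_dims`; with the numbers in the traceback, 13200 / (5 * 24) = 110, so the trajectories in the loaded data are actually 110 steps long rather than 1000. A minimal sketch of that check (plain Python; `infer_length` is a hypothetical helper, not from the repo):

```python
def infer_length(numel, expert_batch, input_dims):
    """Infer the per-trajectory length implied by a flat expert buffer.

    Raises if the element count does not factor evenly, which is exactly
    the condition that makes view(expert_batch, length, input_dims) fail.
    """
    length, remainder = divmod(numel, expert_batch * input_dims)
    if remainder != 0:
        raise ValueError("element count does not factor into (batch, length, dims)")
    return length

# Sizes from the traceback: the reshape to [5, 1000, 24] would need
# 5 * 1000 * 24 = 120000 elements, but only 13200 are present.
print(infer_length(13200, expert_batch=5, input_dims=24))  # -> 110
```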

Do you know how I could solve it?

Jasonxu1225 commented 2 months ago

Hi, thanks for your interest.

For the HalfCheetah env, the fixed maximum episode length is indeed 1000 (see line 30 in `mujuco_environment/custom_envs/__init__.py`), and the trajectories in the provided expert dataset for HalfCheetah are also of fixed length 1000. So I am not sure why the length is different in your case.

Fortunately, I have also provided code for the case where the trajectory length is not fixed (as in the Walker and Pendulum envs). The main idea is to change the data storage format so that we can recover each episode's length. To do this, set `store_by_game` to `True` in the config yaml, then use the function `train_nn_earlystop` at line 792 of `constraint_models/constraint_net/constraint_cflow_net.py`; you may also need to modify the conditions at lines 518-520 in `interface/train_icrl.py`.
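The "store by game" idea can be illustrated as keeping one list entry per episode instead of one flat buffer, so each episode's length survives. A hypothetical sketch of the layout, not the repo's actual code:

```python
# Hypothetical "store by game" layout: two toy episodes of different
# lengths, each row a concatenated (state, action) vector of width 24.
flat_steps = [[0.1] * 24 for _ in range(110)] + [[0.2] * 24 for _ in range(90)]
episode_lengths = [110, 90]

by_game = []
start = 0
for length in episode_lengths:
    by_game.append(flat_steps[start:start + length])  # one episode's rows
    start += length

# Each episode keeps its own length, unlike a flat buffer reshaped into a
# fixed [batch, length, dims] tensor.
print([len(ep) for ep in by_game])  # -> [110, 90]
```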

Usaywook commented 2 months ago

Hi, thank you for the quick reply. I was able to solve the issue thanks to your advice.

When I copied the expert data from the Guiliang/ICRL-benchmarks-public repository into the `data/expert_data` directory, the problem appears to have been caused by a file being overwritten.

The expert data in the Guiliang/ICRL-benchmarks-public repository is as follows:

```
Loading expert data from ../data/expert_data/BlockedHalf-cheetah/.
Expert_mean_reward: 2270.9212421442044 and Expert_mean_length: 107.84313725490196.
```

The expert data in your repository is as follows:

```
Loading expert data from ../data/expert_data/BlockedHalf-cheetah/.
Expert_mean_reward: 2606.060715970816 and Expert_mean_length: 1000.0
```

Using the data from this repository helped me resolve the issue.

Is the dataset uploaded to this remote repository different from the expert data in the link above?
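As a quick guard against this kind of dataset mismatch, one could sanity-check the trajectory lengths before training. A hypothetical snippet, not part of the repo (`check_expert_lengths` and its inputs are made up for illustration):

```python
def check_expert_lengths(trajectories, expected_length):
    """Warn if expert trajectories deviate from the env's fixed horizon.

    `trajectories` is assumed to be a list of per-episode step sequences.
    """
    lengths = [len(traj) for traj in trajectories]
    mean_length = sum(lengths) / len(lengths)
    if any(l != expected_length for l in lengths):
        print(f"warning: mean length {mean_length:.1f}, "
              f"expected {expected_length}")
    return mean_length

# Two toy trajectories of length 110 instead of the expected 1000
# trigger the warning above.
mean = check_expert_lengths([[0] * 110, [0] * 110], expected_length=1000)
print(mean)  # -> 110.0
```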

Jasonxu1225 commented 2 months ago

Yes, you are right. The HalfCheetah dataset in this repository differs from the one in the benchmark. Because the HalfCheetah environment's maximum episode length is 1000, I re-generated the expert data with a length of 1000, while the benchmark dataset uses a length of 500. Alternatively, you can generate your own expert dataset by following the instructions.

Hope this answers your question.