HumanCompatibleAI / imitation

Clean PyTorch implementations of imitation and reward learning algorithms
https://imitation.readthedocs.io/
MIT License

Robust termination condition for equal-horizon episodes #856

Open abastola0 opened 4 months ago

abastola0 commented 4 months ago

Bug description

Currently, for some reason, the termination condition that equalizes the horizon length across rollouts does not work properly, so the collected episodes have different lengths and a variable horizon error is raised.

Steps to reproduce

Run any environment with num_envs set to 16 and generate 2000 or more rollouts. Use environments that tend to terminate early, well before truncation.
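To make the failure mode concrete, here is a minimal, library-independent sketch of how early termination produces episodes of different lengths when 16 environments are stepped in parallel. It does not use imitation's API: the toy environment, the function names (`simulate_episode_lengths`, `distinct_horizons`), and the parameters `max_steps` and `p_terminate` are all assumptions made up for illustration.

```python
import random


def simulate_episode_lengths(num_envs=16, min_episodes=2000,
                             max_steps=50, p_terminate=0.05, seed=0):
    """Step `num_envs` toy environments in lockstep and record episode lengths.

    Hypothetical stand-in for a vectorized rollout: each step, an environment
    terminates early with probability `p_terminate`, and is truncated once it
    reaches `max_steps`.
    """
    rng = random.Random(seed)
    steps = [0] * num_envs  # current step count of each parallel environment
    lengths = []            # length of every finished episode
    while len(lengths) < min_episodes:
        for i in range(num_envs):
            steps[i] += 1
            terminated = rng.random() < p_terminate  # early termination
            truncated = steps[i] >= max_steps        # time-limit truncation
            if terminated or truncated:
                lengths.append(steps[i])
                steps[i] = 0  # the environment auto-resets
    return lengths


def distinct_horizons(lengths):
    """Return the sorted set of episode lengths that were observed."""
    return sorted(set(lengths))


lengths = simulate_episode_lengths()
horizons = distinct_horizons(lengths)
# More than one distinct length means the horizon is not constant,
# which is the condition a variable horizon check would reject.
print(f"{len(lengths)} episodes, {len(horizons)} distinct horizon lengths")
```

With `p_terminate=0.0` every episode in this toy setup runs to `max_steps`, so only one horizon length is observed. Any nonzero early-termination probability yields a mix of lengths unless something pads or extends the terminated episodes.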

Environment

I tried a couple of environments and observed the same issue with each.