Open schuderer opened 5 years ago
Not really an agent thing, but I did a few runs of the environment using stable-baselines' A2C (commit e994ff80c51d47283cd23c44c59c274019b96cd0). The preliminary tests show no effective learning after 10,000,000 timesteps.
Also look into getting Stefano's RLACOSarsaLambda-Learner to run on this environment.
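
To make "no effective learning" a bit more concrete for the A2C runs, here is a minimal sketch of one possible check: compare the mean episode reward at the start of training with the mean at the end. The helper name `learning_gain`, the `window` parameter, and the sample reward lists are hypothetical and not part of this repo or of stable-baselines. It assumes the per-episode rewards have already been collected into a plain list by whatever logging is in use.

```python
from statistics import mean


def learning_gain(episode_rewards, window=100):
    """Mean reward of the last `window` episodes minus that of the first `window`.

    Hypothetical helper: a value near zero suggests the agent has not improved
    over the course of training.
    """
    if len(episode_rewards) < 2 * window:
        raise ValueError("need at least 2 * window episodes")
    early = mean(episode_rewards[:window])
    late = mean(episode_rewards[-window:])
    return late - early


if __name__ == "__main__":
    # Made-up reward sequences for illustration only, not results from the runs.
    flat = [1.0, 0.0] * 200                    # no trend over 400 episodes
    improving = [i / 100 for i in range(400)]  # steadily rising reward
    print(learning_gain(flat))
    print(learning_gain(improving))
```

A gain close to zero over the full run would back up the "no effective learning" observation. A single window comparison is crude, so it would be worth repeating across several seeds before drawing conclusions.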