rlberry-py / rlberry

An easy-to-use reinforcement learning library for research and education.
https://rlberry-py.github.io/rlberry
MIT License

Default value for eval_horizon #337

Open TimotheeMathieu opened 1 year ago

TimotheeMathieu commented 1 year ago

Should the default for eval_horizon be 500?

omardrwch commented 1 year ago

I'd keep it as large as possible (as now) and put a time limit in the environment if necessary, to avoid hiding these choices from the user.
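
A minimal sketch of what this could look like, assuming a plain gym environment (not rlberry-specific code; CartPole-v1 and the 500-step cap are just illustrative):

```python
import gym
from gym.wrappers import TimeLimit

# Make the truncation explicit on the environment itself, so a very large
# eval_horizon does not silently hide the time-limit choice from the user.
env = TimeLimit(gym.make("CartPole-v1").unwrapped, max_episode_steps=500)
```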

KohlerHECTOR commented 1 year ago

Yes, it should be 500, but it could be changed; it does not really matter.

omardrwch commented 1 year ago

> Yes, it should be 500, but it could be changed; it does not really matter.

Why should this be 500? @KohlerHECTOR

TimotheeMathieu commented 1 year ago

This should be 500 because 500 is the default for the gym control environments and is used in most benchmarks on them. This may be a deep RL thing; I think there is no standard default in tabular RL, so it is best to just go with the default that exists in deep RL.

omardrwch commented 1 year ago

If the gym environment already has a time limit (at 500), any eval_horizon > 500 will do the job, so I'd keep it as large as possible by default. Some Atari environments have pretty large horizons (~30k steps).
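
For reference, the time limit registered for a gym environment can be inspected directly, so eval_horizon only needs to be at least that large (a sketch, assuming the environment was created with gym.make):

```python
import gym

env = gym.make("CartPole-v1")
# CartPole-v1 is registered with a 500-step time limit; any eval_horizon >= 500
# therefore evaluates full episodes.
print(env.spec.max_episode_steps)  # 500
```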

KohlerHECTOR commented 1 year ago

@omardrwch Sorry for the authoritarian closing. I think 500 is indeed some kind of industry standard, let us say. But in any case, this can be changed by the user when they code their experiments. Plus, evaluation is pretty costly, so on the contrary I would keep it as low as possible :) I guess in a dream world, we would have some config files with suggested values for n_steps, n_evals, and eval_horizon for different envs :)
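
A purely hypothetical sketch of such a config; the names n_steps, n_evals, and eval_horizon come from the comment above, and every value is illustrative rather than an rlberry default:

```python
# Illustrative per-environment suggestions only; none of these numbers are
# endorsed by rlberry, they just show the shape a config file could take.
suggested_eval_config = {
    "CartPole-v1": {"n_steps": 100_000, "n_evals": 10, "eval_horizon": 500},
    "Acrobot-v1": {"n_steps": 100_000, "n_evals": 10, "eval_horizon": 500},
    "ALE/Breakout-v5": {"n_steps": 1_000_000, "n_evals": 5, "eval_horizon": 30_000},
}
```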

omardrwch commented 1 year ago

No worries! OK for 500, but then let's emit a warning if we've reached 500 and the episode has not terminated.
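
A sketch of what that warning could look like inside an evaluation loop (hypothetical code, not rlberry's actual evaluation method; it assumes the old gym step/reset API and an agent exposing policy(observation)):

```python
import logging

logger = logging.getLogger(__name__)

def evaluate_episode(agent, env, eval_horizon=500):
    """Run one evaluation episode, warning if the horizon truncates it."""
    observation = env.reset()
    episode_return = 0.0
    done = False
    for _ in range(eval_horizon):
        action = agent.policy(observation)
        observation, reward, done, _ = env.step(action)
        episode_return += reward
        if done:
            break
    if not done:
        logger.warning(
            "eval_horizon (%d) reached before the episode terminated; "
            "the evaluation may underestimate the true episodic return.",
            eval_horizon,
        )
    return episode_return
```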