google-deepmind / bsuite

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent
Apache License 2.0

Documentation: Clarify mapping from high-level agent properties to experiments and environments #31

Open aaronsnoswell opened 4 years ago

aaronsnoswell commented 4 years ago

Hello Ian and others!

I'm having a look at bsuite after Ian Osband's talk at the Simons Institute Deep RL workshop. After spending a few minutes browsing the documentation and source code here on GitHub I had a suggestion for improving the documentation.

My first question when browsing this project was: "The radar plot on the readme is lovely; I wonder which experiments contribute to a good ____ score", where the blank is e.g. 'generalization'.

After browsing the source code for a few minutes this isn't immediately obvious. I can see a little bit of information about this in the example colab notebook. It would be nice to promote this mapping to a 'first class' part of the documentation somewhere :)

iosband commented 3 years ago

Hi Aaron!

Apologies for the delay in getting back here... we can try to add some more documentation about where this comes from. In each experiment's sweep.py there is a TAGS tuple that defines which radar spokes that experiment contributes to:

e.g. https://source.corp.google.com/piper///depot/google3/third_party/py/bsuite/experiments/cartpole_noise/sweep.py

has TAGS = ('noise', 'generalization').
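If you just want to check one experiment quickly, a minimal sketch (assuming the pip-installed package follows the same experiments/&lt;name&gt;/sweep.py layout as above) is:

```python
# Minimal sketch: inspect the TAGS tuple for a single experiment.
# Assumes the experiments/<name>/sweep.py layout described above.
from bsuite.experiments.cartpole_noise import sweep

print(sweep.TAGS)  # e.g. ('noise', 'generalization')
```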

If we add too much documentation separate from the code, then we run a significant risk of things getting out of sync. Also, the "colab notebook" really is meant to be a first-class part of bsuite.

Probably the solution is to explain where these TAGS are located and what they mean? Would that work for you?

aaronsnoswell commented 3 years ago

Hi Ian :) Loved your talk by the way.

Yes, the TAGS variables are exactly what I was looking for when browsing the documentation.

"Probably the solution is to explain where these TAGS are located and what they mean?"

Exactly :) This could be as simple as a heading on the readme explaining that this is how the mapping from environments to algorithm attributes is defined, with a link to one of the source files?
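For what it's worth, here is a rough sketch of how the full mapping could even be generated straight from the code, so the docs never drift out of sync. It assumes every experiment package under bsuite.experiments exposes a sweep module with a TAGS tuple; anything without one is simply skipped:

```python
# Hedged sketch: collect the tag -> experiments mapping by scanning each
# experiment package for its sweep.TAGS tuple. Assumes the package layout
# bsuite/experiments/<experiment_name>/sweep.py; experiments without a
# sweep module or a TAGS attribute are skipped.
import collections
import importlib
import pkgutil

import bsuite.experiments


def tags_to_experiments():
    mapping = collections.defaultdict(list)
    for info in pkgutil.iter_modules(bsuite.experiments.__path__):
        if not info.ispkg:
            continue
        try:
            sweep = importlib.import_module(f'bsuite.experiments.{info.name}.sweep')
        except ImportError:
            continue
        for tag in getattr(sweep, 'TAGS', ()):
            mapping[tag].append(info.name)
    return dict(mapping)


if __name__ == '__main__':
    for tag, experiment_names in sorted(tags_to_experiments().items()):
        print(f'{tag}: {", ".join(experiment_names)}')
```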