## Changes

### Fixes

### Feature Additions

Works in py2 and py3.

Added parallelism. The earlier implementation called `process.join()` inside the for loop, so the experiments effectively ran in serial (I think the purpose was not parallelism but to work around a tensorflow issue). Parallelism is off by default and can be turned on by setting the `--num_parallel` CLI flag to a number greater than 1. The code parallelizes across experiments and spawns at most `min(num_parallel, num_experiments)` processes.
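A minimal sketch of how the dispatch can work, assuming the structure suggested by the traceback in the Testing section (`train_PG_star`, `train_kwargs_list`, and `map_func` appear there; `train_PG` is the existing training function, and the wrapper below is illustrative rather than the PR's exact code):

```python
import multiprocessing

def train_PG_star(kwargs):
    # Pool.map passes a single argument, so unpack a dict of keyword
    # arguments for the existing train_PG function.
    return train_PG(**kwargs)

def run_experiments(train_kwargs_list, num_parallel):
    num_experiments = len(train_kwargs_list)
    if num_parallel > 1:
        # Parallel path: never spawn more workers than there are experiments.
        pool = multiprocessing.Pool(processes=min(num_parallel, num_experiments))
        map_func = pool.map
    else:
        # Default path: run the experiments one after another.
        map_func = map
    # Each train_PG call returns a dict with the history of its scalar values.
    return list(map_func(train_PG_star, train_kwargs_list))
```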
Added default logging to tensorboard using pycrayon, which is agnostic to the deep learning framework being used (communication with the crayon server happens over a REST API); see the sketch after this list.

- Setting up tensorboard takes two docker commands.
- The user can turn off tensorboard logging with the `--no_tb` CLI flag.
- This does require the user to install pycrayon (works on both py2 and py3).
- The script logs each experiment in tensorboard under the name `{exp_name}-{i};{datetime}_{hostname}`, where `i` is the experiment number, or `avg` for the average of all the runs.
- The user can delete existing experiments with the `--clear_tb_expt` CLI flag.
- Tabs:
  - All the experiments
  - Average of each setting
  - Actions as histogram (this does not plot an averaged histogram, but each experiment run separately)
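For reference, a minimal sketch of the pycrayon calls this style of logging relies on; the experiment name, scalar/histogram names, and values are illustrative, and the port matches the `docker run` mapping used in the Testing section:

```python
from pycrayon import CrayonClient

# Connect to the crayon server's REST API (mapped to port 9119 in the Testing section).
client = CrayonClient(hostname="localhost", port=9119)

# One tensorboard experiment per run, named {exp_name}-{i};{datetime}_{hostname};
# an extra "avg" experiment holds the values averaged across runs.
exp = client.create_experiment("py2-invpen-lr0.001-1;18-04-2018_10-44-35_myhost")

# Scalars (and optionally histograms) are pushed each iteration over the REST API.
exp.add_scalar_value("AverageReturn", 123.4, step=0)
exp.add_histogram_value("Actions", [0.1, -0.2, 0.3], tobuild=True, step=0)

# --clear_tb_expt style cleanup: remove every experiment already on the server.
for name in client.get_experiment_names():
    client.remove_experiment(name)
```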
## Code changes needed
- Added a module, `tensorboard_pyrcayon.py`, with the tensorboard/pycrayon utilities.
- Updated `logz.py` with the functions needed for tensorboard plotting.
- The `train_PG` function now returns a dict with a history of the scalar values, which we average across all the experiment runs.
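For illustration, the averaging step could look like the following (this assumes each history dict maps a scalar name to one value per iteration; the helper name and exact structure are assumptions, not the PR's code):

```python
import numpy as np

def average_histories(history_dicts):
    """Average each logged scalar across experiment runs, iteration by iteration."""
    averaged = {}
    for key in history_dicts[0]:
        # Shape (num_runs, num_iterations) -> mean over the runs axis.
        runs = np.stack([np.asarray(h[key], dtype=float) for h in history_dicts])
        averaged[key] = runs.mean(axis=0)
    return averaged
```

The averaged series is what gets logged under the `avg` experiment in tensorboard.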
## Testing
Manual only. I used a working solution (not sure if it's a good idea to add it here, so refraining from it) and ran it under the following scenarios.
Run without a tensorboard server running:
```
$ python train_pg.py InvertedPendulum-v1 -n 10 -b 500 -e 2 --exp_name py2-invpen-lr0.001 -rtg --n_layers 3 --size 64 --discount 0.99 -lr 0.001 --num_parallel 2
Running experiment with seed 1
Running experiment with seed 11
Logging data to data/py2-invpen-lr0.01_InvertedPendulum-v1_18-04-2018_10-44-35/1/log.txt
Logging data to data/py2-invpen-lr0.01_InvertedPendulum-v1_18-04-2018_10-44-35/11/log.txt
Traceback (most recent call last):
  File "train_pg.py", line 674, in <module>
    main()
  File "train_pg.py", line 555, in main
    history_dicts = list(map_func(train_PG_star, train_kwargs_list))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
ValueError: The server at localhost:9119 does not appear to be up!
```
Run without a tensorboard server running, with `--no_tb`:

Verified in py2 and py3.
Run with the crayon server up:

```
$ docker run -p 9118:8888 -p 9119:8889 --name crayon alband/crayon
```

I see: [tensorboard screenshot]

Ran with `--clear_tb_expt` and verified that the previous runs were erased from tensorboard.