Closed hejia-zhang closed 6 years ago
(garage) garage (unit_testing *+) $ nose2 tests.test_algos
/anaconda2/envs/garage/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: compiletime version 3.6 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.5
return f(*args, **kwds)
objc[67403]: Class GLFWLayoutListener is implemented in both /usr/local/Cellar/glfw/3.2.1/lib/libglfw.3.2.dylib (0x134d02ae0) and /Users/jonathon/.mujoco/mjpro150/bin/libglfw.3.dylib (0x1384646d8). One of the two will be used. Which one is undefined.
objc[67403]: Class GLFWWindowDelegate is implemented in both /usr/local/Cellar/glfw/3.2.1/lib/libglfw.3.2.dylib (0x134d02b08) and /Users/jonathon/.mujoco/mjpro150/bin/libglfw.3.dylib (0x138464700). One of the two will be used. Which one is undefined.
objc[67403]: Class GLFWApplicationDelegate is implemented in both /usr/local/Cellar/glfw/3.2.1/lib/libglfw.3.2.dylib (0x134d02b80) and /Users/jonathon/.mujoco/mjpro150/bin/libglfw.3.dylib (0x138464778). One of the two will be used. Which one is undefined.
objc[67403]: Class GLFWContentView is implemented in both /usr/local/Cellar/glfw/3.2.1/lib/libglfw.3.2.dylib (0x134d02ba8) and /Users/jonathon/.mujoco/mjpro150/bin/libglfw.3.dylib (0x1384647a0). One of the two will be used. Which one is undefined.
objc[67403]: Class GLFWWindow is implemented in both /usr/local/Cellar/glfw/3.2.1/lib/libglfw.3.2.dylib (0x134d02c20) and /Users/jonathon/.mujoco/mjpro150/bin/libglfw.3.dylib (0x138464818). One of the two will be used. Which one is undefined.
objc[67403]: Class GLFWApplication is implemented in both /usr/local/Cellar/glfw/3.2.1/lib/libglfw.3.2.dylib (0x134d02c48) and /Users/jonathon/.mujoco/mjpro150/bin/libglfw.3.dylib (0x138464840). One of the two will be used. Which one is undefined.
Testing VPG, GridWorldEnv, CategoricalMLPPolicy
2018-07-12 17:18:05.434337 PDT | Populating workers...
2018-07-12 17:18:05.434553 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:06.652650 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:06.652898 PDT | itr #0 | fitted
2018-07-12 17:18:06.653472 PDT | itr #0 | optimizing policy
2018-07-12 17:18:06.822807 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:06.822984 PDT | itr #0 | saved
2018-07-12 17:18:06.823589 PDT | ----------------------- ------------
2018-07-12 17:18:06.823712 PDT | AverageDiscountedReturn 0
2018-07-12 17:18:06.823816 PDT | AverageReturn 0
2018-07-12 17:18:06.823915 PDT | Entropy 1.32786
2018-07-12 17:18:06.824011 PDT | ExplainedVariance 1
2018-07-12 17:18:06.824123 PDT | Iteration 0
2018-07-12 17:18:06.824234 PDT | LossAfter -0
2018-07-12 17:18:06.824333 PDT | LossBefore -0
2018-07-12 17:18:06.824430 PDT | MaxKL 4.44089e-16
2018-07-12 17:18:06.824528 PDT | MaxReturn 0
2018-07-12 17:18:06.824654 PDT | MeanKL 4.54736e-17
2018-07-12 17:18:06.824773 PDT | MinReturn 0
2018-07-12 17:18:06.824892 PDT | NumTrajs 94
2018-07-12 17:18:06.824987 PDT | Perplexity 3.77294
2018-07-12 17:18:06.825084 PDT | StdReturn 0
2018-07-12 17:18:06.825177 PDT | ----------------------- ------------
.Testing PPO, CartpoleEnv, GaussianMLPPolicy
2018-07-12 17:18:06.879008 PDT | Populating workers...
2018-07-12 17:18:06.879166 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:07.704745 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:07.704992 PDT | itr #0 | fitted
=: Compiling function f_loss
done in 0.044 seconds
=: Compiling function f_constraint
done in 0.031 seconds
=: Compiling function f_opt
done in 0.084 seconds
=: Compiling function f_penalized_loss
done in 0.041 seconds
2018-07-12 17:18:07.968917 PDT | itr #0 | trying penalty=1.000...
2018-07-12 17:18:07.990827 PDT | itr #0 | penalty 1.000000 => loss -0.003368, mean_kl 0.001503
2018-07-12 17:18:07.994845 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:07.994994 PDT | itr #0 | saved
2018-07-12 17:18:07.995571 PDT | ----------------------- -------------
2018-07-12 17:18:07.995688 PDT | AverageDiscountedReturn 85.0051
2018-07-12 17:18:07.995790 PDT | AveragePolicyStd 1
2018-07-12 17:18:07.995888 PDT | AverageReturn 90.2474
2018-07-12 17:18:07.995984 PDT | Entropy 1.41894
2018-07-12 17:18:07.996080 PDT | ExplainedVariance 3.49276e-12
2018-07-12 17:18:07.996187 PDT | Iteration 0
2018-07-12 17:18:07.996288 PDT | LossAfter -0.00336813
2018-07-12 17:18:07.996383 PDT | LossBefore 4.2505e-17
2018-07-12 17:18:07.996478 PDT | MaxReturn 289.953
2018-07-12 17:18:07.996574 PDT | MeanKL 0.00150263
2018-07-12 17:18:07.996670 PDT | MeanKLBefore 0
2018-07-12 17:18:07.996766 PDT | MinReturn 19.9659
2018-07-12 17:18:07.996897 PDT | NumTrajs 100
2018-07-12 17:18:07.996995 PDT | Perplexity 4.13273
2018-07-12 17:18:07.997090 PDT | StdReturn 61.6312
2018-07-12 17:18:07.997186 PDT | dLoss 0.00336813
2018-07-12 17:18:07.997296 PDT | ----------------------- -------------
.Testing PPO, GridWorldEnv, CategoricalGRUPolicy
ETesting PPO, CartpoleEnv, GaussianGRUPolicy
ETesting TRPO, GridWorldEnv, CategoricalMLPPolicy
2018-07-12 17:18:08.020207 PDT | Populating workers...
2018-07-12 17:18:08.020368 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:08.368218 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:08.368386 PDT | itr #0 | fitted
=: Compiling function f_loss
done in 0.013 seconds
=: Compiling function constraint
done in 0.012 seconds
2018-07-12 17:18:08.400174 PDT | itr #0 | computing loss before
2018-07-12 17:18:08.404809 PDT | itr #0 | performing update
2018-07-12 17:18:08.404972 PDT | itr #0 | computing descent direction
=: Compiling function f_grad
done in 0.037 seconds
=: Compiling function f_Hx_plain
done in 0.229 seconds
2018-07-12 17:18:08.811891 PDT | itr #0 | descent direction computed
=: Compiling function f_loss_constraint
done in 0.017 seconds
2018-07-12 17:18:08.839745 PDT | itr #0 | backtrack iters: 1
2018-07-12 17:18:08.840029 PDT | itr #0 | computing loss after
2018-07-12 17:18:08.840173 PDT | itr #0 | optimization finished
2018-07-12 17:18:08.845117 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:08.845254 PDT | itr #0 | saved
2018-07-12 17:18:08.845855 PDT | ----------------------- -------------
2018-07-12 17:18:08.845973 PDT | AverageDiscountedReturn 0.0256207
2018-07-12 17:18:08.846078 PDT | AverageReturn 0.027972
2018-07-12 17:18:08.846174 PDT | Entropy 1.37232
2018-07-12 17:18:08.846274 PDT | ExplainedVariance 2.94968e-07
2018-07-12 17:18:08.846375 PDT | Iteration 0
2018-07-12 17:18:08.846469 PDT | LossAfter -0.0197037
2018-07-12 17:18:08.846568 PDT | LossBefore -7.02117e-17
2018-07-12 17:18:08.846664 PDT | MaxReturn 1
2018-07-12 17:18:08.846759 PDT | MeanKL 0.00651005
2018-07-12 17:18:08.846856 PDT | MeanKLBefore 5.48339e-17
2018-07-12 17:18:08.846953 PDT | MinReturn 0
2018-07-12 17:18:08.847048 PDT | NumTrajs 143
2018-07-12 17:18:08.847142 PDT | Perplexity 3.94451
2018-07-12 17:18:08.847240 PDT | StdReturn 0.164893
2018-07-12 17:18:08.847349 PDT | dLoss 0.0197037
2018-07-12 17:18:08.847454 PDT | ----------------------- -------------
.Testing TRPO, CartpoleEnv, GaussianMLPPolicy
2018-07-12 17:18:08.888915 PDT | Populating workers...
2018-07-12 17:18:08.889079 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:09.775756 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:09.775904 PDT | itr #0 | fitted
=: Compiling function f_loss
done in 0.032 seconds
=: Compiling function constraint
done in 0.029 seconds
2018-07-12 17:18:09.841594 PDT | itr #0 | computing loss before
2018-07-12 17:18:09.843658 PDT | itr #0 | performing update
2018-07-12 17:18:09.843790 PDT | itr #0 | computing descent direction
=: Compiling function f_grad
done in 0.063 seconds
=: Compiling function f_Hx_plain
done in 0.145 seconds
2018-07-12 17:18:10.203289 PDT | itr #0 | descent direction computed
=: Compiling function f_loss_constraint
done in 0.040 seconds
2018-07-12 17:18:10.249139 PDT | itr #0 | backtrack iters: 1
2018-07-12 17:18:10.249296 PDT | itr #0 | computing loss after
2018-07-12 17:18:10.249412 PDT | itr #0 | optimization finished
2018-07-12 17:18:10.253759 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:10.253902 PDT | itr #0 | saved
2018-07-12 17:18:10.254471 PDT | ----------------------- -------------
2018-07-12 17:18:10.254592 PDT | AverageDiscountedReturn 84.2759
2018-07-12 17:18:10.254704 PDT | AveragePolicyStd 1
2018-07-12 17:18:10.254825 PDT | AverageReturn 89.351
2018-07-12 17:18:10.254926 PDT | Entropy 1.41894
2018-07-12 17:18:10.255023 PDT | ExplainedVariance 3.62788e-12
2018-07-12 17:18:10.255120 PDT | Iteration 0
2018-07-12 17:18:10.255223 PDT | LossAfter -0.00875854
2018-07-12 17:18:10.255334 PDT | LossBefore -5.6617e-17
2018-07-12 17:18:10.255435 PDT | MaxReturn 279.955
2018-07-12 17:18:10.255537 PDT | MeanKL 0.00698265
2018-07-12 17:18:10.255643 PDT | MeanKLBefore 0
2018-07-12 17:18:10.255762 PDT | MinReturn 9.98262
2018-07-12 17:18:10.255869 PDT | NumTrajs 101
2018-07-12 17:18:10.255975 PDT | Perplexity 4.13273
2018-07-12 17:18:10.256121 PDT | StdReturn 59.8792
2018-07-12 17:18:10.256219 PDT | dLoss 0.00875854
2018-07-12 17:18:10.256320 PDT | ----------------------- -------------
.Testing TRPO, GridWorldEnv, CategoricalGRUPolicy
ETesting TRPO, CartpoleEnv, GaussianGRUPolicy
ETesting CEM, GridWorldEnv, CategoricalMLPPolicy
2018-07-12 17:18:10.279630 PDT | Populating workers...
2018-07-12 17:18:10.279786 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
((5, 1732), (5,))
2018-07-12 17:18:10.475964 PDT | ----------------------- ---
2018-07-12 17:18:10.476101 PDT | AverageDiscountedReturn 0
2018-07-12 17:18:10.476232 PDT | AverageReturn 0
2018-07-12 17:18:10.476354 PDT | AvgTrajLen 100
2018-07-12 17:18:10.476499 PDT | CurStdMean 0
2018-07-12 17:18:10.476618 PDT | Iteration 0
2018-07-12 17:18:10.476711 PDT | MaxReturn 0
2018-07-12 17:18:10.476816 PDT | MinReturn 0
2018-07-12 17:18:10.476906 PDT | NumTrajs 5
2018-07-12 17:18:10.476995 PDT | StdReturn 0
2018-07-12 17:18:10.477091 PDT | ----------------------- ---
.Testing CEM, CartpoleEnv, GaussianMLPPolicy
2018-07-12 17:18:10.515660 PDT | Populating workers...
2018-07-12 17:18:10.515825 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
((5, 1250), (5,))
2018-07-12 17:18:10.556580 PDT | ----------------------- ---------
2018-07-12 17:18:10.556755 PDT | AverageDiscountedReturn 63.122
2018-07-12 17:18:10.556876 PDT | AveragePolicyStd 4.24531
2018-07-12 17:18:10.556987 PDT | AverageReturn 65.9403
2018-07-12 17:18:10.557101 PDT | AvgTrajLen 7.6
2018-07-12 17:18:10.557208 PDT | CurStdMean 0
2018-07-12 17:18:10.557300 PDT | Iteration 0
2018-07-12 17:18:10.557392 PDT | MaxReturn 149.864
2018-07-12 17:18:10.557503 PDT | MinReturn 9.98724
2018-07-12 17:18:10.557607 PDT | NumTrajs 5
2018-07-12 17:18:10.557698 PDT | StdReturn 46.2636
2018-07-12 17:18:10.557788 PDT | ----------------------- ---------
.Testing CEM, GridWorldEnv, CategoricalGRUPolicy
ETesting VPG, CartpoleEnv, GaussianMLPPolicy
2018-07-12 17:18:10.598692 PDT | Populating workers...
2018-07-12 17:18:10.598850 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:11.638600 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:11.638815 PDT | itr #0 | fitted
2018-07-12 17:18:11.639597 PDT | itr #0 | optimizing policy
0% [#] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:11.787179 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:11.787342 PDT | itr #0 | saved
2018-07-12 17:18:11.787898 PDT | ----------------------- -------------
2018-07-12 17:18:11.788017 PDT | AverageDiscountedReturn 78.0664
2018-07-12 17:18:11.788121 PDT | AveragePolicyStd 1
2018-07-12 17:18:11.788229 PDT | AverageReturn 82.0571
2018-07-12 17:18:11.788331 PDT | Entropy 1.41894
2018-07-12 17:18:11.788428 PDT | ExplainedVariance 5.30664e-12
2018-07-12 17:18:11.788527 PDT | Iteration 0
2018-07-12 17:18:11.788623 PDT | LossAfter 0.00784446
2018-07-12 17:18:11.788724 PDT | LossBefore 0.00862184
2018-07-12 17:18:11.788818 PDT | MaxKL 0.000575431
2018-07-12 17:18:11.788917 PDT | MaxReturn 269.931
2018-07-12 17:18:11.789014 PDT | MeanKL 0.000120727
2018-07-12 17:18:11.789111 PDT | MinReturn 9.98211
2018-07-12 17:18:11.789207 PDT | NumTrajs 109
2018-07-12 17:18:11.789305 PDT | Perplexity 4.13273
2018-07-12 17:18:11.789404 PDT | StdReturn 48.9121
2018-07-12 17:18:11.789504 PDT | ----------------------- -------------
.Testing CEM, CartpoleEnv, GaussianGRUPolicy
ETesting CMAES, GridWorldEnv, CategoricalMLPPolicy
(13_w,26)-aCMA-ES (mu_w=7.6,w_1=23%) in dimension 1732 (seed=552301, Thu Jul 12 17:18:11 2018)
2018-07-12 17:18:11.845200 PDT | Populating workers...
2018-07-12 17:18:11.845370 PDT | Populated
2018-07-12 17:18:12.314701 PDT | ----------------------- ----
2018-07-12 17:18:12.314862 PDT | AverageDiscountedReturn 0
2018-07-12 17:18:12.314970 PDT | AverageReturn 0
2018-07-12 17:18:12.315068 PDT | AvgTrajLen 51.8
2018-07-12 17:18:12.315168 PDT | CurStdMean 1
2018-07-12 17:18:12.315263 PDT | Iteration 0
2018-07-12 17:18:12.315358 PDT | MaxReturn 0
2018-07-12 17:18:12.315453 PDT | MinReturn 0
2018-07-12 17:18:12.315548 PDT | StdReturn 0
2018-07-12 17:18:12.315643 PDT | ----------------------- ----
.Testing CMAES, CartpoleEnv, GaussianMLPPolicy
(12_w,25)-aCMA-ES (mu_w=7.3,w_1=23%) in dimension 1250 (seed=502951, Thu Jul 12 17:18:12 2018)
2018-07-12 17:18:12.402289 PDT | Populating workers...
2018-07-12 17:18:12.402456 PDT | Populated
2018-07-12 17:18:13.520446 PDT | ----------------------- ---------
2018-07-12 17:18:13.520619 PDT | AverageDiscountedReturn -63.5092
2018-07-12 17:18:13.520728 PDT | AveragePolicyStd 1.54438
2018-07-12 17:18:13.520839 PDT | AverageReturn 67.712
2018-07-12 17:18:13.520941 PDT | AvgTrajLen 7.77519
2018-07-12 17:18:13.521043 PDT | CurStdMean 1
2018-07-12 17:18:13.521144 PDT | Iteration 0
2018-07-12 17:18:13.521243 PDT | MaxReturn 529.721
2018-07-12 17:18:13.521343 PDT | MinReturn 0
2018-07-12 17:18:13.521441 PDT | StdReturn 67.712
2018-07-12 17:18:13.521558 PDT | ----------------------- ---------
.Testing CMAES, GridWorldEnv, CategoricalGRUPolicy
ETesting CMAES, CartpoleEnv, GaussianGRUPolicy
ETesting ERWR, GridWorldEnv, CategoricalMLPPolicy
2018-07-12 17:18:13.545808 PDT | Populating workers...
2018-07-12 17:18:13.545964 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:13.866733 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:13.866938 PDT | itr #0 | fitted
2018-07-12 17:18:13.867437 PDT | itr #0 | optimizing policy
2018-07-12 17:18:14.191413 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:14.191578 PDT | itr #0 | saved
2018-07-12 17:18:14.192125 PDT | ----------------------- -------------
2018-07-12 17:18:14.192243 PDT | AverageDiscountedReturn 0.00617475
2018-07-12 17:18:14.192346 PDT | AverageReturn 0.00689655
2018-07-12 17:18:14.192442 PDT | Entropy 1.37282
2018-07-12 17:18:14.192538 PDT | ExplainedVariance 9.51842e-07
2018-07-12 17:18:14.192634 PDT | Iteration 0
2018-07-12 17:18:14.192731 PDT | LossAfter 0.0388689
2018-07-12 17:18:14.192826 PDT | LossBefore 0.162959
2018-07-12 17:18:14.192920 PDT | MaxKL 11.9146
2018-07-12 17:18:14.193014 PDT | MaxReturn 1
2018-07-12 17:18:14.193131 PDT | MeanKL 8.76122
2018-07-12 17:18:14.193272 PDT | MinReturn 0
2018-07-12 17:18:14.193411 PDT | NumTrajs 145
2018-07-12 17:18:14.193510 PDT | Perplexity 3.94648
2018-07-12 17:18:14.193608 PDT | StdReturn 0.0827586
2018-07-12 17:18:14.193715 PDT | ----------------------- -------------
.Testing ERWR, CartpoleEnv, GaussianMLPPolicy
2018-07-12 17:18:14.233571 PDT | Populating workers...
2018-07-12 17:18:14.233735 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:15.128063 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:15.128219 PDT | itr #0 | fitted
2018-07-12 17:18:15.128774 PDT | itr #0 | optimizing policy
2018-07-12 17:18:15.364243 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:15.364410 PDT | itr #0 | saved
2018-07-12 17:18:15.364963 PDT | ----------------------- -------------
2018-07-12 17:18:15.365081 PDT | AverageDiscountedReturn 77.605
2018-07-12 17:18:15.365184 PDT | AveragePolicyStd 1
2018-07-12 17:18:15.365282 PDT | AverageReturn 81.6906
2018-07-12 17:18:15.365380 PDT | Entropy 1.41894
2018-07-12 17:18:15.365475 PDT | ExplainedVariance 4.66771e-12
2018-07-12 17:18:15.365572 PDT | Iteration 0
2018-07-12 17:18:15.365683 PDT | LossAfter 1.65393
2018-07-12 17:18:15.365790 PDT | LossBefore 1.67059
2018-07-12 17:18:15.365897 PDT | MaxKL 0.217659
2018-07-12 17:18:15.365999 PDT | MaxReturn 289.942
2018-07-12 17:18:15.366113 PDT | MeanKL 0.0341559
2018-07-12 17:18:15.366210 PDT | MinReturn 9.98382
2018-07-12 17:18:15.366314 PDT | NumTrajs 109
2018-07-12 17:18:15.366471 PDT | Perplexity 4.13273
2018-07-12 17:18:15.366637 PDT | StdReturn 51.7319
2018-07-12 17:18:15.366794 PDT | ----------------------- -------------
.Testing ERWR, GridWorldEnv, CategoricalGRUPolicy
ETesting ERWR, CartpoleEnv, GaussianGRUPolicy
ETesting REPS, GridWorldEnv, CategoricalMLPPolicy
2018-07-12 17:18:15.391867 PDT | Populating workers...
2018-07-12 17:18:15.392030 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:16.104746 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:16.104912 PDT | itr #0 | fitted
2018-07-12 17:18:16.108707 PDT | itr #0 | optimizing dual
2018-07-12 17:18:16.119150 PDT | itr #0 | optimizing policy
2018-07-12 17:18:16.183009 PDT | itr #0 | eta 15.000000 -> 14.500671
2018-07-12 17:18:16.183383 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:16.183573 PDT | itr #0 | saved
2018-07-12 17:18:16.184221 PDT | ----------------------- ------------
2018-07-12 17:18:16.184347 PDT | AverageDiscountedReturn 0.0156816
2018-07-12 17:18:16.184489 PDT | AverageReturn 0.0190476
2018-07-12 17:18:16.184653 PDT | DualAfter 7.10131
2018-07-12 17:18:16.184795 PDT | DualBefore 7.38151
2018-07-12 17:18:16.184935 PDT | Entropy 1.34888
2018-07-12 17:18:16.185089 PDT | ExplainedVariance 3.1684e-07
2018-07-12 17:18:16.185209 PDT | Iteration 0
2018-07-12 17:18:16.185379 PDT | LossAfter 1.24697
2018-07-12 17:18:16.185494 PDT | LossBefore 1.25007
2018-07-12 17:18:16.185608 PDT | MaxReturn 1
2018-07-12 17:18:16.185721 PDT | MeanKL 0.00308145
2018-07-12 17:18:16.185847 PDT | MinReturn 0
2018-07-12 17:18:16.185964 PDT | NumTrajs 105
2018-07-12 17:18:16.186134 PDT | Perplexity 3.85313
2018-07-12 17:18:16.186309 PDT | StdReturn 0.136692
2018-07-12 17:18:16.186408 PDT | ----------------------- ------------
.Testing VPG, GridWorldEnv, CategoricalGRUPolicy
ETesting REPS, CartpoleEnv, GaussianMLPPolicy
2018-07-12 17:18:16.229129 PDT | Populating workers...
2018-07-12 17:18:16.229288 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:17.363177 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:17.363324 PDT | itr #0 | fitted
2018-07-12 17:18:17.366622 PDT | itr #0 | optimizing dual
2018-07-12 17:18:17.375464 PDT | itr #0 | optimizing policy
2018-07-12 17:18:17.407505 PDT | itr #0 | eta 15.000000 -> 14.514979
2018-07-12 17:18:17.407745 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:17.407886 PDT | itr #0 | saved
2018-07-12 17:18:17.408466 PDT | ----------------------- -------------
2018-07-12 17:18:17.408586 PDT | AverageDiscountedReturn 93.1336
2018-07-12 17:18:17.408689 PDT | AveragePolicyStd 1
2018-07-12 17:18:17.408791 PDT | AverageReturn 98.7464
2018-07-12 17:18:17.408890 PDT | DualAfter 16.5544
2018-07-12 17:18:17.409011 PDT | DualBefore 16.7919
2018-07-12 17:18:17.409125 PDT | Entropy 1.41894
2018-07-12 17:18:17.409224 PDT | ExplainedVariance 4.58389e-12
2018-07-12 17:18:17.409350 PDT | Iteration 0
2018-07-12 17:18:17.409460 PDT | LossAfter 1.30829
2018-07-12 17:18:17.409572 PDT | LossBefore 1.3085
2018-07-12 17:18:17.409668 PDT | MaxReturn 229.943
2018-07-12 17:18:17.409772 PDT | MeanKL 0.000213669
2018-07-12 17:18:17.409885 PDT | MinReturn 0
2018-07-12 17:18:17.409988 PDT | NumTrajs 92
2018-07-12 17:18:17.410084 PDT | Perplexity 4.13273
2018-07-12 17:18:17.410182 PDT | StdReturn 54.4456
2018-07-12 17:18:17.410279 PDT | ----------------------- -------------
.Testing REPS, GridWorldEnv, CategoricalGRUPolicy
ETesting REPS, CartpoleEnv, GaussianGRUPolicy
ETesting VPG, CartpoleEnv, GaussianGRUPolicy
ETesting TNPG, GridWorldEnv, CategoricalMLPPolicy
2018-07-12 17:18:17.440602 PDT | Populating workers...
2018-07-12 17:18:17.440759 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:17.793035 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:17.793186 PDT | itr #0 | fitted
=: Compiling function f_loss
done in 0.014 seconds
=: Compiling function constraint
done in 0.012 seconds
2018-07-12 17:18:17.823982 PDT | itr #0 | computing loss before
2018-07-12 17:18:17.827797 PDT | itr #0 | performing update
2018-07-12 17:18:17.827967 PDT | itr #0 | computing descent direction
=: Compiling function f_grad
done in 0.126 seconds
=: Compiling function f_Hx_plain
done in 0.083 seconds
2018-07-12 17:18:18.180413 PDT | itr #0 | descent direction computed
=: Compiling function f_loss_constraint
done in 0.017 seconds
2018-07-12 17:18:18.202089 PDT | itr #0 | Line search condition violated. Rejecting the step!
2018-07-12 17:18:18.202260 PDT | itr #0 | Violated because loss is NaN
2018-07-12 17:18:18.202409 PDT | itr #0 | Violated because constraint mean_kl is NaN
2018-07-12 17:18:18.202789 PDT | itr #0 | backtrack iters: 0
2018-07-12 17:18:18.202913 PDT | itr #0 | computing loss after
2018-07-12 17:18:18.203062 PDT | itr #0 | optimization finished
2018-07-12 17:18:18.208529 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:18.208681 PDT | itr #0 | saved
2018-07-12 17:18:18.209243 PDT | ----------------------- -------------
2018-07-12 17:18:18.209363 PDT | AverageDiscountedReturn 0
2018-07-12 17:18:18.209467 PDT | AverageReturn 0
2018-07-12 17:18:18.209568 PDT | Entropy 1.37305
2018-07-12 17:18:18.209664 PDT | ExplainedVariance 1
2018-07-12 17:18:18.209763 PDT | Iteration 0
2018-07-12 17:18:18.209874 PDT | LossAfter -0
2018-07-12 17:18:18.210038 PDT | LossBefore -0
2018-07-12 17:18:18.210174 PDT | MaxReturn 0
2018-07-12 17:18:18.210287 PDT | MeanKL -4.30039e-17
2018-07-12 17:18:18.210428 PDT | MeanKLBefore -4.30039e-17
2018-07-12 17:18:18.210574 PDT | MinReturn 0
2018-07-12 17:18:18.210691 PDT | NumTrajs 146
2018-07-12 17:18:18.210802 PDT | Perplexity 3.94736
2018-07-12 17:18:18.210909 PDT | StdReturn 0
2018-07-12 17:18:18.211005 PDT | dLoss 0
2018-07-12 17:18:18.211102 PDT | ----------------------- -------------
.Testing TNPG, CartpoleEnv, GaussianMLPPolicy
2018-07-12 17:18:18.251125 PDT | Populating workers...
2018-07-12 17:18:18.251316 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:19.157859 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:19.158024 PDT | itr #0 | fitted
=: Compiling function f_loss
done in 0.032 seconds
=: Compiling function constraint
done in 0.029 seconds
2018-07-12 17:18:19.224009 PDT | itr #0 | computing loss before
2018-07-12 17:18:19.226149 PDT | itr #0 | performing update
2018-07-12 17:18:19.226287 PDT | itr #0 | computing descent direction
=: Compiling function f_grad
done in 0.065 seconds
=: Compiling function f_Hx_plain
done in 0.131 seconds
2018-07-12 17:18:19.573078 PDT | itr #0 | descent direction computed
=: Compiling function f_loss_constraint
done in 0.128 seconds
2018-07-12 17:18:19.703974 PDT | itr #0 | Line search condition violated. Rejecting the step!
2018-07-12 17:18:19.704142 PDT | itr #0 | Violated because constraint mean_kl is violated
2018-07-12 17:18:19.704492 PDT | itr #0 | backtrack iters: 0
2018-07-12 17:18:19.704606 PDT | itr #0 | computing loss after
2018-07-12 17:18:19.704712 PDT | itr #0 | optimization finished
2018-07-12 17:18:19.708249 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:19.708384 PDT | itr #0 | saved
2018-07-12 17:18:19.708972 PDT | ----------------------- -------------
2018-07-12 17:18:19.709091 PDT | AverageDiscountedReturn 75.1683
2018-07-12 17:18:19.709195 PDT | AveragePolicyStd 1
2018-07-12 17:18:19.709299 PDT | AverageReturn 78.798
2018-07-12 17:18:19.709405 PDT | Entropy 1.41894
2018-07-12 17:18:19.709503 PDT | ExplainedVariance 5.97189e-12
2018-07-12 17:18:19.709636 PDT | Iteration 0
2018-07-12 17:18:19.709747 PDT | LossAfter -3.53856e-18
2018-07-12 17:18:19.709878 PDT | LossBefore -3.53856e-18
2018-07-12 17:18:19.709978 PDT | MaxReturn 229.949
2018-07-12 17:18:19.710071 PDT | MeanKL 0
2018-07-12 17:18:19.710172 PDT | MeanKLBefore 0
2018-07-12 17:18:19.710290 PDT | MinReturn 9.98231
2018-07-12 17:18:19.710384 PDT | NumTrajs 113
2018-07-12 17:18:19.710495 PDT | Perplexity 4.13273
2018-07-12 17:18:19.710589 PDT | StdReturn 45.943
2018-07-12 17:18:19.710682 PDT | dLoss 0
2018-07-12 17:18:19.710780 PDT | ----------------------- -------------
.Testing TNPG, GridWorldEnv, CategoricalGRUPolicy
ETesting TNPG, CartpoleEnv, GaussianGRUPolicy
ETesting PPO, GridWorldEnv, CategoricalMLPPolicy
2018-07-12 17:18:19.734499 PDT | Populating workers...
2018-07-12 17:18:19.734703 PDT | Populated
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00
2018-07-12 17:18:20.039431 PDT | itr #0 | fitting baseline...
2018-07-12 17:18:20.039587 PDT | itr #0 | fitted
=: Compiling function f_loss
done in 0.013 seconds
=: Compiling function f_constraint
done in 0.012 seconds
=: Compiling function f_opt
done in 0.042 seconds
=: Compiling function f_penalized_loss
done in 0.019 seconds
2018-07-12 17:18:20.165001 PDT | itr #0 | trying penalty=1.000...
2018-07-12 17:18:20.185164 PDT | itr #0 | penalty 1.000000 => loss -0.000000, mean_kl -0.000000
2018-07-12 17:18:20.190687 PDT | itr #0 | saving snapshot...
2018-07-12 17:18:20.190842 PDT | itr #0 | saved
2018-07-12 17:18:20.191414 PDT | ----------------------- -------------
2018-07-12 17:18:20.191537 PDT | AverageDiscountedReturn 0
2018-07-12 17:18:20.191674 PDT | AverageReturn 0
2018-07-12 17:18:20.191811 PDT | Entropy 1.35846
2018-07-12 17:18:20.191923 PDT | ExplainedVariance 1
2018-07-12 17:18:20.192060 PDT | Iteration 0
2018-07-12 17:18:20.192201 PDT | LossAfter -0
2018-07-12 17:18:20.192300 PDT | LossBefore -0
2018-07-12 17:18:20.192400 PDT | MaxReturn 0
2018-07-12 17:18:20.192496 PDT | MeanKL -2.75544e-17
2018-07-12 17:18:20.192592 PDT | MeanKLBefore -2.75544e-17
2018-07-12 17:18:20.192689 PDT | MinReturn 0
2018-07-12 17:18:20.192797 PDT | NumTrajs 124
2018-07-12 17:18:20.192891 PDT | Perplexity 3.89021
2018-07-12 17:18:20.192986 PDT | StdReturn 0
2018-07-12 17:18:20.193083 PDT | dLoss 0
2018-07-12 17:18:20.193180 PDT | ----------------------- -------------
.
======================================================================
ERROR: test_polopt_algo:11
<class 'garage.algos.ppo.PPO'>, <class 'garage.envs.grid_world_env.GridWorldEnv (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/categorical_gru_policy.py", line 69, in __init__
name="prob_network")
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
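
A note on the TypeError above: the failing check in lasagne/layers/base.py compares each entry of `input_shape` with `0`, so the message suggests that at least one entry was itself a tuple rather than an int or `None`. The log does not show why the shape ended up nested. The sketch below only reproduces the comparison in isolation; the function name and both shapes are made up for illustration and are not taken from garage or Lasagne.

```python
# Hypothetical helper and shapes, for illustration only.
def has_nonpositive_dim(input_shape):
    """Same expression as the check at base.py line 44 in the traceback."""
    return any(d is not None and d <= 0 for d in input_shape)


flat_shape = (None, 32)       # every entry is an int or None
nested_shape = (None, (32,))  # one entry is itself a tuple

# With a flat shape the check runs and finds no non-positive dimension.
print(has_nonpositive_dim(flat_shape))

# With a nested shape, Python 3 refuses to order a tuple against an int.
try:
    has_nonpositive_dim(nested_shape)
except TypeError as exc:
    # The exact wording of the message depends on the Python version.
    print("TypeError:", exc)
```

If this reading is right, printing the shape of the layer passed to `DenseLayer` at network.py line 263 would be a reasonable first step to confirm it.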
======================================================================
ERROR: test_polopt_algo:12
<class 'garage.algos.ppo.PPO'>, <class 'garage.envs.box2d.cartpole_env.Cartpole (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/gaussian_gru_policy.py", line 51, in __init__
output_nonlinearity=output_nonlinearity,
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:15
<class 'garage.algos.trpo.TRPO'>, <class 'garage.envs.grid_world_env.GridWorldE (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/categorical_gru_policy.py", line 69, in __init__
name="prob_network")
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:16
<class 'garage.algos.trpo.TRPO'>, <class 'garage.envs.box2d.cartpole_env.Cartpo (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/gaussian_gru_policy.py", line 51, in __init__
output_nonlinearity=output_nonlinearity,
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:19
<class 'garage.algos.cem.CEM'>, <class 'garage.envs.grid_world_env.GridWorldEnv (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/categorical_gru_policy.py", line 69, in __init__
name="prob_network")
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:20
<class 'garage.algos.cem.CEM'>, <class 'garage.envs.box2d.cartpole_env.Cartpole (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/gaussian_gru_policy.py", line 51, in __init__
output_nonlinearity=output_nonlinearity,
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:23
<class 'garage.algos.cma_es.CMAES'>, <class 'garage.envs.grid_world_env.GridWor (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/categorical_gru_policy.py", line 69, in __init__
name="prob_network")
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:24
<class 'garage.algos.cma_es.CMAES'>, <class 'garage.envs.box2d.cartpole_env.Car (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/gaussian_gru_policy.py", line 51, in __init__
output_nonlinearity=output_nonlinearity,
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:27
<class 'garage.algos.erwr.ERWR'>, <class 'garage.envs.grid_world_env.GridWorldE (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/categorical_gru_policy.py", line 69, in __init__
name="prob_network")
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:28
<class 'garage.algos.erwr.ERWR'>, <class 'garage.envs.box2d.cartpole_env.Cartpo (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/gaussian_gru_policy.py", line 51, in __init__
output_nonlinearity=output_nonlinearity,
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:3
<class 'garage.algos.vpg.VPG'>, <class 'garage.envs.grid_world_env.GridWorldEnv (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/categorical_gru_policy.py", line 69, in __init__
name="prob_network")
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:31
<class 'garage.algos.reps.REPS'>, <class 'garage.envs.grid_world_env.GridWorldE (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/categorical_gru_policy.py", line 69, in __init__
name="prob_network")
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:32
<class 'garage.algos.reps.REPS'>, <class 'garage.envs.box2d.cartpole_env.Cartpo (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/gaussian_gru_policy.py", line 51, in __init__
output_nonlinearity=output_nonlinearity,
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:4
<class 'garage.algos.vpg.VPG'>, <class 'garage.envs.box2d.cartpole_env.Cartpole (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/gaussian_gru_policy.py", line 51, in __init__
output_nonlinearity=output_nonlinearity,
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:7
<class 'garage.algos.tnpg.TNPG'>, <class 'garage.envs.grid_world_env.GridWorldE (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/categorical_gru_policy.py", line 69, in __init__
name="prob_network")
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
======================================================================
ERROR: test_polopt_algo:8
<class 'garage.algos.tnpg.TNPG'>, <class 'garage.envs.box2d.cartpole_env.Cartpo (tests.test_algos.TestAlgos)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/jonathon/Documents/garage/garage/tests/test_algos.py", line 97, in test_polopt_algo
policy = policy_cls(env_spec=env.spec, )
File "/Users/jonathon/Documents/garage/garage/garage/policies/gaussian_gru_policy.py", line 51, in __init__
output_nonlinearity=output_nonlinearity,
File "/Users/jonathon/Documents/garage/garage/garage/core/network.py", line 263, in __init__
b=l_output_flat.b,
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/dense.py", line 78, in __init__
super(DenseLayer, self).__init__(incoming, **kwargs)
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in __init__
if any(d is not None and d <= 0 for d in self.input_shape):
File "/anaconda2/envs/garage/lib/python3.5/site-packages/lasagne/layers/base.py", line 44, in <genexpr>
if any(d is not None and d <= 0 for d in self.input_shape):
TypeError: unorderable types: tuple() <= int()
----------------------------------------------------------------------
Ran 32 tests in 14.779s
FAILED (errors=16)
same as #183
similar to #183
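For anyone reading the log above: all 16 errors end at the same check in `lasagne/layers/base.py`, line 44, which compares each entry of `self.input_shape` to `0`. The message `unorderable types: tuple() <= int()` suggests that one entry of the shape is itself a tuple rather than an int or `None`. Below is a minimal sketch that reproduces that kind of failure outside of garage and Lasagne. The helper `check_input_shape` and the shape `(None, (4,))` are hypothetical and used only for illustration; the log does not show the actual shape passed in from `network.py`.

```python
def check_input_shape(input_shape):
    """Apply the same condition as the line shown in the traceback.

    Hypothetical stand-alone helper; it only copies the expression
    `any(d is not None and d <= 0 for d in self.input_shape)`.
    """
    return any(d is not None and d <= 0 for d in input_shape)


# A flat shape: every entry is None or an int, so the check runs fine.
good_shape = (None, 4)
print(check_input_shape(good_shape))

# Assumed bad shape: one entry is a nested tuple instead of an int.
bad_shape = (None, (4,))
try:
    check_input_shape(bad_shape)
except TypeError as exc:
    # Python 3 refuses to order a tuple against an int. The exact
    # wording of the message depends on the Python version.
    print("TypeError:", exc)
```

If this is what is happening, printing the shape of the layer passed to `DenseLayer` at `network.py` line 263 should show where the nested tuple comes from.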