rll / rllab

rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.

problem of vpg #45

Open lchenat opened 7 years ago

lchenat commented 7 years ago

I only ran into this problem recently. When I ran a script that I had run several times before, it suddenly dropped into the debugger:

2016-10-09 16:05:12.691857 HKT | [col_vpg_1] itr #286 | fitting baseline...
2016-10-09 16:05:25.779144 HKT | [col_vpg_1] itr #286 | fitted
2016-10-09 16:05:25.789398 HKT | [col_vpg_1] itr #286 | optimizing policy
2016-10-09 16:05:26.128741 HKT | [col_vpg_1] itr #286 | saving snapshot...
2016-10-09 16:05:26.134122 HKT | [col_vpg_1] itr #286 | saved
2016-10-09 16:05:26.135219 HKT | -----------------------  ----------------
2016-10-09 16:05:26.135341 HKT | Iteration                286
2016-10-09 16:05:26.135428 HKT | AverageDiscountedReturn  -0.169555864807
2016-10-09 16:05:26.135505 HKT | AverageReturn            -4.70588235294
2016-10-09 16:05:26.135584 HKT | ExplainedVariance        [-1.09076318]
2016-10-09 16:05:26.135662 HKT | NumTrajs                 17
2016-10-09 16:05:26.135740 HKT | Entropy                  1.39721952261
2016-10-09 16:05:26.135813 HKT | Perplexity               4.04394023599
2016-10-09 16:05:26.135893 HKT | StdReturn                23.3584968943
2016-10-09 16:05:26.135972 HKT | MaxReturn                4.0
2016-10-09 16:05:26.136051 HKT | MinReturn                -98.0
2016-10-09 16:05:26.136132 HKT | AveragePolicyStd         0.978515148814
2016-10-09 16:05:26.136230 HKT | LossBefore               0.00863600655567
2016-10-09 16:05:26.136317 HKT | LossAfter                0.00624103735871
2016-10-09 16:05:26.136399 HKT | MeanKL                   0.00021358694111
2016-10-09 16:05:26.136477 HKT | MaxKL                    0.0413081072869
2016-10-09 16:05:26.136556 HKT | -----------------------  ----------------
0%                          100%
[##############################] | ETA: 00:00:00
Total time elapsed: 00:00:00

/home/data/lchenat/rllab-master/rllab/misc/special.py(53)explained_variance_1d()
     51     if abs(1 - np.var(y - ypred) / (vary + 1e-8)) > 1e5:
     52         import ipdb; ipdb.set_trace()
---> 53     return 1 - np.var(y - ypred) / (vary + 1e-8)
     54
     55

ipdb>

I then found that these lines (the `abs(...) > 1e5` check and the `ipdb.set_trace()` call) have been deleted from the GitHub version. Does this mean something is wrong with the computation?

dementrock commented 7 years ago

Hi,

I think these lines were accidentally added when I was debugging something, and now they are removed (since some of our experiments run on the cloud, we certainly do not want them to pause...)
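
For readers following the traceback, here is a minimal sketch of the quantity that the removed check was guarding. It assumes `vary` is the variance of the targets `y`, since the thread only shows the `return` line. The helper `exceeds_debug_threshold` is hypothetical and simply restates the `> 1e5` condition from the traceback. The example shows one way the magnitude can get that large (nearly constant targets with spread-out predictions); it is not a diagnosis of what happened in the run above.

```python
import numpy as np


def explained_variance_1d(ypred, y):
    # Assumption: vary is the variance of the targets y; the thread only
    # shows the return expression, not how vary is computed.
    vary = np.var(y)
    return 1 - np.var(y - ypred) / (vary + 1e-8)


def exceeds_debug_threshold(ypred, y, threshold=1e5):
    # Hypothetical helper: the same condition that triggered ipdb.set_trace()
    # in the traceback, returned as a boolean instead of pausing the run.
    return abs(explained_variance_1d(ypred, y)) > threshold


# Ordinary case: targets have spread and the prediction matches them.
y = np.array([0.0, 1.0, 2.0, 3.0])
perfect = explained_variance_1d(y.copy(), y)

# Degenerate case: constant targets, so vary is 0 and only the 1e-8 term
# remains in the denominator; any spread in the residual is hugely amplified.
y_const = np.zeros(4)
ypred_spread = np.array([0.0, 1.0, 2.0, 3.0])
degenerate = explained_variance_1d(ypred_spread, y_const)

print(perfect, degenerate)
print(exceeds_debug_threshold(ypred_spread, y_const))
```

Under that assumption, a large negative value reflects a very small target variance in the denominator, and the check only paused execution without changing the returned value.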