Correct info on 'log' file in 'data' folder
Closed: xombio closed this issue 4 years ago
[2020-01-30 11:38:57,183 PID:4355 INFO search.py run_ray_search] Running ray search for spec reinforce_baseline_cartpole
Hi @xombio, glad to see you're running code from the book. Let's see how we can fix this issue for you.
Your log indicates an error with pytorch/CUDA:
(pid=4389) Stack (most recent call first):
(pid=4389) terminate called after throwing an instance of 'c10::Error'
(pid=4389) what(): CUDA error: initialization error (getDevice at /opt/conda/conda-bld/pytorch_1556653114079/work/c10/cuda/impl/CUDAGuardImpl.h:35)
(pid=4389) frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7fcd68190dc5 in /home/joe/anaconda3/envs/lab/lib/python3.7/site-packages/torch/lib/libc10.so)
(pid=4389) frame #1: <unknown function> + 0xca67 (0x7fcd6038ca67 in /home/joe/anaconda3/envs/lab/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
(pid=4389) frame #2: torch::autograd::Engine::thread_init(int) + 0x3ee (0x7fcd60aadb1e in /home/joe/anaconda3/envs/lab/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
(pid=4389) frame #3: torch::autograd::python::PythonEngine::thread_init(int) + 0x2a (0x7fcd9741328a in /home/joe/anaconda3/envs/lab/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
(pid=4389) frame #4: <unknown function> + 0xc8421 (0x7fcdac471421 in /home/joe/anaconda3/envs/lab/bin/../lib/libstdc++.so.6)
(pid=4389) frame #5: <unknown function> + 0x76db (0x7fcdb1cfa6db in /lib/x86_64-linux-gnu/libpthread.so.0)
(pid=4389) frame #6: clone + 0x3f (0x7fcdb1a2388f in /lib/x86_64-linux-gnu/libc.so.6)
Seems like you were using the right command to run, for example: `python run_lab.py slm_lab/spec/benchmark/reinforce/reinforce_cartpole.json reinforce_baseline_cartpole search`.
However, the REINFORCE example shouldn't be using CUDA, so that's a first suspect. Can you please check:

- whether `pytorch` is installed (find it with `conda list pytorch` from the `lab` environment)
- whether `cudatoolkit` is installed and, if so, its version (find it with `conda list cuda` from the `lab` environment)

Cool, the system works. Answers below. I have a GTX 970.
(lab) joe@Gauss:~/SLM-Lab$ conda list pytorch
# packages in environment at /home/joe/anaconda3/envs/lab:
#
# Name Version Build Channel
pytorch 1.1.0 py3.7_cuda10.0.130_cudnn7.5.1_0 pytorch
(lab) joe@Gauss:~/SLM-Lab$ conda list cuda
# packages in environment at /home/joe/anaconda3/envs/lab:
#
# Name Version Build Channel
cudatoolkit 10.0.130 0
(lab) joe@Gauss:~/SLM-Lab$ nvidia-smi
Thu Jan 30 14:19:44 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 970 On | 00000000:01:00.0 On | N/A |
| 0% 38C P8 15W / 170W | 206MiB / 4039MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1321 G /usr/lib/xorg/Xorg 100MiB |
| 0 1456 G /usr/bin/gnome-shell 101MiB |
+-----------------------------------------------------------------------------+
Installation seems okay. The problem is two-fold:

1. This particular REINFORCE search should not be calling CUDA at all. Did you set `"gpu": True` in the spec? If you can get it to run on CPU only (no CUDA), that solves part of the problem.
2. GPU should be working as well, in all modes including `train` and `search`. Does it work in `train` with `"gpu": True`? I suspect this is a CUDA-specific problem, in particular:
what(): CUDA error: initialization error (getDevice at /opt/conda/conda-bld/pytorch_1556653114079/work/c10/cuda/impl/CUDAGuardImpl.h:35)
This seems to happen while CUDA is initializing and getting the device. In fact, this looks like the known issue affecting Python 3.7 with CUDA 10: https://github.com/pytorch/pytorch/issues/30900
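The failure mode in that issue is CUDA being initialized in a worker process that was forked from the parent. As a rough, stdlib-only illustration (no CUDA or PyTorch required) of why a freshly started interpreter sidesteps inherited state — which is what multiprocessing's `spawn` start method does, in contrast to the Linux default `fork`:

```python
import subprocess
import sys

# A brand-new interpreter (what the 'spawn' start method launches, instead
# of a fork-copy of the parent) inherits no state from the parent process --
# including any half-initialized CUDA context.
out = subprocess.run(
    [sys.executable, "-c", "print('fresh interpreter')"],
    capture_output=True, text=True, check=True,
)
print(out.stdout.strip())
```

PyTorch's multiprocessing notes recommend the `spawn` (or `forkserver`) start method when CUDA is involved, for exactly this reason.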
Until that issue is fixed in PyTorch, I would recommend downgrading to Python 3.6:

1. Edit the `environment.yml` file, changing the line `- python=3.7.3` to `- python=3.6`.
2. Create a new conda env `lab2` with Python 3.6:

```shell
conda create -n lab2 python=3.6 -y
conda env update -f environment.yml
```
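Once the new env is active, a quick way to confirm you really got Python 3.6 (stdlib only):

```python
import sys

# After `conda activate lab2` this should report Python 3.6;
# if it still says 3.7, the wrong env is active.
major, minor = sys.version_info[:2]
print(f"Python {major}.{minor}")
```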
Then, try running a task in the new conda env:

```shell
conda activate lab2
python run_lab.py slm_lab/spec/benchmark/reinforce/reinforce_cartpole.json reinforce_baseline_cartpole search
```
Let me know if this works.
Sorry for the delayed responses. Now I get `ModuleNotFoundError: No module named 'cv2'`.

I updated the `environment.yml` file:
name: lab
channels:
- plotly
- pytorch
- conda-forge
- defaults
dependencies:
- autopep8=1.4.4
- colorlog=4.0.2
- coverage=4.5.3
- flaky=3.5.3
- libgcc
- numpy=1.16.3
- openpyxl=2.6.1
- pandas=0.24.2
- pillow=6.2.0
- pip=19.1.1
- plotly-orca=1.2.1
- psutil=5.6.2
- pycodestyle=2.5.0
- pydash=4.2.1
- pytest-cov=2.7.1
- pytest-timeout=1.3.3
- pytest=4.5.0
- python=3.6
- pytorch=1.1.0
- pyyaml=5.1
- regex=2019.05.25
- scipy=1.3.0
- tensorboard=1.14.0
- ujson=1.35
- xlrd=1.2.0
- pip:
- box2d-py==2.3.8
- cloudpickle==0.5.2
- colorlover==0.3.0
- opencv-python==4.1.0.25
- plotly==3.9.0
- pyopengl==3.1.0
- ray==0.7.0
- redis==2.10.6
- xvfbwrapper==0.2.9
- gym==0.12.1
- gym[atari]
- gym[box2d]
- gym[classic_control]
- roboschool==1.0.46
- atari-py
Ran `conda create -n lab2 python=3.6 -y`:
(base) joe@Gauss:~/SLM-Lab$ conda create -n lab2 python=3.6 -y
Collecting package metadata (current_repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 4.8.1
latest version: 4.8.2
Please update conda by running
$ conda update -n base -c defaults conda
## Package Plan ##
environment location: /home/joe/anaconda3/envs/lab2
added / updated specs:
- python=3.6
The following NEW packages will be INSTALLED:
_libgcc_mutex pkgs/main/linux-64::_libgcc_mutex-0.1-main
ca-certificates pkgs/main/linux-64::ca-certificates-2020.1.1-0
certifi pkgs/main/linux-64::certifi-2019.11.28-py36_0
ld_impl_linux-64 pkgs/main/linux-64::ld_impl_linux-64-2.33.1-h53a641e_7
libedit pkgs/main/linux-64::libedit-3.1.20181209-hc058e9b_0
libffi pkgs/main/linux-64::libffi-3.2.1-hd88cf55_4
libgcc-ng pkgs/main/linux-64::libgcc-ng-9.1.0-hdf63c60_0
libstdcxx-ng pkgs/main/linux-64::libstdcxx-ng-9.1.0-hdf63c60_0
ncurses pkgs/main/linux-64::ncurses-6.1-he6710b0_1
openssl pkgs/main/linux-64::openssl-1.1.1d-h7b6447c_3
pip pkgs/main/linux-64::pip-20.0.2-py36_1
python pkgs/main/linux-64::python-3.6.10-h0371630_0
readline pkgs/main/linux-64::readline-7.0-h7b6447c_5
setuptools pkgs/main/linux-64::setuptools-45.1.0-py36_0
sqlite pkgs/main/linux-64::sqlite-3.30.1-h7b6447c_0
tk pkgs/main/linux-64::tk-8.6.8-hbc83047_0
wheel pkgs/main/linux-64::wheel-0.34.1-py36_0
xz pkgs/main/linux-64::xz-5.2.4-h14c3975_4
zlib pkgs/main/linux-64::zlib-1.2.11-h7b6447c_3
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate lab2
#
# To deactivate an active environment, use
#
# $ conda deactivate
Then ran `conda env update -f environment.yml`:
(base) joe@Gauss:~/SLM-Lab$ conda env update -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 4.8.1
latest version: 4.8.2
Please update conda by running
$ conda update -n base -c defaults conda
Ran pip subprocess with arguments:
['/home/joe/anaconda3/envs/lab/bin/python', '-m', 'pip', 'install', '-U', '-r', '/home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt']
Pip subprocess output:
Requirement already up-to-date: box2d-py==2.3.8 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 1)) (2.3.8)
Requirement already up-to-date: cloudpickle==0.5.2 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 2)) (0.5.2)
Requirement already up-to-date: colorlover==0.3.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 3)) (0.3.0)
Requirement already up-to-date: opencv-python==4.1.0.25 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 4)) (4.1.0.25)
Requirement already up-to-date: plotly==3.9.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (3.9.0)
Requirement already up-to-date: pyopengl==3.1.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 6)) (3.1.0)
Requirement already up-to-date: ray==0.7.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (0.7.0)
Requirement already up-to-date: redis==2.10.6 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 8)) (2.10.6)
Requirement already up-to-date: xvfbwrapper==0.2.9 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 9)) (0.2.9)
Requirement already up-to-date: gym==0.12.1 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 10)) (0.12.1)
Requirement already up-to-date: roboschool==1.0.46 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 14)) (1.0.46)
Requirement already up-to-date: atari-py in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from -r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 15)) (0.2.6)
Requirement already satisfied, skipping upgrade: numpy>=1.11.3 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from opencv-python==4.1.0.25->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 4)) (1.16.3)
Requirement already satisfied, skipping upgrade: six in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (1.14.0)
Requirement already satisfied, skipping upgrade: nbformat>=4.2 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (5.0.4)
Requirement already satisfied, skipping upgrade: pytz in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (2019.3)
Requirement already satisfied, skipping upgrade: requests in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (2.22.0)
Requirement already satisfied, skipping upgrade: retrying>=1.3.3 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (1.3.3)
Requirement already satisfied, skipping upgrade: decorator>=4.0.6 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (4.4.1)
Requirement already satisfied, skipping upgrade: filelock in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (3.0.12)
Requirement already satisfied, skipping upgrade: colorama in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (0.4.3)
Requirement already satisfied, skipping upgrade: typing in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (3.7.4.1)
Requirement already satisfied, skipping upgrade: click in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (7.0)
Requirement already satisfied, skipping upgrade: pyyaml in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (5.1.2)
Requirement already satisfied, skipping upgrade: pytest in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (4.5.0)
Requirement already satisfied, skipping upgrade: funcsigs in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (1.0.2)
Requirement already satisfied, skipping upgrade: flatbuffers in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (1.11)
Requirement already satisfied, skipping upgrade: pyglet>=1.2.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from gym==0.12.1->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 10)) (1.4.10)
Requirement already satisfied, skipping upgrade: scipy in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from gym==0.12.1->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 10)) (1.3.0)
Requirement already satisfied, skipping upgrade: jupyter-core in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from nbformat>=4.2->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (4.6.1)
Requirement already satisfied, skipping upgrade: ipython-genutils in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from nbformat>=4.2->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (0.2.0)
Requirement already satisfied, skipping upgrade: jsonschema!=2.5.0,>=2.4 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from nbformat>=4.2->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (3.2.0)
Requirement already satisfied, skipping upgrade: traitlets>=4.1 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from nbformat>=4.2->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (4.3.3)
Requirement already satisfied, skipping upgrade: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from requests->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (1.25.8)
Requirement already satisfied, skipping upgrade: idna<2.9,>=2.5 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from requests->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (2.8)
Requirement already satisfied, skipping upgrade: chardet<3.1.0,>=3.0.2 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from requests->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (3.0.4)
Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from requests->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (2019.11.28)
Requirement already satisfied, skipping upgrade: py>=1.5.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from pytest->ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (1.8.1)
Requirement already satisfied, skipping upgrade: setuptools in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from pytest->ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (45.1.0.post20200119)
Requirement already satisfied, skipping upgrade: attrs>=17.4.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from pytest->ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (19.3.0)
Requirement already satisfied, skipping upgrade: atomicwrites>=1.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from pytest->ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (1.3.0)
Requirement already satisfied, skipping upgrade: pluggy!=0.10,<1.0,>=0.9 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from pytest->ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (0.11.0)
Requirement already satisfied, skipping upgrade: wcwidth in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from pytest->ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (0.1.8)
Requirement already satisfied, skipping upgrade: more-itertools>=4.0.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from pytest->ray==0.7.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 7)) (8.2.0)
Requirement already satisfied, skipping upgrade: future in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from pyglet>=1.2.0->gym==0.12.1->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 10)) (0.18.2)
Requirement already satisfied, skipping upgrade: pyrsistent>=0.14.0 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.2->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (0.15.7)
Requirement already satisfied, skipping upgrade: importlib-metadata; python_version < "3.8" in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.2->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (1.5.0)
Requirement already satisfied, skipping upgrade: zipp>=0.5 in /home/joe/anaconda3/envs/lab/lib/python3.6/site-packages (from importlib-metadata; python_version < "3.8"->jsonschema!=2.5.0,>=2.4->nbformat>=4.2->plotly==3.9.0->-r /home/joe/SLM-Lab/condaenv.dfyo_84c.requirements.txt (line 5)) (2.1.0)
#
# To activate this environment, use
#
# $ conda activate lab
#
# To deactivate an active environment, use
#
# $ conda deactivate
And finally,
(base) joe@Gauss:~/SLM-Lab$ conda activate lab2
(lab2) joe@Gauss:~/SLM-Lab$ python run_lab.py slm_lab/spec/benchmark/reinforce/reinforce_cartpole.json reinforce_baseline_cartpole search
Traceback (most recent call last):
File "run_lab.py", line 8, in <module>
from slm_lab.experiment import search
File "/home/joe/SLM-Lab/slm_lab/experiment/search.py", line 2, in <module>
from slm_lab.lib import logger, util
File "/home/joe/SLM-Lab/slm_lab/lib/logger.py", line 1, in <module>
from slm_lab.lib import util
File "/home/joe/SLM-Lab/slm_lab/lib/util.py", line 7, in <module>
import cv2
ModuleNotFoundError: No module named 'cv2'
The `ModuleNotFoundError` indicates the conda environment was not actually set up. Indeed, your setup commands were run from within another conda environment, `base` (notice that your pip output above installed into `envs/lab`, not `lab2`). Conda can't handle nested environments. You'd need to clear that environment and retry from outside of the `base` environment (if `base` activates by default, you can edit your bash profile to prevent it).
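A quick way to see which environment a command actually runs in is to print the interpreter path (stdlib only; for a correctly activated `lab2` env the path should contain `envs/lab2`):

```python
import sys

# The interpreter path identifies the active conda env; a path containing
# envs/lab rather than envs/lab2 would explain cv2 landing in the wrong place.
print(sys.executable)
```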
Additionally, make sure the first line of your environment file says `name: lab2` instead of `name: lab`.
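The rename can also be scripted. A small sketch on a scratch file standing in for the real `environment.yml` (run the `sed` line against your actual file):

```shell
# demo on a scratch copy standing in for environment.yml
printf 'name: lab\nchannels:\n  - pytorch\n' > /tmp/environment.yml
# flip the env name the same way you would in the real file
sed -i 's/^name: lab$/name: lab2/' /tmp/environment.yml
head -n 1 /tmp/environment.yml
```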
Here's how you could reinstall:

```shell
# get out of the (base) environment
conda deactivate
conda env remove -n lab2
conda create -n lab2 python=3.6 -y
# with the updated environment.yml file specifying the new env lab2
conda env update -f environment.yml
conda activate lab2
# run your python commands
```
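After the reinstall, you can sanity-check the imports without launching the full lab (a minimal sketch; `json` is included only as an always-present control, the others are lab dependencies):

```python
import importlib.util

# find_spec returns None when an import would fail -- which is exactly
# how the cv2 ModuleNotFoundError surfaced here.
status = {mod: importlib.util.find_spec(mod) is not None
          for mod in ("cv2", "torch", "ray", "json")}
for mod, ok in status.items():
    print(mod, "ok" if ok else "MISSING")
```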
This is getting old. I removed SLM-Lab, ran `conda deactivate` to get out of the `(base)` environment, and installed SLM-Lab again. I didn't modify any specs. I followed all the instructions, including naming the environment file `name: lab2`, and this is what I got...
(lab2) joe@Gauss:~/SLM-Lab$ python run_lab.py slm_lab/spec/benchmark/reinforce/reinforce_cartpole.json reinforce_baseline_cartpole search
[2020-02-05 15:50:21,639 PID:21541 INFO run_lab.py read_spec_and_run] Running lab spec_file:slm_lab/spec/benchmark/reinforce/reinforce_cartpole.json spec_name:reinforce_baseline_cartpole in mode:search
[2020-02-05 15:50:21,645 PID:21541 INFO search.py run_ray_search] Running ray search for spec reinforce_baseline_cartpole
2020-02-05 15:50:21,645 WARNING worker.py:1341 -- WARNING: Not updating worker name since `setproctitle` is not installed. Install this with `pip install setproctitle` (or ray[debug]) to enable monitoring of worker processes.
2020-02-05 15:50:21,645 INFO node.py:497 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2020-02-05_15-50-21_645705_21541/logs.
2020-02-05 15:50:21,749 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:50454 to respond...
2020-02-05 15:50:21,856 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:23567 to respond...
2020-02-05 15:50:21,861 INFO services.py:806 -- Starting Redis shard with 3.35 GB max memory.
2020-02-05 15:50:21,886 INFO node.py:511 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2020-02-05_15-50-21_645705_21541/logs.
2020-02-05 15:50:21,886 INFO services.py:1441 -- Starting the Plasma object store with 5.02 GB memory using /dev/shm.
2020-02-05 15:50:21,978 INFO tune.py:60 -- Tip: to resume incomplete experiments, pass resume='prompt' or resume=True to run()
2020-02-05 15:50:21,978 INFO tune.py:223 -- Starting a new experiment.
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/1 GPUs
Memory usage on this node: 2.3/16.7 GB
2020-02-05 15:50:22,003 WARNING logger.py:130 -- Couldn't import TensorFlow - disabling TensorBoard logging.
2020-02-05 15:50:22,004 WARNING logger.py:224 -- Could not instantiate <class 'ray.tune.logger.TFLogger'> - skipping.
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 4/8 CPUs, 0/1 GPUs
Memory usage on this node: 2.4/16.7 GB
Result logdir: /home/joe/ray_results/reinforce_baseline_cartpole
Number of trials: 2 ({'RUNNING': 1, 'PENDING': 1})
PENDING trials:
- ray_trainable_1_agent.0.algorithm.center_return=False,trial_index=1: PENDING
RUNNING trials:
- ray_trainable_0_agent.0.algorithm.center_return=True,trial_index=0: RUNNING
2020-02-05 15:50:22,031 WARNING logger.py:130 -- Couldn't import TensorFlow - disabling TensorBoard logging.
2020-02-05 15:50:22,031 WARNING logger.py:224 -- Could not instantiate <class 'ray.tune.logger.TFLogger'> - skipping.
(pid=21574) [2020-02-05 15:50:22,860 PID:21574 INFO logger.py info] Running sessions
(pid=21574) [2020-02-05 15:50:22,904 PID:21649 INFO openai.py __init__] OpenAIEnv:
(pid=21574) - env_spec = {'max_frame': 100000, 'max_t': None, 'name': 'CartPole-v0'}
(pid=21574) - eval_frequency = 2000
(pid=21574) - log_frequency = 10000
(pid=21574) - frame_op = None
(pid=21574) - frame_op_len = None
(pid=21574) - image_downsize = (84, 84)
(pid=21574) - normalize_state = False
(pid=21574) - reward_scale = None
(pid=21574) - num_envs = 1
(pid=21574) - name = CartPole-v0
(pid=21574) - max_t = 200
(pid=21574) - max_frame = 100000
(pid=21574) - to_render = False
(pid=21574) - is_venv = False
(pid=21574) - clock_speed = 1
(pid=21574) - clock = <slm_lab.env.base.Clock object at 0x7fd478962358>
(pid=21574) - done = False
(pid=21574) - total_reward = nan
(pid=21574) - u_env = <TrackReward<TimeLimit<CartPoleEnv<CartPole-v0>>>>
(pid=21574) - observation_space = Box(4,)
(pid=21574) - action_space = Discrete(2)
(pid=21574) - observable_dim = {'state': 4}
(pid=21574) - action_dim = 2
(pid=21574) - is_discrete = True
(pid=21574) [2020-02-05 15:50:22,905 PID:21647 INFO openai.py __init__][2020-02-05 15:50:22,905 PID:21653 INFO openai.py __init__] OpenAIEnv:
(pid=21574) - env_spec = {'max_frame': 100000, 'max_t': None, 'name': 'CartPole-v0'}
(pid=21574) - eval_frequency = 2000
(pid=21574) - log_frequency = 10000
(pid=21574) - frame_op = None
(pid=21574) - frame_op_len = None
(pid=21574) - image_downsize = (84, 84)
(pid=21574) - normalize_state = False
(pid=21574) - reward_scale = None
(pid=21574) - num_envs = 1
(pid=21574) - name = CartPole-v0
(pid=21574) - max_t = 200
(pid=21574) - max_frame = 100000
(pid=21574) - to_render = False
(pid=21574) - is_venv = False
(pid=21574) - clock_speed = 1
(pid=21574) - clock = <slm_lab.env.base.Clock object at 0x7fd478962358>
(pid=21574) - done = False
(pid=21574) - total_reward = nan
(pid=21574) - u_env = <TrackReward<TimeLimit<CartPoleEnv<CartPole-v0>>>>
(pid=21574) - observation_space = Box(4,)
(pid=21574) - action_space = Discrete(2)
(pid=21574) - observable_dim = {'state': 4}
(pid=21574) - action_dim = 2
(pid=21574) - is_discrete = True OpenAIEnv:
(pid=21574) - env_spec = {'max_frame': 100000, 'max_t': None, 'name': 'CartPole-v0'}
(pid=21574) - eval_frequency = 2000
(pid=21574) - log_frequency = 10000
(pid=21574) - frame_op = None
(pid=21574) - frame_op_len = None
(pid=21574) - image_downsize = (84, 84)
(pid=21574) - normalize_state = False
(pid=21574) - reward_scale = None
(pid=21574) - num_envs = 1
(pid=21574) - name = CartPole-v0
(pid=21574) - max_t = 200
(pid=21574) - max_frame = 100000
(pid=21574) - to_render = False
(pid=21574) - is_venv = False
(pid=21574) - clock_speed = 1
(pid=21574) - clock = <slm_lab.env.base.Clock object at 0x7fd478962358>
(pid=21574) - done = False
(pid=21574) - total_reward = nan
(pid=21574) - u_env = <TrackReward<TimeLimit<CartPoleEnv<CartPole-v0>>>>
(pid=21574) - observation_space = Box(4,)
(pid=21574) - action_space = Discrete(2)
(pid=21574) - observable_dim = {'state': 4}
(pid=21574) - action_dim = 2
(pid=21574) - is_discrete = True
(pid=21574)
(pid=21574) [2020-02-05 15:50:22,906 PID:21651 INFO openai.py __init__] OpenAIEnv:
(pid=21574) - env_spec = {'max_frame': 100000, 'max_t': None, 'name': 'CartPole-v0'}
(pid=21574) - eval_frequency = 2000
(pid=21574) - log_frequency = 10000
(pid=21574) - frame_op = None
(pid=21574) - frame_op_len = None
(pid=21574) - image_downsize = (84, 84)
(pid=21574) - normalize_state = False
(pid=21574) - reward_scale = None
(pid=21574) - num_envs = 1
(pid=21574) - name = CartPole-v0
(pid=21574) - max_t = 200
(pid=21574) - max_frame = 100000
(pid=21574) - to_render = False
(pid=21574) - is_venv = False
(pid=21574) - clock_speed = 1
(pid=21574) - clock = <slm_lab.env.base.Clock object at 0x7fd478962358>
(pid=21574) - done = False
(pid=21574) - total_reward = nan
(pid=21574) - u_env = <TrackReward<TimeLimit<CartPoleEnv<CartPole-v0>>>>
(pid=21574) - observation_space = Box(4,)
(pid=21574) - action_space = Discrete(2)
(pid=21574) - observable_dim = {'state': 4}
(pid=21574) - action_dim = 2
(pid=21576) [2020-02-05 15:50:22,905 PID:21576 INFO logger.py info] Running sessions
(pid=21574) - is_discrete = True
(pid=21574) [2020-02-05 15:50:22,928 PID:21647 INFO base.py post_init_nets][2020-02-05 15:50:22,928 PID:21653 INFO base.py post_init_nets][2020-02-05 15:50:22,928 PID:21651 INFO base.py post_init_nets][2020-02-05 15:50:22,928 PID:21649 INFO base.py post_init_nets] Initialized algorithm models for lab_mode: search Initialized algorithm models for lab_mode: search Initialized algorithm models for lab_mode: search Initialized algorithm models for lab_mode: search
(pid=21574)
(pid=21574)
(pid=21574)
(pid=21574) [2020-02-05 15:50:22,933 PID:21653 INFO base.py __init__][2020-02-05 15:50:22,933 PID:21651 INFO base.py __init__] Reinforce:
(pid=21574) - agent = <slm_lab.agent.Agent object at 0x7fd3401bd3c8>
(pid=21574) - algorithm_spec = {'action_pdtype': 'default',
(pid=21574) 'action_policy': 'default',
(pid=21574) 'center_return': True,
(pid=21574) 'entropy_coef_spec': {'end_step': 20000,
(pid=21574) 'end_val': 0.001,
(pid=21574) 'name': 'linear_decay',
(pid=21574) 'start_step': 0,
(pid=21574) 'start_val': 0.01},
(pid=21574) 'explore_var_spec': None,
(pid=21574) 'gamma': 0.99,
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'training_frequency': 1}
(pid=21574) - name = Reinforce
(pid=21574) - memory_spec = {'name': 'OnPolicyReplay'}
(pid=21574) - net_spec = {'clip_grad_val': None,
(pid=21574) 'hid_layers': [64],
(pid=21574) 'hid_layers_activation': 'selu',
(pid=21574) 'loss_spec': {'name': 'MSELoss'},
(pid=21574) 'lr_scheduler_spec': None,
(pid=21574) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21574) 'type': 'MLPNet'}
(pid=21574) - body = body: {
(pid=21574) "agent": "<slm_lab.agent.Agent object at 0x7fd3401bd3c8>",
(pid=21574) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>",
(pid=21574) "a": 0,
(pid=21574) "e": 0,
(pid=21574) "b": 0,
(pid=21574) "aeb": "(0, 0, 0)",
(pid=21574) "explore_var": NaN,
(pid=21574) "entropy_coef": 0.01,
(pid=21574) "loss": NaN,
(pid=21574) "mean_entropy": NaN,
(pid=21574) "mean_grad_norm": NaN,
(pid=21574) "best_total_reward_ma": -Infinity,
(pid=21574) "total_reward_ma": NaN,
(pid=21574) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7fd34019ce80>",
(pid=21574) "tb_actions": [],
(pid=21574) "tb_tracker": {},
(pid=21574) "observation_space": "Box(4,)",
(pid=21574) "action_space": "Discrete(2)",
(pid=21574) "observable_dim": {
(pid=21574) "state": 4
(pid=21574) },
(pid=21574) "state_dim": 4,
(pid=21574) "action_dim": 2,
(pid=21574) "is_discrete": true,
(pid=21574) "action_type": "discrete",
(pid=21574) "action_pdtype": "Categorical",
(pid=21574) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21574) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7fd3401bd470>"
(pid=21574) }
(pid=21574) - action_pdtype = default
(pid=21574) - action_policy = <function default at 0x7fd34cb83f28>
(pid=21574) - center_return = True
(pid=21574) - explore_var_spec = None
(pid=21574) - entropy_coef_spec = {'end_step': 20000,
(pid=21574) 'end_val': 0.001,
(pid=21574) 'name': 'linear_decay',
(pid=21574) 'start_step': 0,
(pid=21574) 'start_val': 0.01}
(pid=21574) - policy_loss_coef = 1.0
(pid=21574) - gamma = 0.99
(pid=21574) - training_frequency = 1
(pid=21574) - to_train = 0
(pid=21574) - explore_var_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7fd3401bd438>
(pid=21574) - entropy_coef_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7fd3401bd518>
(pid=21574) - net = MLPNet(
(pid=21574) (model): Sequential(
(pid=21574) (0): Linear(in_features=4, out_features=64, bias=True)
(pid=21574) (1): SELU()
(pid=21574) )
(pid=21574) (model_tail): Sequential(
(pid=21574) (0): Linear(in_features=64, out_features=2, bias=True)
(pid=21574) )
(pid=21574) (loss_fn): MSELoss()
(pid=21574) )
(pid=21574) - net_names = ['net']
(pid=21574) - optim = Adam (
(pid=21574) Parameter Group 0
(pid=21574) amsgrad: False
(pid=21574) betas: (0.9, 0.999)
(pid=21574) eps: 1e-08
(pid=21574) lr: 0.002
(pid=21574) weight_decay: 0
(pid=21574) )
(pid=21574) - lr_scheduler = <slm_lab.agent.net.net_util.NoOpLRScheduler object at 0x7fd3401bd7f0>
(pid=21574) - global_net = None
(pid=21574) [2020-02-05 15:50:22,933 PID:21649 INFO base.py __init__] Reinforce:
(pid=21574) - agent = <slm_lab.agent.Agent object at 0x7fd34013b400>
(pid=21574) - algorithm_spec = {'action_pdtype': 'default',
(pid=21574) 'action_policy': 'default',
(pid=21574) 'center_return': True,
(pid=21576) [2020-02-05 15:50:22,928 PID:21664 INFO openai.py __init__] OpenAIEnv:
(pid=21576) - env_spec = {'max_frame': 100000, 'max_t': None, 'name': 'CartPole-v0'}
(pid=21576) - eval_frequency = 2000
(pid=21576) - log_frequency = 10000
(pid=21576) - frame_op = None
(pid=21576) - frame_op_len = None
(pid=21576) - image_downsize = (84, 84)
(pid=21576) - normalize_state = False
(pid=21576) - reward_scale = None
(pid=21576) - num_envs = 1
(pid=21576) - name = CartPole-v0
(pid=21576) - max_t = 200
(pid=21576) - max_frame = 100000
(pid=21576) - to_render = False
(pid=21576) - is_venv = False
(pid=21576) - clock_speed = 1
(pid=21576) - clock = <slm_lab.env.base.Clock object at 0x7f7535f60320>
(pid=21576) - done = False
(pid=21576) - total_reward = nan
(pid=21576) - u_env = <TrackReward<TimeLimit<CartPoleEnv<CartPole-v0>>>>
(pid=21576) - observation_space = Box(4,)
(pid=21576) - action_space = Discrete(2)
(pid=21576) - observable_dim = {'state': 4}
(pid=21576) - action_dim = 2
(pid=21576) - is_discrete = True
(pid=21574) 'entropy_coef_spec': {'end_step': 20000,
(pid=21574) 'end_val': 0.001,
(pid=21574) 'name': 'linear_decay',
(pid=21574) 'start_step': 0,
(pid=21574) 'start_val': 0.01},
(pid=21574) 'explore_var_spec': None,
(pid=21574) 'gamma': 0.99,
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'training_frequency': 1}
(pid=21574) - name = Reinforce
(pid=21574) - memory_spec = {'name': 'OnPolicyReplay'}
(pid=21574) - net_spec = {'clip_grad_val': None,
(pid=21574) 'hid_layers': [64],
(pid=21574) 'hid_layers_activation': 'selu',
(pid=21574) 'loss_spec': {'name': 'MSELoss'},
(pid=21574) 'lr_scheduler_spec': None,
(pid=21574) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21574) 'type': 'MLPNet'}
(pid=21574) - body = body: {
(pid=21574) "agent": "<slm_lab.agent.Agent object at 0x7fd34013b400>",
(pid=21574) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>",
(pid=21574) "a": 0,
(pid=21574) "e": 0,
(pid=21574) "b": 0,
(pid=21574) "aeb": "(0, 0, 0)",
(pid=21574) "explore_var": NaN,
(pid=21574) "entropy_coef": 0.01,
(pid=21574) "loss": NaN,
(pid=21574) "mean_entropy": NaN,
(pid=21574) "mean_grad_norm": NaN,
(pid=21574) "best_total_reward_ma": -Infinity,
(pid=21574) "total_reward_ma": NaN,
(pid=21574) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7fd34011beb8>",
(pid=21574) "tb_actions": [],
(pid=21574) "tb_tracker": {},
(pid=21574) "observation_space": "Box(4,)",
(pid=21574) "action_space": "Discrete(2)",
(pid=21574) "observable_dim": {
(pid=21574) "state": 4
(pid=21574) },
(pid=21574) "state_dim": 4,
(pid=21574) "action_dim": 2,
(pid=21574) "is_discrete": true,
(pid=21574) "action_type": "discrete",
(pid=21574) "action_pdtype": "Categorical",
(pid=21574) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21574) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7fd34013b4a8>"
(pid=21574) }
(pid=21574) - action_pdtype = default
(pid=21574) - action_policy = <function default at 0x7fd34cb83f28>
(pid=21574) - center_return = True
(pid=21574) - explore_var_spec = None
(pid=21574) - entropy_coef_spec = {'end_step': 20000,
(pid=21574) 'end_val': 0.001,
(pid=21574) 'name': 'linear_decay',
(pid=21574) 'start_step': 0,
(pid=21574) 'start_val': 0.01}
(pid=21574) - policy_loss_coef = 1.0
(pid=21574) - gamma = 0.99
(pid=21574) - training_frequency = 1
(pid=21574) - to_train = 0
(pid=21574) - explore_var_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7fd34013b470>
(pid=21574) - entropy_coef_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7fd34013b550>
(pid=21574) - net = MLPNet(
(pid=21574) (model): Sequential(
(pid=21574) (0): Linear(in_features=4, out_features=64, bias=True)
(pid=21574) (1): SELU()
(pid=21574) )
(pid=21574) (model_tail): Sequential(
(pid=21574) (0): Linear(in_features=64, out_features=2, bias=True)
(pid=21574) )
(pid=21574) (loss_fn): MSELoss()
(pid=21574) )
(pid=21574) - net_names = ['net']
(pid=21574) - optim = Adam (
(pid=21574) Parameter Group 0
(pid=21574) amsgrad: False
(pid=21574) betas: (0.9, 0.999)
(pid=21574) eps: 1e-08
(pid=21574) lr: 0.002
(pid=21574) weight_decay: 0
(pid=21574) )
(pid=21574) - lr_scheduler = <slm_lab.agent.net.net_util.NoOpLRScheduler object at 0x7fd34013b828>
(pid=21574) - global_net = None
(pid=21576) [2020-02-05 15:50:22,948 PID:21664 INFO base.py post_init_nets] Initialized algorithm models for lab_mode: search
(pid=21576) [2020-02-05 15:50:22,948 PID:21666 INFO base.py post_init_nets] Initialized algorithm models for lab_mode: search
(pid=21576) [2020-02-05 15:50:22,949 PID:21668 INFO base.py post_init_nets] Initialized algorithm models for lab_mode: search
(pid=21576) [2020-02-05 15:50:22,951 PID:21670 INFO base.py post_init_nets] Initialized algorithm models for lab_mode: search
(pid=21576) [2020-02-05 15:50:22,953 PID:21664 INFO base.py __init__] Reinforce:
(pid=21576) - agent = <slm_lab.agent.Agent object at 0x7f73ed7a01d0>
(pid=21576) - algorithm_spec = {'action_pdtype': 'default',
(pid=21576) 'action_policy': 'default',
(pid=21576) 'center_return': False,
(pid=21576) 'entropy_coef_spec': {'end_step': 20000,
(pid=21576) 'end_val': 0.001,
(pid=21576) 'name': 'linear_decay',
(pid=21576) 'start_step': 0,
(pid=21576) 'start_val': 0.01},
(pid=21576) 'explore_var_spec': None,
(pid=21576) 'gamma': 0.99,
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'training_frequency': 1}
(pid=21576) - name = Reinforce
(pid=21576) - memory_spec = {'name': 'OnPolicyReplay'}
(pid=21576) - net_spec = {'clip_grad_val': None,
(pid=21576) 'hid_layers': [64],
(pid=21576) 'hid_layers_activation': 'selu',
(pid=21576) 'loss_spec': {'name': 'MSELoss'},
(pid=21576) 'lr_scheduler_spec': None,
(pid=21576) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21576) 'type': 'MLPNet'}
(pid=21576) - body = body: {
(pid=21576) "agent": "<slm_lab.agent.Agent object at 0x7f73ed7a01d0>",
(pid=21576) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>",
(pid=21576) "a": 0,
(pid=21576) "e": 0,
(pid=21576) "b": 0,
(pid=21576) "aeb": "(0, 0, 0)",
(pid=21576) "explore_var": NaN,
(pid=21576) "entropy_coef": 0.01,
(pid=21576) "loss": NaN,
(pid=21576) "mean_entropy": NaN,
(pid=21576) "mean_grad_norm": NaN,
(pid=21576) "best_total_reward_ma": -Infinity,
(pid=21576) "total_reward_ma": NaN,
(pid=21576) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7f73ed779e10>",
(pid=21576) "tb_actions": [],
(pid=21576) "tb_tracker": {},
(pid=21576) "observation_space": "Box(4,)",
(pid=21576) "action_space": "Discrete(2)",
(pid=21576) "observable_dim": {
(pid=21576) "state": 4
(pid=21576) },
(pid=21576) "state_dim": 4,
(pid=21576) "action_dim": 2,
(pid=21576) "is_discrete": true,
(pid=21576) "action_type": "discrete",
(pid=21576) "action_pdtype": "Categorical",
(pid=21576) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21576) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7f73ed7a0278>"
(pid=21576) }
(pid=21576) - action_pdtype = default
(pid=21576) - action_policy = <function default at 0x7f740a17ff28>
(pid=21576) - center_return = False
(pid=21576) - explore_var_spec = None
(pid=21576) - entropy_coef_spec = {'end_step': 20000,
(pid=21576) 'end_val': 0.001,
(pid=21576) 'name': 'linear_decay',
(pid=21576) 'start_step': 0,
(pid=21576) 'start_val': 0.01}
(pid=21576) - policy_loss_coef = 1.0
(pid=21576) - gamma = 0.99
(pid=21576) - training_frequency = 1
(pid=21576) - to_train = 0
(pid=21576) - explore_var_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7f73ed7a0240>
(pid=21576) - entropy_coef_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7f73ed7a0320>
(pid=21576) - net = MLPNet(
(pid=21576) (model): Sequential(
(pid=21576) (0): Linear(in_features=4, out_features=64, bias=True)
(pid=21576) (1): SELU()
(pid=21576) )
(pid=21576) (model_tail): Sequential(
(pid=21576) (0): Linear(in_features=64, out_features=2, bias=True)
(pid=21576) )
(pid=21576) (loss_fn): MSELoss()
(pid=21576) )
(pid=21576) - net_names = ['net']
(pid=21576) - optim = Adam (
(pid=21576) Parameter Group 0
(pid=21576) amsgrad: False
(pid=21576) betas: (0.9, 0.999)
(pid=21576) eps: 1e-08
(pid=21576) lr: 0.002
(pid=21576) weight_decay: 0
(pid=21576) )
(pid=21576) - lr_scheduler = <slm_lab.agent.net.net_util.NoOpLRScheduler object at 0x7f73ed7a05f8>
(pid=21576) - global_net = None
(pid=21576) [2020-02-05 15:50:22,954 PID:21664 INFO __init__.py __init__] Agent:
(pid=21576) - spec = reinforce_baseline_cartpole
(pid=21576) - agent_spec = {'algorithm': {'action_pdtype': 'default',
(pid=21576) 'action_policy': 'default',
(pid=21576) 'center_return': False,
(pid=21576) 'entropy_coef_spec': {'end_step': 20000,
(pid=21576) 'end_val': 0.001,
(pid=21576) 'name': 'linear_decay',
(pid=21576) 'start_step': 0,
(pid=21576) 'start_val': 0.01},
(pid=21576) 'explore_var_spec': None,
(pid=21576) 'gamma': 0.99,
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'training_frequency': 1},
(pid=21574) [2020-02-05 15:50:22,935 PID:21653 INFO __init__.py __init__]
(pid=21574) [2020-02-05 15:50:22,935 PID:21649 INFO __init__.py __init__] Agent:
(pid=21574) - spec = reinforce_baseline_cartpole
(pid=21574) - agent_spec = {'algorithm': {'action_pdtype': 'default',
(pid=21574) 'action_policy': 'default',
(pid=21574) 'center_return': True,
(pid=21574) 'entropy_coef_spec': {'end_step': 20000,
(pid=21574) 'end_val': 0.001,
(pid=21574) 'name': 'linear_decay',
(pid=21574) 'start_step': 0,
(pid=21574) 'start_val': 0.01},
(pid=21574) 'explore_var_spec': None,
(pid=21574) 'gamma': 0.99,
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'training_frequency': 1},
(pid=21574) 'memory': {'name': 'OnPolicyReplay'},
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'net': {'clip_grad_val': None,
(pid=21574) 'hid_layers': [64],
(pid=21574) 'hid_layers_activation': 'selu',
(pid=21574) 'loss_spec': {'name': 'MSELoss'},
(pid=21574) 'lr_scheduler_spec': None,
(pid=21574) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21574) 'type': 'MLPNet'}}
(pid=21574) - name = Reinforce
(pid=21574) - body = body: {
(pid=21574) "agent": "<slm_lab.agent.Agent object at 0x7fd3401bd3c8>",
(pid=21574) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>",
(pid=21574) "a": 0,
(pid=21574) "e": 0,
(pid=21574) "b": 0,
(pid=21574) "aeb": "(0, 0, 0)",
(pid=21574) "explore_var": NaN,
(pid=21576) 'memory': {'name': 'OnPolicyReplay'},
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'net': {'clip_grad_val': None,
(pid=21576) 'hid_layers': [64],
(pid=21576) 'hid_layers_activation': 'selu',
(pid=21576) 'loss_spec': {'name': 'MSELoss'},
(pid=21576) 'lr_scheduler_spec': None,
(pid=21576) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21576) 'type': 'MLPNet'}}
(pid=21576) - name = Reinforce
(pid=21576) - body = body: {
(pid=21576) "agent": "<slm_lab.agent.Agent object at 0x7f73ed7a01d0>",
(pid=21576) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>",
(pid=21576) "a": 0,
(pid=21576) "e": 0,
(pid=21576) "b": 0,
(pid=21576) "aeb": "(0, 0, 0)",
(pid=21576) "explore_var": NaN,
(pid=21576) "entropy_coef": 0.01,
(pid=21576) "loss": NaN,
(pid=21576) "mean_entropy": NaN,
(pid=21576) "mean_grad_norm": NaN,
(pid=21576) "best_total_reward_ma": -Infinity,
(pid=21576) "total_reward_ma": NaN,
(pid=21576) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7f73ed779e10>",
(pid=21576) "tb_actions": [],
(pid=21576) "tb_tracker": {},
(pid=21576) "observation_space": "Box(4,)",
(pid=21576) "action_space": "Discrete(2)",
(pid=21576) "observable_dim": {
(pid=21576) "state": 4
(pid=21576) },
(pid=21576) "state_dim": 4,
(pid=21576) "action_dim": 2,
(pid=21576) "is_discrete": true,
(pid=21576) "action_type": "discrete",
(pid=21576) "action_pdtype": "Categorical",
(pid=21576) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21576) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7f73ed7a0278>"
(pid=21576) }
(pid=21576) - algorithm = <slm_lab.agent.algorithm.reinforce.Reinforce object at 0x7f73ed7a0208>
(pid=21574) "entropy_coef": 0.01,
(pid=21574) "loss": NaN,
(pid=21574) "mean_entropy": NaN,
(pid=21574) "mean_grad_norm": NaN,
(pid=21574) "best_total_reward_ma": -Infinity,
(pid=21574) "total_reward_ma": NaN,
(pid=21574) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7fd34019ce80>",
(pid=21574) "tb_actions": [],
(pid=21574) "tb_tracker": {},
(pid=21574) "observation_space": "Box(4,)",
(pid=21574) "action_space": "Discrete(2)",
(pid=21574) "observable_dim": {
(pid=21574) "state": 4
(pid=21574) },
(pid=21574) "state_dim": 4,
(pid=21574) "action_dim": 2,
(pid=21574) "is_discrete": true,
(pid=21574) "action_type": "discrete",
(pid=21574) "action_pdtype": "Categorical",
(pid=21574) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21574) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7fd3401bd470>"
(pid=21574) }
(pid=21574) - algorithm = <slm_lab.agent.algorithm.reinforce.Reinforce object at 0x7fd3401bd400>
(pid=21574) Agent:
(pid=21574) - spec = reinforce_baseline_cartpole
(pid=21574) - agent_spec = {'algorithm': {'action_pdtype': 'default',
(pid=21574) 'action_policy': 'default',
(pid=21574) 'center_return': True,
(pid=21574) 'entropy_coef_spec': {'end_step': 20000,
(pid=21574) 'end_val': 0.001,
(pid=21574) 'name': 'linear_decay',
(pid=21574) 'start_step': 0,
(pid=21574) 'start_val': 0.01},
(pid=21574) 'explore_var_spec': None,
(pid=21574) 'gamma': 0.99,
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'training_frequency': 1},
(pid=21574) 'memory': {'name': 'OnPolicyReplay'},
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'net': {'clip_grad_val': None,
(pid=21574) 'hid_layers': [64],
(pid=21574) 'hid_layers_activation': 'selu',
(pid=21574) 'loss_spec': {'name': 'MSELoss'},
(pid=21574) 'lr_scheduler_spec': None,
(pid=21574) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21574) 'type': 'MLPNet'}}
(pid=21574) - name = Reinforce
(pid=21574) - body = body: {
(pid=21574) "agent": "<slm_lab.agent.Agent object at 0x7fd34013a438>",
(pid=21574) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>",
(pid=21574) "a": 0,
(pid=21574) "e": 0,
(pid=21574) "b": 0,
(pid=21574) "aeb": "(0, 0, 0)",
(pid=21574) "explore_var": NaN,
(pid=21574) "entropy_coef": 0.01,
(pid=21574) "loss": NaN,
(pid=21574) "mean_entropy": NaN,
(pid=21574) "mean_grad_norm": NaN,
(pid=21574) "best_total_reward_ma": -Infinity,
(pid=21574) "total_reward_ma": NaN,
(pid=21574) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7fd34011aef0>",
(pid=21574) "tb_actions": [],
(pid=21574) "tb_tracker": {},
(pid=21574) "observation_space": "Box(4,)",
(pid=21574) "action_space": "Discrete(2)",
(pid=21574) "observable_dim": {
(pid=21574) "state": 4
(pid=21574) },
(pid=21574) "state_dim": 4,
(pid=21574) "action_dim": 2,
(pid=21574) "is_discrete": true,
(pid=21574) "action_type": "discrete",
(pid=21574) "action_pdtype": "Categorical",
(pid=21574) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21574) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7fd34013a4e0>"
(pid=21574) }
(pid=21574) - algorithm = <slm_lab.agent.algorithm.reinforce.Reinforce object at 0x7fd34013a470>
(pid=21574) [2020-02-05 15:50:22,936 PID:21647 INFO __init__.py __init__][2020-02-05 15:50:22,936 PID:21651 INFO __init__.py __init__] Agent:
(pid=21574) - spec = reinforce_baseline_cartpole
(pid=21574) - agent_spec = {'algorithm': {'action_pdtype': 'default',
(pid=21574) 'action_policy': 'default',
(pid=21574) 'center_return': True,
(pid=21574) 'entropy_coef_spec': {'end_step': 20000,
(pid=21574) 'end_val': 0.001,
(pid=21574) 'name': 'linear_decay',
(pid=21574) 'start_step': 0,
(pid=21574) 'start_val': 0.01},
(pid=21574) 'explore_var_spec': None,
(pid=21574) 'gamma': 0.99,
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'training_frequency': 1},
(pid=21574) 'memory': {'name': 'OnPolicyReplay'},
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'net': {'clip_grad_val': None,
(pid=21574) 'hid_layers': [64],
(pid=21576) - center_return = False
(pid=21576) - explore_var_spec = None
(pid=21576) - entropy_coef_spec = {'end_step': 20000,
(pid=21576) 'end_val': 0.001,
(pid=21576) 'name': 'linear_decay',
(pid=21576) 'start_step': 0,
(pid=21576) 'start_val': 0.01}
(pid=21576) - policy_loss_coef = 1.0
(pid=21576) - gamma = 0.99
(pid=21576) - training_frequency = 1
(pid=21576) - to_train = 0
(pid=21576) - explore_var_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7f73ed79c390>
(pid=21576) - entropy_coef_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7f73ed79c470>
(pid=21576) - net = MLPNet(
(pid=21576) (model): Sequential(
(pid=21576) (0): Linear(in_features=4, out_features=64, bias=True)
(pid=21576) (1): SELU()
(pid=21576) )
(pid=21576) (model_tail): Sequential(
(pid=21576) (0): Linear(in_features=64, out_features=2, bias=True)
(pid=21576) )
(pid=21576) (loss_fn): MSELoss()
(pid=21576) )
(pid=21576) - net_names = ['net']
(pid=21576) - optim = Adam (
(pid=21576) Parameter Group 0
(pid=21576) amsgrad: False
(pid=21576) betas: (0.9, 0.999)
(pid=21576) eps: 1e-08
(pid=21576) lr: 0.002
(pid=21576) weight_decay: 0
(pid=21576) )
(pid=21576) - lr_scheduler = <slm_lab.agent.net.net_util.NoOpLRScheduler object at 0x7f73ed79c748>
(pid=21576) - global_net = None
(pid=21576) [2020-02-05 15:50:22,955 PID:21664 INFO logger.py info] Session:
(pid=21576) - spec = reinforce_baseline_cartpole
(pid=21576) - index = 0
(pid=21576) - agent = <slm_lab.agent.Agent object at 0x7f73ed7a01d0>
(pid=21576) - env = <slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>
(pid=21576) - eval_env = <slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>
(pid=21576) [2020-02-05 15:50:22,955 PID:21664 INFO logger.py info] Running RL loop for trial 1 session 0
(pid=21576) [2020-02-05 15:50:22,956 PID:21666 INFO __init__.py __init__] Agent:
(pid=21576) - spec = reinforce_baseline_cartpole
(pid=21576) - agent_spec = {'algorithm': {'action_pdtype': 'default',
(pid=21576) 'action_policy': 'default',
(pid=21576) 'center_return': False,
(pid=21576) 'entropy_coef_spec': {'end_step': 20000,
(pid=21576) 'end_val': 0.001,
(pid=21576) 'name': 'linear_decay',
(pid=21576) 'start_step': 0,
(pid=21576) 'start_val': 0.01},
(pid=21576) 'explore_var_spec': None,
(pid=21576) 'gamma': 0.99,
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'training_frequency': 1},
(pid=21576) 'memory': {'name': 'OnPolicyReplay'},
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'net': {'clip_grad_val': None,
(pid=21576) 'hid_layers': [64],
(pid=21576) 'hid_layers_activation': 'selu',
(pid=21576) 'loss_spec': {'name': 'MSELoss'},
(pid=21576) 'lr_scheduler_spec': None,
(pid=21576) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21576) 'type': 'MLPNet'}}
(pid=21576) - name = Reinforce
(pid=21576) - body = body: {
(pid=21576) "agent": "<slm_lab.agent.Agent object at 0x7f73ed7a0278>",
(pid=21576) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>",
(pid=21576) "a": 0,
(pid=21576) "e": 0,
(pid=21576) "b": 0,
(pid=21576) "aeb": "(0, 0, 0)",
(pid=21576) "explore_var": NaN,
(pid=21576) "entropy_coef": 0.01,
(pid=21576) "loss": NaN,
(pid=21576) "mean_entropy": NaN,
(pid=21576) "mean_grad_norm": NaN,
(pid=21576) "best_total_reward_ma": -Infinity,
(pid=21576) "total_reward_ma": NaN,
(pid=21576) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7f73ed77aeb8>",
(pid=21576) "tb_actions": [],
(pid=21576) "tb_tracker": {},
(pid=21576) "observation_space": "Box(4,)",
(pid=21576) "action_space": "Discrete(2)",
(pid=21576) "observable_dim": {
(pid=21576) "state": 4
(pid=21576) },
(pid=21576) "state_dim": 4,
(pid=21576) "action_dim": 2,
(pid=21576) "is_discrete": true,
(pid=21576) "action_type": "discrete",
(pid=21576) "action_pdtype": "Categorical",
(pid=21576) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21576) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7f73ed7a0320>"
(pid=21576) }
(pid=21576) - algorithm = <slm_lab.agent.algorithm.reinforce.Reinforce object at 0x7f73ed7a02b0>
(pid=21576) [2020-02-05 15:50:22,956 PID:21670 INFO base.py __init__][2020-02-05 15:50:22,956 PID:21666 INFO logger.py info] Session:
(pid=21576) - spec = reinforce_baseline_cartpole
(pid=21574) 'hid_layers_activation': 'selu',
(pid=21574) 'loss_spec': {'name': 'MSELoss'},
(pid=21574) 'lr_scheduler_spec': None,
(pid=21574) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21574) 'type': 'MLPNet'}}
(pid=21574) - name = Reinforce
(pid=21574) - body = body: {
(pid=21574) "agent": "<slm_lab.agent.Agent object at 0x7fd34013b390>",
(pid=21574) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>",
(pid=21574) "a": 0,
(pid=21574) "e": 0,
(pid=21574) "b": 0,
(pid=21574) "aeb": "(0, 0, 0)",
(pid=21574) "explore_var": NaN,
(pid=21574) "entropy_coef": 0.01,
(pid=21574) "loss": NaN,
(pid=21574) "mean_entropy": NaN,
(pid=21574) "mean_grad_norm": NaN,
(pid=21574) "best_total_reward_ma": -Infinity,
(pid=21574) "total_reward_ma": NaN,
(pid=21574) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7fd34011cfd0>",
(pid=21574) "tb_actions": [],
(pid=21574) "tb_tracker": {},
(pid=21574) "observation_space": "Box(4,)",
(pid=21574) "action_space": "Discrete(2)",
(pid=21574) "observable_dim": {
(pid=21574) "state": 4
(pid=21574) },
(pid=21574) "state_dim": 4,
(pid=21574) "action_dim": 2,
(pid=21574) "is_discrete": true,
(pid=21574) "action_type": "discrete",
(pid=21574) "action_pdtype": "Categorical",
(pid=21574) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21574) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7fd34013b438>"
(pid=21574) }
(pid=21574) - algorithm = <slm_lab.agent.algorithm.reinforce.Reinforce object at 0x7fd34013b3c8>
(pid=21574) Agent:
(pid=21574) - spec = reinforce_baseline_cartpole
(pid=21574) - agent_spec = {'algorithm': {'action_pdtype': 'default',
(pid=21574) 'action_policy': 'default',
(pid=21574) 'center_return': True,
(pid=21574) 'entropy_coef_spec': {'end_step': 20000,
(pid=21574) 'end_val': 0.001,
(pid=21574) 'name': 'linear_decay',
(pid=21574) 'start_step': 0,
(pid=21574) 'start_val': 0.01},
(pid=21574) 'explore_var_spec': None,
(pid=21574) 'gamma': 0.99,
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'training_frequency': 1},
(pid=21574) 'memory': {'name': 'OnPolicyReplay'},
(pid=21574) 'name': 'Reinforce',
(pid=21574) 'net': {'clip_grad_val': None,
(pid=21574) 'hid_layers': [64],
(pid=21574) 'hid_layers_activation': 'selu',
(pid=21574) 'loss_spec': {'name': 'MSELoss'},
(pid=21574) 'lr_scheduler_spec': None,
(pid=21574) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21574) 'type': 'MLPNet'}}
(pid=21574) - name = Reinforce
(pid=21574) - body = body: {
(pid=21574) "agent": "<slm_lab.agent.Agent object at 0x7fd34013b400>",
(pid=21574) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>",
(pid=21574) "a": 0,
(pid=21574) "e": 0,
(pid=21574) "b": 0,
(pid=21574) "aeb": "(0, 0, 0)",
(pid=21574) "explore_var": NaN,
(pid=21574) "entropy_coef": 0.01,
(pid=21574) "loss": NaN,
(pid=21574) "mean_entropy": NaN,
(pid=21574) "mean_grad_norm": NaN,
(pid=21574) "best_total_reward_ma": -Infinity,
(pid=21574) "total_reward_ma": NaN,
(pid=21574) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21574) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7fd34011beb8>",
(pid=21574) "tb_actions": [],
(pid=21574) "tb_tracker": {},
(pid=21574) "observation_space": "Box(4,)",
(pid=21574) "action_space": "Discrete(2)",
(pid=21574) "observable_dim": {
(pid=21574) "state": 4
(pid=21574) },
(pid=21574) "state_dim": 4,
(pid=21574) "action_dim": 2,
(pid=21574) "is_discrete": true,
(pid=21574) "action_type": "discrete",
(pid=21574) "action_pdtype": "Categorical",
(pid=21574) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21574) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7fd34013b4a8>"
(pid=21574) }
(pid=21574) - algorithm = <slm_lab.agent.algorithm.reinforce.Reinforce object at 0x7fd34013b438>
(pid=21574) [2020-02-05 15:50:22,936 PID:21653 INFO logger.py info] Session:
(pid=21574) - spec = reinforce_baseline_cartpole
(pid=21574) - index = 3
(pid=21574) - agent = <slm_lab.agent.Agent object at 0x7fd3401bd3c8>
(pid=21576) - index = 1
(pid=21576) - agent = <slm_lab.agent.Agent object at 0x7f73ed7a0278>
(pid=21576) - env = <slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>
(pid=21576) - eval_env = <slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>
(pid=21576) Reinforce:
(pid=21576) - agent = <slm_lab.agent.Agent object at 0x7f73ed79c3c8>
(pid=21576) - algorithm_spec = {'action_pdtype': 'default',
(pid=21576) 'action_policy': 'default',
(pid=21576) 'center_return': False,
(pid=21576) 'entropy_coef_spec': {'end_step': 20000,
(pid=21576) 'end_val': 0.001,
(pid=21576) 'name': 'linear_decay',
(pid=21576) 'start_step': 0,
(pid=21576) 'start_val': 0.01},
(pid=21576) 'explore_var_spec': None,
(pid=21576) 'gamma': 0.99,
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'training_frequency': 1}
(pid=21576) - name = Reinforce
(pid=21576) - memory_spec = {'name': 'OnPolicyReplay'}
(pid=21576) - net_spec = {'clip_grad_val': None,
(pid=21576) 'hid_layers': [64],
(pid=21576) 'hid_layers_activation': 'selu',
(pid=21576) 'loss_spec': {'name': 'MSELoss'},
(pid=21576) 'lr_scheduler_spec': None,
(pid=21576) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21576) 'type': 'MLPNet'}
(pid=21576) - body = body: {
(pid=21576) "agent": "<slm_lab.agent.Agent object at 0x7f73ed79c3c8>",
(pid=21576) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>",
(pid=21576) "a": 0,
(pid=21576) "e": 0,
(pid=21576) "b": 0,
(pid=21576) "aeb": "(0, 0, 0)",
(pid=21576) "explore_var": NaN,
(pid=21576) "entropy_coef": 0.01,
(pid=21576) "loss": NaN,
(pid=21576) "mean_entropy": NaN,
(pid=21576) "mean_grad_norm": NaN,
(pid=21576) "best_total_reward_ma": -Infinity,
(pid=21576) "total_reward_ma": NaN,
(pid=21576) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7f73ed77be80>",
(pid=21576) "tb_actions": [],
(pid=21576) "tb_tracker": {},
(pid=21576) "observation_space": "Box(4,)",
(pid=21576) "action_space": "Discrete(2)",
(pid=21576) "observable_dim": {
(pid=21576) "state": 4
(pid=21576) },
(pid=21576) "state_dim": 4,
(pid=21576) "action_dim": 2,
(pid=21576) "is_discrete": true,
(pid=21576) "action_type": "discrete",
(pid=21576) "action_pdtype": "Categorical",
(pid=21576) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21576) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7f73ed79c470>"
(pid=21576) }
(pid=21576) - action_pdtype = default
(pid=21576) - action_policy = <function default at 0x7f740a17ff28>
(pid=21576) - center_return = False
(pid=21576) - explore_var_spec = None
(pid=21576) - entropy_coef_spec = {'end_step': 20000,
(pid=21576) 'end_val': 0.001,
(pid=21576) 'name': 'linear_decay',
(pid=21576) 'start_step': 0,
(pid=21576) 'start_val': 0.01}
(pid=21576) - policy_loss_coef = 1.0
(pid=21576) - gamma = 0.99
(pid=21576) - training_frequency = 1
(pid=21576) - to_train = 0
(pid=21576) - explore_var_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7f73ed79c438>
(pid=21576) - entropy_coef_scheduler = <slm_lab.agent.algorithm.policy_util.VarScheduler object at 0x7f73ed79c518>
(pid=21576) - net = MLPNet(
(pid=21576) (model): Sequential(
(pid=21576) (0): Linear(in_features=4, out_features=64, bias=True)
(pid=21576) (1): SELU()
(pid=21576) )
(pid=21576) (model_tail): Sequential(
(pid=21576) (0): Linear(in_features=64, out_features=2, bias=True)
(pid=21576) )
(pid=21576) (loss_fn): MSELoss()
(pid=21576) )
(pid=21576) - net_names = ['net']
(pid=21576) - optim = Adam (
(pid=21576) Parameter Group 0
(pid=21576) amsgrad: False
(pid=21576) betas: (0.9, 0.999)
(pid=21576) eps: 1e-08
(pid=21576) lr: 0.002
(pid=21576) weight_decay: 0
(pid=21576) )
(pid=21576) - lr_scheduler = <slm_lab.agent.net.net_util.NoOpLRScheduler object at 0x7f73ed79c7f0>
(pid=21576) - global_net = None
(pid=21576) [2020-02-05 15:50:22,956 PID:21666 INFO logger.py info] Running RL loop for trial 1 session 1
(pid=21576) [2020-02-05 15:50:22,956 PID:21668 INFO __init__.py __init__] Agent:
(pid=21576) - spec = reinforce_baseline_cartpole
(pid=21576) - agent_spec = {'algorithm': {'action_pdtype': 'default',
(pid=21576) 'action_policy': 'default',
(pid=21574) - env = <slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>
(pid=21574) - eval_env = <slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>
(pid=21574) [2020-02-05 15:50:22,936 PID:21649 INFO logger.py info] Session:
(pid=21574) - spec = reinforce_baseline_cartpole
(pid=21574) - index = 1
(pid=21574) - agent = <slm_lab.agent.Agent object at 0x7fd34013a438>
(pid=21574) - env = <slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>
(pid=21574) - eval_env = <slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>
(pid=21574) [2020-02-05 15:50:22,936 PID:21653 INFO logger.py info] Running RL loop for trial 0 session 3
(pid=21574) [2020-02-05 15:50:22,936 PID:21647 INFO logger.py info] Session:
(pid=21574) - spec = reinforce_baseline_cartpole
(pid=21574) - index = 0
(pid=21574) - agent = <slm_lab.agent.Agent object at 0x7fd34013b390>
(pid=21574) - env = <slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>
(pid=21574) - eval_env = <slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>
(pid=21574) [2020-02-05 15:50:22,936 PID:21651 INFO logger.py info][2020-02-05 15:50:22,936 PID:21649 INFO logger.py info] Session:
(pid=21574) - spec = reinforce_baseline_cartpole
(pid=21574) - index = 2
(pid=21574) - agent = <slm_lab.agent.Agent object at 0x7fd34013b400>
(pid=21574) - env = <slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>
(pid=21574) - eval_env = <slm_lab.env.openai.OpenAIEnv object at 0x7fd478962320>
(pid=21574) Running RL loop for trial 0 session 1
(pid=21574) [2020-02-05 15:50:22,936 PID:21647 INFO logger.py info] Running RL loop for trial 0 session 0
(pid=21574) [2020-02-05 15:50:22,936 PID:21651 INFO logger.py info] Running RL loop for trial 0 session 2
(pid=21574) [2020-02-05 15:50:22,941 PID:21651 INFO __init__.py log_summary] Trial 0 session 2 reinforce_baseline_cartpole_t0_s2 [train_df] epi: 0 t: 0 wall_t: 0 opt_step: 0 frame: 0 fps: 0 total_reward: nan total_reward_ma: nan loss: nan lr: 0.002 explore_var: nan entropy_coef: 0.01 entropy: nan grad_norm: nan
(pid=21574) [2020-02-05 15:50:22,942 PID:21653 INFO __init__.py log_summary] Trial 0 session 3 reinforce_baseline_cartpole_t0_s3 [train_df] epi: 0 t: 0 wall_t: 0 opt_step: 0 frame: 0 fps: 0 total_reward: nan total_reward_ma: nan loss: nan lr: 0.002 explore_var: nan entropy_coef: 0.01 entropy: nan grad_norm: nan
(pid=21574) [2020-02-05 15:50:22,942 PID:21649 INFO __init__.py log_summary] Trial 0 session 1 reinforce_baseline_cartpole_t0_s1 [train_df] epi: 0 t: 0 wall_t: 0 opt_step: 0 frame: 0 fps: 0 total_reward: nan total_reward_ma: nan loss: nan lr: 0.002 explore_var: nan entropy_coef: 0.01 entropy: nan grad_norm: nan
(pid=21574) [2020-02-05 15:50:22,942 PID:21647 INFO __init__.py log_summary] Trial 0 session 0 reinforce_baseline_cartpole_t0_s0 [train_df] epi: 0 t: 0 wall_t: 0 opt_step: 0 frame: 0 fps: 0 total_reward: nan total_reward_ma: nan loss: nan lr: 0.002 explore_var: nan entropy_coef: 0.01 entropy: nan grad_norm: nan
(pid=21576) 'center_return': False,
(pid=21576) 'entropy_coef_spec': {'end_step': 20000,
(pid=21576) 'end_val': 0.001,
(pid=21576) 'name': 'linear_decay',
(pid=21576) 'start_step': 0,
(pid=21576) 'start_val': 0.01},
(pid=21576) 'explore_var_spec': None,
(pid=21576) 'gamma': 0.99,
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'training_frequency': 1},
(pid=21576) 'memory': {'name': 'OnPolicyReplay'},
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'net': {'clip_grad_val': None,
(pid=21576) 'hid_layers': [64],
(pid=21576) 'hid_layers_activation': 'selu',
(pid=21576) 'loss_spec': {'name': 'MSELoss'},
(pid=21576) 'lr_scheduler_spec': None,
(pid=21576) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21576) 'type': 'MLPNet'}}
(pid=21576) - name = Reinforce
(pid=21576) - body = body: {
(pid=21576) "agent": "<slm_lab.agent.Agent object at 0x7f73ed79c320>",
(pid=21576) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>",
(pid=21576) "a": 0,
(pid=21576) "e": 0,
(pid=21576) "b": 0,
(pid=21576) "aeb": "(0, 0, 0)",
(pid=21576) "explore_var": NaN,
(pid=21576) "entropy_coef": 0.01,
(pid=21576) "loss": NaN,
(pid=21576) "mean_entropy": NaN,
(pid=21576) "mean_grad_norm": NaN,
(pid=21576) "best_total_reward_ma": -Infinity,
(pid=21576) "total_reward_ma": NaN,
(pid=21576) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7f73ed77af60>",
(pid=21576) "tb_actions": [],
(pid=21576) "tb_tracker": {},
(pid=21576) "observation_space": "Box(4,)",
(pid=21576) "action_space": "Discrete(2)",
(pid=21576) "observable_dim": {
(pid=21576) "state": 4
(pid=21576) },
(pid=21576) "state_dim": 4,
(pid=21576) "action_dim": 2,
(pid=21576) "is_discrete": true,
(pid=21576) "action_type": "discrete",
(pid=21576) "action_pdtype": "Categorical",
(pid=21576) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21576) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7f73ed79c3c8>"
(pid=21576) }
(pid=21576) - algorithm = <slm_lab.agent.algorithm.reinforce.Reinforce object at 0x7f73ed79c358>
(pid=21576) [2020-02-05 15:50:22,957 PID:21668 INFO logger.py info] Session:
(pid=21576) - spec = reinforce_baseline_cartpole
(pid=21576) - index = 2
(pid=21576) - agent = <slm_lab.agent.Agent object at 0x7f73ed79c320>
(pid=21576) - env = <slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>
(pid=21576) - eval_env = <slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>
(pid=21576) [2020-02-05 15:50:22,957 PID:21668 INFO logger.py info] Running RL loop for trial 1 session 2
(pid=21576) [2020-02-05 15:50:22,958 PID:21670 INFO __init__.py __init__] Agent:
(pid=21576) - spec = reinforce_baseline_cartpole
(pid=21576) - agent_spec = {'algorithm': {'action_pdtype': 'default',
(pid=21576) 'action_policy': 'default',
(pid=21576) 'center_return': False,
(pid=21576) 'entropy_coef_spec': {'end_step': 20000,
(pid=21576) 'end_val': 0.001,
(pid=21576) 'name': 'linear_decay',
(pid=21576) 'start_step': 0,
(pid=21576) 'start_val': 0.01},
(pid=21576) 'explore_var_spec': None,
(pid=21576) 'gamma': 0.99,
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'training_frequency': 1},
(pid=21576) 'memory': {'name': 'OnPolicyReplay'},
(pid=21576) 'name': 'Reinforce',
(pid=21576) 'net': {'clip_grad_val': None,
(pid=21576) 'hid_layers': [64],
(pid=21576) 'hid_layers_activation': 'selu',
(pid=21576) 'loss_spec': {'name': 'MSELoss'},
(pid=21576) 'lr_scheduler_spec': None,
(pid=21576) 'optim_spec': {'lr': 0.002, 'name': 'Adam'},
(pid=21576) 'type': 'MLPNet'}}
(pid=21576) - name = Reinforce
(pid=21576) - body = body: {
(pid=21576) "agent": "<slm_lab.agent.Agent object at 0x7f73ed79c3c8>",
(pid=21576) "env": "<slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>",
(pid=21576) "a": 0,
(pid=21576) "e": 0,
(pid=21576) "b": 0,
(pid=21576) "aeb": "(0, 0, 0)",
(pid=21576) "explore_var": NaN,
(pid=21576) "entropy_coef": 0.01,
(pid=21576) "loss": NaN,
(pid=21576) "mean_entropy": NaN,
(pid=21576) "mean_grad_norm": NaN,
(pid=21576) "best_total_reward_ma": -Infinity,
(pid=21576) "total_reward_ma": NaN,
(pid=21576) "train_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "eval_df": "Empty DataFrame\nColumns: [epi, t, wall_t, opt_step, frame, fps, total_reward, total_reward_ma, loss, lr, explore_var, entropy_coef, entropy, grad_norm]\nIndex: []",
(pid=21576) "tb_writer": "<torch.utils.tensorboard.writer.SummaryWriter object at 0x7f73ed77be80>",
(pid=21576) "tb_actions": [],
(pid=21576) "tb_tracker": {},
(pid=21576) "observation_space": "Box(4,)",
(pid=21576) "action_space": "Discrete(2)",
(pid=21576) "observable_dim": {
(pid=21576) "state": 4
(pid=21576) },
(pid=21576) "state_dim": 4,
(pid=21576) "action_dim": 2,
(pid=21576) "is_discrete": true,
(pid=21576) "action_type": "discrete",
(pid=21576) "action_pdtype": "Categorical",
(pid=21576) "ActionPD": "<class 'torch.distributions.categorical.Categorical'>",
(pid=21576) "memory": "<slm_lab.agent.memory.onpolicy.OnPolicyReplay object at 0x7f73ed79c470>"
(pid=21576) }
(pid=21576) - algorithm = <slm_lab.agent.algorithm.reinforce.Reinforce object at 0x7f73ed79c400>
(pid=21576) [2020-02-05 15:50:22,958 PID:21670 INFO logger.py info] Session:
(pid=21576) - spec = reinforce_baseline_cartpole
(pid=21576) - index = 3
(pid=21576) - agent = <slm_lab.agent.Agent object at 0x7f73ed79c3c8>
(pid=21576) - env = <slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>
(pid=21576) - eval_env = <slm_lab.env.openai.OpenAIEnv object at 0x7f7535f602e8>
(pid=21576) [2020-02-05 15:50:22,958 PID:21670 INFO logger.py info] Running RL loop for trial 1 session 3
(pid=21576) [2020-02-05 15:50:22,958 PID:21664 INFO __init__.py log_summary] Trial 1 session 0 reinforce_baseline_cartpole_t1_s0 [train_df] epi: 0 t: 0 wall_t: 0 opt_step: 0 frame: 0 fps: 0 total_reward: nan total_reward_ma: nan loss: nan lr: 0.002 explore_var: nan entropy_coef: 0.01 entropy: nan grad_norm: nan
(pid=21576) [2020-02-05 15:50:22,961 PID:21666 INFO __init__.py log_summary] Trial 1 session 1 reinforce_baseline_cartpole_t1_s1 [train_df] epi: 0 t: 0 wall_t: 0 opt_step: 0 frame: 0 fps: 0 total_reward: nan total_reward_ma: nan loss: nan lr: 0.002 explore_var: nan entropy_coef: 0.01 entropy: nan grad_norm: nan
(pid=21576) [2020-02-05 15:50:22,962 PID:21668 INFO __init__.py log_summary] Trial 1 session 2 reinforce_baseline_cartpole_t1_s2 [train_df] epi: 0 t: 0 wall_t: 0 opt_step: 0 frame: 0 fps: 0 total_reward: nan total_reward_ma: nan loss: nan lr: 0.002 explore_var: nan entropy_coef: 0.01 entropy: nan grad_norm: nan
(pid=21576) [2020-02-05 15:50:22,962 PID:21670 INFO __init__.py log_summary] Trial 1 session 3 reinforce_baseline_cartpole_t1_s3 [train_df] epi: 0 t: 0 wall_t: 0 opt_step: 0 frame: 0 fps: 0 total_reward: nan total_reward_ma: nan loss: nan lr: 0.002 explore_var: nan entropy_coef: 0.01 entropy: nan grad_norm: nan
(pid=21574) terminate called after throwing an instance of 'c10::Error'
(pid=21574) what(): CUDA error: initialization error (getDevice at /opt/conda/conda-bld/pytorch_1556653099582/work/c10/cuda/impl/CUDAGuardImpl.h:35)
(pid=21574) frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7fd49375cdc5 in /home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/torch/lib/libc10.so)
(pid=21574) frame #1: <unknown function> + 0xca67 (0x7fd48b958a67 in /home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/torch/lib/libc10_cuda.so)
(pid=21574) frame #2: torch::autograd::Engine::thread_init(int) + 0x3ee (0x7fd48c079b1e in /home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
(pid=21574) frame #3: torch::autograd::python::PythonEngine::thread_init(int) + 0x2a (0x7fd4c29ded1a in /home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
(pid=21574) frame #4: <unknown function> + 0xc8421 (0x7fd4d7ac7421 in /home/joe/anaconda3/envs/lab2/bin/../lib/libstdc++.so.6)
(pid=21574) frame #5: <unknown function> + 0x76db (0x7fd4dd3326db in /lib/x86_64-linux-gnu/libpthread.so.0)
(pid=21574) frame #6: clone + 0x3f (0x7fd4dd05b88f in /lib/x86_64-linux-gnu/libc.so.6)
(pid=21574)
(pid=21574) Fatal Python error: Aborted
(pid=21574)
(pid=21574) Stack (most recent call first):
(pid=21576) terminate called after throwing an instance of 'c10::Error'
(pid=21576) what(): CUDA error: initialization error (getDevice at /opt/conda/conda-bld/pytorch_1556653099582/work/c10/cuda/impl/CUDAGuardImpl.h:35)
(pid=21576) frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f7550d59dc5 in /home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/torch/lib/libc10.so)
(pid=21576) frame #1: <unknown function> + 0xca67 (0x7f7548f55a67 in /home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/torch/lib/libc10_cuda.so)
(pid=21576) frame #2: torch::autograd::Engine::thread_init(int) + 0x3ee (0x7f7549676b1e in /home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
(pid=21576) frame #3: torch::autograd::python::PythonEngine::thread_init(int) + 0x2a (0x7f757ffdbd1a in /home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
(pid=21576) frame #4: <unknown function> + 0xc8421 (0x7f75950c4421 in /home/joe/anaconda3/envs/lab2/bin/../lib/libstdc++.so.6)
(pid=21576) frame #5: <unknown function> + 0x76db (0x7f759a92f6db in /lib/x86_64-linux-gnu/libpthread.so.0)
(pid=21576) frame #6: clone + 0x3f (0x7f759a65888f in /lib/x86_64-linux-gnu/libc.so.6)
(pid=21576)
(pid=21576) Fatal Python error: Aborted
(pid=21576)
(pid=21574) 2020-02-05 15:50:23,211 ERROR function_runner.py:96 -- Runner Thread raised error.
(pid=21574) Traceback (most recent call last):
(pid=21574) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 90, in run
(pid=21574) self._entrypoint()
(pid=21574) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 141, in entrypoint
(pid=21574) return self._trainable_func(config, self._status_reporter)
(pid=21574) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 249, in _trainable_func
(pid=21574) output = train_func(config, reporter)
(pid=21574) File "/home/joe/SLM-Lab/slm_lab/experiment/search.py", line 90, in ray_trainable
(pid=21574) metrics = Trial(spec).run()
(pid=21574) File "/home/joe/SLM-Lab/slm_lab/experiment/control.py", line 181, in run
(pid=21574) metrics = analysis.analyze_trial(self.spec, session_metrics_list)
(pid=21574) File "/home/joe/SLM-Lab/slm_lab/experiment/analysis.py", line 265, in analyze_trial
(pid=21574) trial_metrics = calc_trial_metrics(session_metrics_list, info_prepath)
(pid=21574) File "/home/joe/SLM-Lab/slm_lab/experiment/analysis.py", line 187, in calc_trial_metrics
(pid=21574) frames = session_metrics_list[0]['local']['frames']
(pid=21574) IndexError: list index out of range
(pid=21574) Exception in thread Thread-1:
(pid=21574) Traceback (most recent call last):
(pid=21574) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 90, in run
(pid=21574) self._entrypoint()
(pid=21574) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 141, in entrypoint
(pid=21574) return self._trainable_func(config, self._status_reporter)
(pid=21574) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 249, in _trainable_func
(pid=21574) output = train_func(config, reporter)
(pid=21574) File "/home/joe/SLM-Lab/slm_lab/experiment/search.py", line 90, in ray_trainable
(pid=21574) metrics = Trial(spec).run()
(pid=21574) File "/home/joe/SLM-Lab/slm_lab/experiment/control.py", line 181, in run
(pid=21574) metrics = analysis.analyze_trial(self.spec, session_metrics_list)
(pid=21574) File "/home/joe/SLM-Lab/slm_lab/experiment/analysis.py", line 265, in analyze_trial
(pid=21574) trial_metrics = calc_trial_metrics(session_metrics_list, info_prepath)
(pid=21574) File "/home/joe/SLM-Lab/slm_lab/experiment/analysis.py", line 187, in calc_trial_metrics
(pid=21574) frames = session_metrics_list[0]['local']['frames']
(pid=21574) IndexError: list index out of range
(pid=21574)
(pid=21574) During handling of the above exception, another exception occurred:
(pid=21574)
(pid=21574) Traceback (most recent call last):
(pid=21574) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/threading.py", line 916, in _bootstrap_inner
(pid=21574) self.run()
(pid=21574) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 102, in run
(pid=21574) err_tb = err_tb.format_exc()
(pid=21574) AttributeError: 'traceback' object has no attribute 'format_exc'
(pid=21574)
(pid=21576) 2020-02-05 15:50:23,190 ERROR function_runner.py:96 -- Runner Thread raised error.
(pid=21576) Traceback (most recent call last):
(pid=21576) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 90, in run
(pid=21576) self._entrypoint()
(pid=21576) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 141, in entrypoint
(pid=21576) return self._trainable_func(config, self._status_reporter)
(pid=21576) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 249, in _trainable_func
(pid=21576) output = train_func(config, reporter)
(pid=21576) File "/home/joe/SLM-Lab/slm_lab/experiment/search.py", line 90, in ray_trainable
(pid=21576) metrics = Trial(spec).run()
(pid=21576) File "/home/joe/SLM-Lab/slm_lab/experiment/control.py", line 181, in run
(pid=21576) metrics = analysis.analyze_trial(self.spec, session_metrics_list)
(pid=21576) File "/home/joe/SLM-Lab/slm_lab/experiment/analysis.py", line 265, in analyze_trial
(pid=21576) trial_metrics = calc_trial_metrics(session_metrics_list, info_prepath)
(pid=21576) File "/home/joe/SLM-Lab/slm_lab/experiment/analysis.py", line 187, in calc_trial_metrics
(pid=21576) frames = session_metrics_list[0]['local']['frames']
(pid=21576) IndexError: list index out of range
(pid=21576) Exception in thread Thread-1:
(pid=21576) Traceback (most recent call last):
(pid=21576) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 90, in run
(pid=21576) self._entrypoint()
(pid=21576) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 141, in entrypoint
(pid=21576) return self._trainable_func(config, self._status_reporter)
(pid=21576) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 249, in _trainable_func
(pid=21576) output = train_func(config, reporter)
(pid=21576) File "/home/joe/SLM-Lab/slm_lab/experiment/search.py", line 90, in ray_trainable
(pid=21576) metrics = Trial(spec).run()
(pid=21576) File "/home/joe/SLM-Lab/slm_lab/experiment/control.py", line 181, in run
(pid=21576) metrics = analysis.analyze_trial(self.spec, session_metrics_list)
(pid=21576) File "/home/joe/SLM-Lab/slm_lab/experiment/analysis.py", line 265, in analyze_trial
(pid=21576) trial_metrics = calc_trial_metrics(session_metrics_list, info_prepath)
(pid=21576) File "/home/joe/SLM-Lab/slm_lab/experiment/analysis.py", line 187, in calc_trial_metrics
(pid=21576) frames = session_metrics_list[0]['local']['frames']
(pid=21576) IndexError: list index out of range
(pid=21576)
(pid=21576) During handling of the above exception, another exception occurred:
(pid=21576)
(pid=21576) Traceback (most recent call last):
(pid=21576) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/threading.py", line 916, in _bootstrap_inner
(pid=21576) self.run()
(pid=21576) File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 102, in run
(pid=21576) err_tb = err_tb.format_exc()
(pid=21576) AttributeError: 'traceback' object has no attribute 'format_exc'
(pid=21576)
2020-02-05 15:50:24,237 ERROR trial_runner.py:497 -- Error processing event.
Traceback (most recent call last):
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 446, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 316, in fetch_result
result = ray.get(trial_future[0])
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/worker.py", line 2197, in get
raise value
ray.exceptions.RayTaskError: ray_worker (pid=21574, host=Gauss)
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/trainable.py", line 151, in train
result = self._train()
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 203, in _train
("Wrapped function ran until completion without reporting "
ray.tune.error.TuneError: Wrapped function ran until completion without reporting results or raising an exception.
2020-02-05 15:50:24,242 INFO ray_trial_executor.py:180 -- Destroying actor for trial ray_trainable_0_agent.0.algorithm.center_return=True,trial_index=0. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2020-02-05 15:50:24,269 ERROR trial_runner.py:497 -- Error processing event.
Traceback (most recent call last):
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 446, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 316, in fetch_result
result = ray.get(trial_future[0])
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/worker.py", line 2197, in get
raise value
ray.exceptions.RayTaskError: ray_worker (pid=21576, host=Gauss)
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/trainable.py", line 151, in train
result = self._train()
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/function_runner.py", line 203, in _train
("Wrapped function ran until completion without reporting "
ray.tune.error.TuneError: Wrapped function ran until completion without reporting results or raising an exception.
2020-02-05 15:50:24,271 INFO ray_trial_executor.py:180 -- Destroying actor for trial ray_trainable_1_agent.0.algorithm.center_return=False,trial_index=1. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/1 GPUs
Memory usage on this node: 2.8/16.7 GB
Result logdir: /home/joe/ray_results/reinforce_baseline_cartpole
Number of trials: 2 ({'ERROR': 2})
ERROR trials:
- ray_trainable_0_agent.0.algorithm.center_return=True,trial_index=0: ERROR, 1 failures: /home/joe/ray_results/reinforce_baseline_cartpole/ray_trainable_0_agent.0.algorithm.center_return=True,trial_index=0_2020-02-05_15-50-22gyjszidn/error_2020-02-05_15-50-24.txt
- ray_trainable_1_agent.0.algorithm.center_return=False,trial_index=1: ERROR, 1 failures: /home/joe/ray_results/reinforce_baseline_cartpole/ray_trainable_1_agent.0.algorithm.center_return=False,trial_index=1_2020-02-05_15-50-22wwmokuvt/error_2020-02-05_15-50-24.txt
Traceback (most recent call last):
File "run_lab.py", line 80, in <module>
main()
File "run_lab.py", line 72, in main
read_spec_and_run(*args)
File "run_lab.py", line 56, in read_spec_and_run
run_spec(spec, lab_mode)
File "run_lab.py", line 35, in run_spec
Experiment(spec).run()
File "/home/joe/SLM-Lab/slm_lab/experiment/control.py", line 203, in run
trial_data_dict = search.run_ray_search(self.spec)
File "/home/joe/SLM-Lab/slm_lab/experiment/search.py", line 124, in run_ray_search
server_port=util.get_port(),
File "/home/joe/anaconda3/envs/lab2/lib/python3.6/site-packages/ray/tune/tune.py", line 265, in run
raise TuneError("Trials did not complete", errored_trials)
ray.tune.error.TuneError: ('Trials did not complete', [ray_trainable_0_agent.0.algorithm.center_return=True,trial_index=0, ray_trainable_1_agent.0.algorithm.center_return=False,trial_index=1])
So it's still happening, and CUDA is still getting initialized. Okay, let's go back to the original lab environment.
Found another clue: ray allocates no GPU (as expected), but the code still tries to get a CUDA device. Let's try a temporary hack to check. This is where CUDA is being called:
Can you go to this file locally and comment out the line https://github.com/kengz/SLM-Lab/blob/master/slm_lab/experiment/search.py#L54,
then replace it with gpu_count = 0?
This prevents any CUDA call at all; hopefully with that muted, your issue goes away.
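Concretely, the edit would look something like this (a sketch only; the exact original line in search.py is an assumption on my part):

```python
# slm_lab/experiment/search.py -- hypothetical sketch of the suggested change

# gpu_count = torch.cuda.device_count()  # original line (assumed); this queries CUDA
gpu_count = 0  # hard-code to zero so ray never tries to initialize a CUDA device
```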
Same issue here. Here is the traceback for reinforce_cartpole search (Code 2.13):
Traceback (most recent call last):
  File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 446, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 316, in fetch_result
    result = ray.get(trial_future[0])
  File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/worker.py", line 2197, in get
    raise value
ray.exceptions.RayTaskError: ray_worker (pid=25556, host=XXXXXXX-YYY)
  File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/trainable.py", line 151, in train
    result = self._train()
  File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 203, in _train
    ("Wrapped function ran until completion without reporting "
ray.tune.error.TuneError: Wrapped function ran until completion without reporting results or raising an exception.
This seems to be a directory access privilege issue, which could be related to ray. When I run in sudo mode on Mac (Catalina 10.15.3), search tasks run fine: I confirmed this on reinforce_cartpole, reinforce_baseline_cartpole, and sarsa_epsilon_greedy_cartpole.
Any thoughts?
Hi @wonderwide, did you also get the same error that's causing the ray workers to fail?
(pid=21576) terminate called after throwing an instance of 'c10::Error'
(pid=21576) what(): CUDA error: initialization error (getDevice at /opt/conda/conda-bld/pytorch_1556653099582/work/c10/cuda/impl/CUDAGuardImpl.h:35)
Or does your error log from the Python process (as opposed to ray) say something else?
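For completeness, another way to rule out CUDA entirely (not something tried in this thread; just a stdlib-only sketch) is to hide all GPUs from the process before torch or ray is imported, since forked workers inherit the environment:

```python
import os

# An empty CUDA_VISIBLE_DEVICES makes CUDA report zero devices in this
# process and in any worker processes it forks or spawns.
os.environ["CUDA_VISIBLE_DEVICES"] = ""
```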
@kengz I don't see that error. I did set gpu_count = 0 since I don't have cuda. Please see the two logs attached:
python run_lab.py slm_lab/spec/benchmark/reinforce/reinforce_cartpole.json reinforce_cartpole search > reinforce_cartpole_run_log.txt
@wonderwide I extracted PID 31757 from your log and this is the stack trace:
(pid=31757) [2020-03-16 08:24:52,386 PID:31797 INFO __init__.py log_summary] Trial 2 session 0 reinforce_cartpole_t2_s0 [train_df] epi: 888 t: 87 wall_t: 74 opt_step: 4440 frame: 100000 fps: 1351.35 total_reward: 200 total_reward_ma: 166.1 loss: 0.00348898 lr: 0.002 explore_var: nan entropy_coef: 0.001 entropy: 0.569165 grad_norm: nan
(pid=31757) [2020-03-16 08:24:52,464 PID:31797 INFO __init__.py log_metrics] Trial 2 session 0 reinforce_cartpole_t2_s0 [train_df metrics] final_return_ma: 166.1 strength: 144.24 max_strength: 178.14 final_strength: 178.14 sample_efficiency: 2.29886e-05 training_efficiency: 0.00031943 stability: 0.946214
(pid=31757) Fatal Python error: Segmentation fault
(pid=31757)
(pid=31757) Stack (most recent call first):
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/urllib/request.py", line 2586 in proxy_bypass_macosx_sysconf
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/urllib/request.py", line 2610 in proxy_bypass
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/utils.py", line 745 in should_bypass_proxies
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/utils.py", line 761 in get_environ_proxies
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/sessions.py", line 700 in merge_environment_settings
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/sessions.py", line 524 in request
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/api.py", line 60 in request
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/api.py", line 116 in post
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1233 in request_image_with_retrying
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/retrying.py", line 200 in call
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/retrying.py", line 49 in wrapped_f
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1326 in to_image
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1513 in write_image
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/lib/viz.py", line 122 in save_image
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/lib/viz.py", line 155 in plot_session
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/analysis.py", line 256 in analyze_session
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 118 in run
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 26 in mp_run_session
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 99 in run
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 297 in _bootstrap
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/popen_fork.py", line 74 in _launch
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/popen_fork.py", line 20 in __init__
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/context.py", line 277 in _Popen
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/context.py", line 223 in _Popen
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 112 in start
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 144 in parallelize_sessions
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 158 in run_sessions
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 178 in run
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/search.py", line 91 in ray_trainable
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 249 in _trainable_func
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 141 in entrypoint
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 90 in run
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/threading.py", line 917 in _bootstrap_inner
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/threading.py", line 885 in _bootstrap
(pid=31757) Fatal Python error: Segmentation fault
(pid=31757)
(pid=31757) Stack (most recent call first):
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/urllib/request.py", line 2586 in proxy_bypass_macosx_sysconf
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/urllib/request.py", line 2610 in proxy_bypass
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/utils.py", line 745 in should_bypass_proxies
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/utils.py", line 761 in get_environ_proxies
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/sessions.py", line 700 in merge_environment_settings
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/sessions.py", line 524 in request
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/api.py", line 60 in request
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/api.py", line 116 in post
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1233 in requ
(pid=31757) e
(pid=31757) st_image_with_retrying
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/retrying.py", line 200 in call
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/retrying.py", line 49 in wrapped_f
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1326 in to_image
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1513 in write_image
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/lib/viz.py", line 122 in save_image
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/lib/viz.py", line 155 in plot_session
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/analysis.py", line 256 in analyze_session
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 118 in run
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 26 in mp_run_session
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 99 in run
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 297 in _bootstrap
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/popen_fork.py", line 74 in _launch
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/popen_fork.py", line 20 in __init__
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/context.py", line 277 in _Popen
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/context.py", line 223 in _Popen
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 112 in start
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 144 in parallelize_sessions
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 158 in run_sessions
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 178 in run
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/search.py", line 91 in ray_trainable
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 249 in _trainable_func
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 141 in entrypoint
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 90 in run
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/threading.py", line 917 in _bootstrap_inner
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/threading.py", line 885 in _bootstrap
(pid=31757) Fatal Python error: Segmentation fault
(pid=31757)
(pid=31757) Stack (most recent call first):
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/urllib/request.py", line 2586 in proxy_bypass_macosx_sysconf
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/urllib/request.py", line 2610 in proxy_bypass
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/utils.py", line 745 in should_bypass_proxies
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/utils.py", line 761 in get_environ_proxies
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/sessions.py", line 700 in merge_environment_settings
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/sessions.py", line 524 in request
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/api.py", line 60 in request
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/api.py", line 116 in post
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1233 in request_image_with_retrying
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/retrying.py", line 200 in call
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/retrying.py", line 49 in wrapped_f
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1326 in to_image
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1513 in write_image
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/lib/viz.py", line 122 in save_image
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/lib/viz.py", line 155 in plot_session
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/analysis.py", line 256 in analyze_session
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 118 in run
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 26 in mp_run_session
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 99 in run
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 297 in _bootstrap
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/popen_fork.py", line 74 in _launch
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/popen_fork.py", line 20 in __init__
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/context.py", line 277 in _Popen
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/context.py", line 223 in _Popen
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 112 in start
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 144 in parallelize_sessions
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 158 in run_sessions
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 178 in run
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/search.py", line 91 in ray_trainable
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 249 in _trainable_func
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 141 in entrypoint
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 90 in run
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/threading.py", line 917 in _bootstrap_inner
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/threading.py", line 885 in _bootstrap
(pid=31757) [2020-03-16 08:24:54,032 PID:31804 INFO __init__.py log_summary] Trial 2 session 2 reinforce_cartpole_t2_s2 [train_df] epi: 1431 t: 121 wall_t: 75 opt_step: 7155 frame: 100000 fps: 1333.33 total_reward: 200 total_reward_ma: 172.8 loss: 0.00195152 lr: 0.002 explore_var: nan entropy_coef: 0.001 entropy: 0.566016 grad_norm: nan
(pid=31757) [2020-03-16 08:24:54,058 PID:31804 INFO __init__.py log_metrics] Trial 2 session 2 reinforce_cartpole_t2_s2 [train_df metrics] final_return_ma: 172.8 strength: 150.94 max_strength: 178.14 final_strength: 178.14 sample_efficiency: 2.76552e-05 training_efficiency: 0.00017381 stability: 0.923381
(pid=31757) Fatal Python error: Segmentation fault
(pid=31757)
(pid=31757) Stack (most recent call first):
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/urllib/request.py", line 2586 in proxy_bypass_macosx_sysconf
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/urllib/request.py", line 2610 in proxy_bypass
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/utils.py", line 745 in should_bypass_proxies
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/utils.py", line 761 in get_environ_proxies
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/sessions.py", line 700 in merge_environment_settings
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/sessions.py", line 524 in request
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/api.py", line 60 in request
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/requests/api.py", line 116 in post
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1233 in request_image_with_retrying
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/retrying.py", line 200 in call
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/retrying.py", line 49 in wrapped_f
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1326 in to_image
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/plotly/io/_orca.py", line 1513 in write_image
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/lib/viz.py", line 122 in save_image
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/lib/viz.py", line 155 in plot_session
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/analysis.py", line 256 in analyze_session
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 118 in run
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 26 in mp_run_session
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 99 in run
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 297 in _bootstrap
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/popen_fork.py", line 74 in _launch
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/popen_fork.py", line 20 in __init__
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/context.py", line 277 in _Popen
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/context.py", line 223 in _Popen
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/multiprocessing/process.py", line 112 in start
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 144 in parallelize_sessions
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 158 in run_sessions
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 178 in run
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/search.py", line 91 in ray_trainable
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 249 in _trainable_func
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 141 in entrypoint
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 90 in run
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/threading.py", line 917 in _bootstrap_inner
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/threading.py", line 885 in _bootstrap
(pid=31757) 2020-03-16 08:24:55,124 ERROR function_runner.py:96 -- Runner Thread raised error.
(pid=31757) Traceback (most recent call last):
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 90, in run
(pid=31757) self._entrypoint()
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 141, in entrypoint
(pid=31757) return self._trainable_func(config, self._status_reporter)
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 249, in _trainable_func
(pid=31757) output = train_func(config, reporter)
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/search.py", line 91, in ray_trainable
(pid=31757) metrics = Trial(spec).run()
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 181, in run
(pid=31757) metrics = analysis.analyze_trial(self.spec, session_metrics_list)
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/analysis.py", line 265, in analyze_trial
(pid=31757) trial_metrics = calc_trial_metrics(session_metrics_list, info_prepath)
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/analysis.py", line 187, in calc_trial_metrics
(pid=31757) frames = session_metrics_list[0]['local']['frames']
(pid=31757) IndexError: list index out of range
(pid=31757) Exception in thread Thread-1:
(pid=31757) Traceback (most recent call last):
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 90, in run
(pid=31757) self._entrypoint()
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 141, in entrypoint
(pid=31757) return self._trainable_func(config, self._status_reporter)
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 249, in _trainable_func
(pid=31757) output = train_func(config, reporter)
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/search.py", line 91, in ray_trainable
(pid=31757) metrics = Trial(spec).run()
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/control.py", line 181, in run
(pid=31757) metrics = analysis.analyze_trial(self.spec, session_metrics_list)
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/analysis.py", line 265, in analyze_trial
(pid=31757) trial_metrics = calc_trial_metrics(session_metrics_list, info_prepath)
(pid=31757) File "/Users/XXXX/Downloads/SLM-Lab/slm_lab/experiment/analysis.py", line 187, in calc_trial_metrics
(pid=31757) frames = session_metrics_list[0]['local']['frames']
(pid=31757) IndexError: list index out of range
(pid=31757)
(pid=31757) During handling of the above exception, another exception occurred:
(pid=31757)
(pid=31757) Traceback (most recent call last):
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/threading.py", line 917, in _bootstrap_inner
(pid=31757) self.run()
(pid=31757) File "/Users/XXXX/opt/anaconda3/envs/lab/lib/python3.7/site-packages/ray/tune/function_runner.py", line 102, in run
(pid=31757) err_tb = err_tb.format_exc()
(pid=31757) AttributeError: 'traceback' object has no attribute 'format_exc'
(pid=31757)
The top frames of the stack trace point to an orca proxy request made while saving a graph (orca is the image server Plotly uses to export figures), and that request triggers the segfault. I would recommend updating orca and trying again:
conda activate lab
conda uninstall plotly-orca
conda install plotly-orca=1.3.0
@xombio seems like your Nvidia driver and CUDA are new, but pytorch version is 1.1.0. Could you update to pytorch 1.3.0 and retry?
conda activate lab
conda uninstall pytorch
conda install -c pytorch pytorch=1.3.0
@kengz Thanks! Yes, there was an issue with orca. I reinstalled it as you suggested, but the result was the same.
After some experimenting, here is what I found out:
- when running without plotly-orca (after `conda uninstall plotly-orca`), the search completes but no PNG files are generated. This should be fine since the CSV files contain all the data.
- with plotly-orca, after setting the `no_proxy` environment variable (`export no_proxy='*'`), the search succeeds and the PNG files are generated. Depending on the Python version you may get some deprecation warnings (I am running Python 3.7).
Perhaps it would make sense to introduce an option to turn graphing off?
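In the meantime, since the CSVs contain all the metrics, a series can be pulled out and plotted offline with any tool. A minimal stdlib sketch; the default column names are taken from the training log fields above and the path is up to you, so adjust both to whatever your `data/` folder actually contains:

```python
import csv

def load_metric(csv_path, x_col='frame', y_col='total_reward_ma'):
    """Read one metric series from a session CSV.

    Column names default to fields seen in the training log above;
    change them to match the CSVs in your data/ folder.
    """
    xs, ys = [], []
    with open(csv_path) as f:
        for row in csv.DictReader(f):
            xs.append(float(row[x_col]))
            ys.append(float(row[y_col]))
    return xs, ys
```

The returned series can then be plotted with matplotlib or any other tool, without ever touching orca.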
Graphs should always be generated because they are crucial for evaluating performance. In case of failure, there is a general try-catch around graphing to ensure the main loop keeps running: tolerable errors are caught, and the graphs can be regenerated retrospectively.
Because of this catchall, anything that still crashes the program (like a segfault) is a critical issue that needs to be fixed, and a flag alone cannot fix it, since the flag would sit at the same place as that try-catch.
So an option to turn off graphing would not fix the problem itself. However, the latest master branch has been updated to the latest plotly and orca versions, which fixes the crash, so if you do a git pull you'll have the update that works.
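For reference, the catchall around graphing looks roughly like this. This is a sketch, not the actual SLM-Lab code, and `try_plot` is a hypothetical name:

```python
import logging

logger = logging.getLogger(__name__)

def try_plot(plot_fn, *args, **kwargs):
    # Tolerable errors raised inside Python are logged and swallowed,
    # so a failed graph never stops the training loop.
    try:
        return plot_fn(*args, **kwargs)
    except Exception as e:
        logger.warning('graph generation failed: %s', e)
        return None
```

A segfault, however, kills the interpreter process itself, so no `except` clause can intercept it; that is why a graphing flag sitting at this level would not have helped.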
Regarding the no_proxy option: are you running the job behind a proxied network? Do you know why setting that option works?
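One plausible mechanism, going only by the stack trace above (not a confirmed diagnosis): the segfault happened inside `proxy_bypass_macosx_sysconf`, the macOS system-configuration proxy lookup. When `no_proxy` is set in the environment, urllib can answer the bypass question from the environment alone and never calls into that macOS API:

```python
from urllib.request import proxy_bypass_environment

# With no_proxy='*' every host is declared proxy-exempt, so the
# platform-specific lookup (proxy_bypass_macosx_sysconf on macOS,
# where the segfault occurred) is never reached.
print(bool(proxy_bypass_environment('localhost', {'no': '*'})))
print(bool(proxy_bypass_environment('example.com', {'no': '*'})))
```

Both calls print `True`, i.e. the bypass decision is made purely from the environment mapping.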
Closing as resolved.
Describe the bug
I'm enjoying the book a lot. It's the best book on the subject, and I've read Sutton & Barto, though I'm an empiricist, not an academic. Anyway, I can run all the examples in the book in 'dev' and 'train' modes, but not in 'search' mode: they all end with an error. I don't see anybody else complaining about this, so it must be a rookie mistake on my part. I hope you can help so I can continue enjoying the book to its fullest.
To Reproduce
1. git SHA (run `git rev-parse HEAD` to get it): What?
2. spec file used: benchmark/reinforce/reinforce_cartpole.json

Additional context
I'm showing the error logs for Code 2.15 on page 50, but I get similar error logs for all the other code samples run in 'search' mode. There are 32 files in the 'data' folder, no plots. All the folders in the 'data' folder are empty except for 'log', which has a file with this content.
NVIDIA driver version: 440.33.01
CUDA version: 10.2

Error logs