higgsfield / RL-Adventure

PyTorch implementation of DQN / DDQN / prioritized replay / noisy networks / distributional values / Rainbow / hierarchical RL

Environment/dependencies #19

Closed janscholten closed 5 years ago

janscholten commented 5 years ago

Hi all,

I am currently trying to run the 'quantile regression dqn' notebook, but it breaks during training at the line loss = compute_td_loss(batch_size). At some point I realised I have no idea whether this is caused by my environment, and I wasn't able to verify it against the documentation.

Could someone please report which Python/torch versions were successfully tested? I'd be grateful to be able to put this nice code to work!

Best regards,

Jan
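For anyone answering the version question, a small script along these lines could make reports easier to compare. It is only a sketch: the function name report_versions and the default package list are illustrative choices, not part of the repository, and it avoids f-strings so that it also runs on Python 3.5.

```python
import importlib
import platform


def report_versions(packages=("torch", "gym", "numpy")):
    """Collect the Python version and the versions of the given packages.

    The default package list is an assumption about what matters for the
    notebooks; pass a different tuple to check other packages.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is missing from the active environment.
            versions[name] = "not installed"
        else:
            # Not every package exposes __version__, so fall back gracefully.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in sorted(report_versions().items()):
        print("{}: {}".format(name, version))
```

Running it inside the same conda environment (or notebook kernel) that runs the notebook matters, since a different environment may report different versions.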

janscholten commented 5 years ago

For those who care, this is my current setup:

Linux 18.04
Python 3.5.2 Continuum Analytics, Inc.
pytorch 0.1.9

Furthermore, I am using conda with the following packages (conda list output):

packages in environment at /home/jan/anaconda3/envs/rllab3:

Name Version Build Channel
absl-py 0.2.2
astor 0.6.2
atari-py 0.1.1
atomicwrites 1.1.5
attrs 18.1.0
awscli 1.15.49
backcall 0.1.0 py35_0
bleach 2.1.3 py35_0
boto3 1.7.48
botocore 1.10.48
ca-certificates 2018.03.07 0
cached-property 1.4.3
certifi 2018.4.16 py35_0
cffi 1.11.5 py35h9745a5d_0
chainer 1.18.0
chardet 3.0.4
click 6.6 py35_0 jjhelmus
cloudpickle 0.5.3
colorama 0.3.9
coverage 4.5.1
cycler 0.10.0 py35hc4d5149_0
Cython 0.28.3
dbus 1.13.2 h714fa37_1
decorator 4.3.0 py35_0
docutils 0.14
entrypoints 0.2.3 py35h48174a2_2
enum34 1.1.6
expat 2.2.5 he0dffb1_0
filelock 3.0.4
flask 1.0.2 py35_1
fontconfig 2.12.1 3
freetype 2.5.5 2
future 0.16.0
gast 0.2.0
glib 2.56.1 h000015b_0
gmp 6.1.2 h6c8ec71_1
grpcio 1.13.0
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 hb453b48_1
gym 0.7.4
h5py 2.8.0 py35ha1f6525_0
hdf5 1.10.2 hba1933b_1
html5lib 1.0.1 py35h2f9c1c0_0
hyperopt 0.1
icu 54.1 0
idna 2.7
intel-openmp 2018.0.3 0
ipdb 0.11
ipykernel 4.8.2 py35_0
ipython 6.4.0 py35_0
ipython_genutils 0.2.0 py35hc9e07d0_0
ipywidgets 5.1.5 py35_0 menpo
itsdangerous 0.24 py35h7c46880_1
jedi 0.12.0 py35_1
jinja2 2.10 py35h480ab6d_0
jmespath 0.9.3
joblib 0.10.3 py35_0 menpo
jpeg 9b h024ee3a_2
jsonschema 2.6.0 py35h4395190_0
jupyter 1.0.0
jupyter-console 5.2.0
jupyter_client 5.2.3 py35_0
jupyter_core 4.4.0 py35ha89e94b_0
Keras 1.2.1
Lasagne 0.2.dev1
libffi 3.2.1 hd88cf55_4
libgcc 7.2.0 h69d50b8_2
libgcc-ng 7.2.0 hdf63c60_3
libgfortran 3.0.0 1
libgfortran-ng 7.2.0 hdf63c60_3
libiconv 1.14 0
libpng 1.6.34 hb9fc6fc_0
libsodium 1.0.16 h1bed415_0
libstdcxx-ng 7.2.0 hdf63c60_3
libtiff 4.0.9 he85c1e1_1
libxcb 1.13 h1bed415_1
libxml2 2.9.4 0
line-profiler 2.1.2
llvmlite 0.23.2 py35hdbcaa40_0
mako 1.0.7 py35h69899ea_0
Markdown 2.6.11
markupsafe 1.0 py35h4f4fcf6_1
matplotlib 2.0.2 np112py35_0
mistune 0.8.3 py35h14c3975_1
mkl 2017.0.4 h4c4d0af_0
mock 2.0.0
more-itertools 4.2.0
mpi4py 2.0.0 py35_2
mpich2 1.4.1p1 0
msgpack-python 0.5.6
mujoco-py 0.5.7
nbconvert 5.3.1 py35hc5194e3_0
nbformat 4.4.0 py35h12e6e07_0
networkx 2.1
nibabel 2.1.0
nine 1.0.0
nose 1.3.7
nose2 0.7.4
notebook 5.5.0 py35_0
numba 0.38.1 py35h04863e7_0
numpy 1.12.0 py35_0
numpy-stl 2.2.0
olefile 0.45.1 py35_0
opencv3 3.1.0 py35_0 menpo
openssl 1.0.2o h20670df_0
pandas 0.23.1 py35h637b7d7_0
pandoc 2.2.1 h629c226_0
pandocfilters 1.4.2 py35h1565a15_1
parso 0.2.1 py35_0
path.py 11.0.1 py35_0
pbr 4.0.4
pcre 8.42 h439df22_0
pexpect 4.6.0 py35_0
pickleshare 0.7.4 py35hd57304d_0
pillow 4.2.1 py35_0
pip 10.0.1 py35_0
plotly 1.9.6
pluggy 0.6.0
polling 0.3.0
prettytensor 0.6.2
progressbar2 3.38.0
prompt_toolkit 1.0.15 py35hc09de7a_0
protobuf 3.6.0
ptyprocess 0.6.0 py35_0
py 1.5.4
pyasn1 0.4.3
pybox2d 2.3.1post2 py35_0 kne
pycparser 2.18 py35h61b3040_1
pygame 1.9.2a0 py35_0 kne
pyglet 1.3.2
pygments 2.2.0 py35h0f41973_0
pylru 1.0.9
pymongo 3.7.0
PyOpenGL 3.1.0
pyparsing 2.2.0 py35h041ed72_1
PyPrind 2.11.2
pyqt 5.6.0 py35h0e41ada_5
pytest 3.6.2
python 3.5.2 0
python-dateutil 2.7.3 py35_0
python-utils 2.3.0
pytorch 0.1.9 py35_2 soumith
pytz 2018.4 py35_0
PyYAML 3.12
pyzmq 17.0.0 py35h14c3975_0
qt 5.6.2 5
qtconsole 4.3.1
readline 6.2 2
redis 2.10.6
requests 2.19.1
rsa 3.4.2
s3transfer 0.1.13
scikit-learn 0.19.0 np112py35_0
scipy 0.19.1 np112py35_0
send2trash 1.5.0 py35_0
setuptools 39.2.0 py35_0
simplegeneric 0.8.1 py35_2
sip 4.18.1 py35h9eaea60_2
six 1.11.0 py35h423b573_1
sqlite 3.13.0 0
tensorboard 1.9.0
tensorflow 1.9.0rc1
tensorflow-gpu 1.0.1
termcolor 1.1.0
terminado 0.8.1 py35_1
testpath 0.3.1 py35had42eaf_0
Theano 0.9.0.dev1
tk 8.5.18 0
torchvision 0.1.6 py35_19 soumith
tornado 5.0.2 py35_0
tqdm 4.23.4
traitlets 4.3.2 py35ha522a97_0
urllib3 1.23
wcwidth 0.1.7 py35hcd08066_0
webencodings 0.5.1 py35hb6cf162_1
werkzeug 0.14.1 py35_0
wheel 0.31.1 py35_0
widgetsnbextension 1.2.3 py35_1 menpo
xz 5.2.4 h14c3975_4
zeromq 4.2.5 h439df22_0
zlib 1.2.11 ha838bed_2

janscholten commented 5 years ago

Solution found:

Switching to Python 3.6 and pytorch 0.4.0 helps! Full conda list output:

packages in environment at /home/jan/anaconda3/envs/quan:

Name Version Build Channel
atari-py 0.1.1
backcall 0.1.0 py36_0
blas 1.0 mkl
bleach 2.1.3 py36_0
Box2D-kengz 2.3.3
ca-certificates 2018.03.07 0
certifi 2018.4.16 py36_0
cffi 1.11.5 py36h9745a5d_0
cffi 1.11.5
chardet 3.0.4
cudatoolkit 9.0 h13b8566_0
cudnn 7.1.2 cuda9.0_0
cycler 0.10.0 py36h93f1223_0
Cython 0.28.3
dbus 1.13.2 h714fa37_1
decorator 4.3.0 py36_0
entrypoints 0.2.3 py36h1aec115_2
expat 2.2.5 he0dffb1_0
fontconfig 2.12.6 h49f89f6_0
freetype 2.8 hab7d2ae_1
future 0.16.0
glfw 1.6.0
glib 2.56.1 h000015b_0
gmp 6.1.2 h6c8ec71_1
gst-plugins-base 1.14.0 hbbd80ab_1
gstreamer 1.14.0 hb453b48_1
gym 0.10.5
html5lib 1.0.1 py36h2f9c1c0_0
icu 58.2 h9c2bf20_1
idna 2.7
imageio 2.3.0
intel-openmp 2018.0.3 0
ipykernel 4.8.2 py36_0
ipython 6.4.0 py36_0
ipython_genutils 0.2.0 py36hb52b0d5_0
jedi 0.12.0 py36_1
jinja2 2.10 py36ha16c418_0
jpeg 9b h024ee3a_2
jsonschema 2.6.0 py36h006f8b5_0
jupyter_client 5.2.3 py36_0
jupyter_core 4.4.0 py36h7c827e3_0
kiwisolver 1.0.1 py36h764f252_0
libedit 3.1.20170329 h6b74fdf_2
libffi 3.2.1 hd88cf55_4
libgcc-ng 7.2.0 hdf63c60_3
libgfortran-ng 7.2.0 hdf63c60_3
libpng 1.6.34 hb9fc6fc_0
libsodium 1.0.16 h1bed415_0
libstdcxx-ng 7.2.0 hdf63c60_3
libxcb 1.13 h1bed415_1
libxml2 2.9.8 h26e45fe_1
markupsafe 1.0 py36hd9260cd_1
matplotlib 2.2.2 py36h0e671d2_1
mistune 0.8.3 py36h14c3975_1
mkl 2018.0.3 1
mkl_fft 1.0.1 py36h3010b51_0
mkl_random 1.0.1 py36h629b387_0
mujoco-py 1.50.1.56
nbconvert 5.3.1 py36hb41ffb7_0
nbformat 4.4.0 py36h31c9010_0
nccl 1.3.5 cuda9.0_0
ncurses 6.1 hf484d3e_0
ninja 1.8.2 py36h6bb024c_1
notebook 5.5.0 py36_0
numpy 1.14.5
numpy 1.14.5 py36hcd700cb_3
numpy-base 1.14.5 py36hdbf6ddf_3
openssl 1.0.2o h20670df_0
pandoc 2.2.1 h629c226_0
pandocfilters 1.4.2 py36ha6701b7_1
parso 0.2.1 py36_0
pcre 8.42 h439df22_0
pexpect 4.6.0 py36_0
pickleshare 0.7.4 py36h63277f8_0
Pillow 5.1.0
pip 10.0.1 py36_0
prompt_toolkit 1.0.15 py36h17d85b1_0
ptyprocess 0.6.0 py36_0
pycparser 2.18 py36hf9f622e_1
pycparser 2.18
pyglet 1.3.2
pygments 2.2.0 py36h0d3125c_0
PyOpenGL 3.1.0
pyparsing 2.2.0 py36hee85983_1
pyqt 5.9.2 py36h751905a_0
python 3.6.5 hc3d631a_2
python-dateutil 2.7.3 py36_0
pytorch 0.4.0 py36hdf912b8_0
pytz 2018.5 py36_0
pyzmq 17.0.0 py36h14c3975_0
qt 5.9.5 h7e424d6_0
readline 7.0 ha6073c6_4
requests 2.19.1
send2trash 1.5.0 py36_0
setuptools 39.2.0 py36_0
simplegeneric 0.8.1 py36_2
sip 4.19.8 py36hf484d3e_0
six 1.11.0 py36h372c433_1
six 1.11.0
sqlite 3.23.1 he433501_0
terminado 0.8.1 py36_1
testpath 0.3.1 py36h8cadb63_0
tk 8.6.7 hc745277_3
tornado 5.0.2 py36_0
traitlets 4.3.2 py36h674d592_0
urllib3 1.23
wcwidth 0.1.7 py36hdf4376a_0
webencodings 0.5.1 py36h800622e_1
wheel 0.31.1 py36_0
xz 5.2.4 h14c3975_4
zeromq 4.2.5 h439df22_0
zlib 1.2.11 ha838bed_2
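
A guard like the one below, placed at the top of a notebook, might surface this kind of version mismatch before training starts. It is a sketch under one assumption: that pytorch 0.4.0 is the minimum working version, which this thread suggests (0.1.9 failed, 0.4.0 worked) but does not prove. The helper names version_tuple and meets_minimum are hypothetical and not part of the repository.

```python
import re


def version_tuple(version):
    """Turn '0.4.0' or '1.13.1+cu117' into a tuple of ints such as (0, 4, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(version))
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = version_tuple(installed), version_tuple(minimum)
    # Pad the shorter tuple with zeros so that '1.0' compares equal to '1.0.0'.
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


# The two pytorch versions reported in this thread:
print(meets_minimum("0.1.9", "0.4.0"))
print(meets_minimum("0.4.0", "0.4.0"))
```

In a notebook, the installed version would come from torch.__version__, for example: assert meets_minimum(torch.__version__, "0.4.0").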