mrahtz / learning-from-human-preferences

Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"
MIT License

Extra instructions for Ubuntu #4

Open eggsyntax opened 5 years ago

eggsyntax commented 5 years ago

Hi @mrahtz , thanks for doing this repo! I thought it might be useful to you or others to pass along some extra stuff I had to do to get this running on a fresh Ubuntu 18.04 install. Feel free to delete/close if it's not of use.

https://gist.github.com/eggsyntax/81a511adee360b811dc025508dea4f4a

Here's a paste of what's in the gist:

sudo apt install git
sudo apt install vim
mkdir bin
# Add /home/egg/bin and /home/egg/.local/bin to PATH
vim .bashrc
ln -s /usr/bin/python3 /home/egg/bin/python
python --version
sudo apt install python-pip
pip install virtualenv
pip install --user pipenv
pipenv --version
sudo apt install python3-distutils
sudo apt install cmake
sudo apt install zlib1g-dev
sudo apt install curl
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash

# clone repo and enter its dir, then
pipenv run pip install tensorflow
pipenv install
pipenv shell

# had tried with tensorflow-gpu, using:
sudo apt install libcublas9.1
pipenv run pip install tensorflow-gpu
# but it fails because it needs libcublas10.x, which doesn't seem to be available for Ubuntu 18.04 (as of 2019-03-17). Per docs, could either change tensorflow version or somehow get/build libcublas10.x. For now I'm just punting to using non-gpu tensorflow.

eggsyntax commented 5 years ago

Caveat: I haven't gone further than getting it to run without crashing, so there may turn out to be further steps necessary to get it fully working.

ForrestTrepte commented 3 years ago

Thanks for the instructions, @eggsyntax! I was getting errors such as "Couldn't install package: atari-py". I needed to run sudo apt-get update first, and then the sudo apt install commands you listed.

This got me further, but now I'm having errors with matplotlib:

[pipenv.exceptions.InstallError]: ERROR: Could not find a version that satisfies the requirement matplotlib==2.2.2 (from versions: 0.86, 0.86.1, 0.86.2, 0.91.0, 0.91.1, 1.0.1, 1.1.0, 1.1.1, 1.2.0, 1.2.1, 1.3.0, 1.3.1, 1.4.0, 1.4.1rc1, 1.4.1, 1.4.2, 1.4.3, 1.5.0, 1.5.1, 1.5.2, 1.5.3, 2.0.0b1, 2.0.0b2, 2.0.0b3, 2.0.0b4, 2.0.0rc1, 2.0.0rc2, 2.0.0, 2.0.1, 2.0.2, 2.1.0rc1, 2.1.0, 2.1.1, 2.1.2, 2.2.0rc1, 2.2.0, 2.2.2, 2.2.3, 2.2.4, 2.2.5, 3.0.0rc2, 3.0.0, 3.0.1, 3.0.2, 3.0.3, 3.1.0rc1, 3.1.0rc2, 3.1.0, 3.1.1, 3.1.2, 3.1.3, 3.2.0rc1, 3.2.0rc3, 3.2.0, 3.2.1, 3.2.2, 3.3.0rc1, 3.3.0, 3.3.1, 3.3.2, 3.3.3, 3.3.4, 3.4.0rc1, 3.4.0rc2, 3.4.0rc3, 3.4.0, 3.4.1, 3.4.2)
[pipenv.exceptions.InstallError]: ERROR: No matching distribution found for matplotlib==2.2.2
ERROR: Couldn't install package: matplotlib

Any ideas what would cause that?

ForrestTrepte commented 3 years ago

Note that in the above instructions egg must be replaced with your username. Also, you need to log back in again after editing .bashrc to pick up the path changes.
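
As a quick way to confirm the PATH change took effect after logging back in, here is a minimal sketch. The helper name missing_path_entries is hypothetical, and it assumes the two directories from the instructions above (bin and .local/bin under your own home directory, rather than /home/egg):

```python
import os
from pathlib import Path


def missing_path_entries(path_value, wanted):
    """Return the directories in `wanted` that are absent from a PATH-style string."""
    present = {entry.rstrip("/") for entry in path_value.split(os.pathsep) if entry}
    return [d for d in wanted if d.rstrip("/") not in present]


if __name__ == "__main__":
    # Path.home() resolves to the current user's home directory,
    # so no username needs to be hard-coded.
    home = Path.home()
    wanted = [str(home / "bin"), str(home / ".local" / "bin")]
    for directory in missing_path_entries(os.environ.get("PATH", ""), wanted):
        print("Not on PATH yet:", directory)
```

If it prints nothing, both directories are already on PATH in the current shell.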

ForrestTrepte commented 3 years ago

Using the above instructions I was successfully able to set up the virtual environment on a clean Ubuntu 18.04 VM. However, now when I try to run I am encountering a ModuleNotFoundError about numpy.core._multiarray_umath. Searching for the error, I found some info but I wasn't able to understand the issues well enough to latch onto anything that would solve my problem. Any ideas?

(learning-from-human-preferences) azureuser@preferences-test-vm:~/learning-from-human-preferences$ python3 run.py train_policy_with_original_rewards PongNoFrameskip-v4 --n_envs 16 --million_timesteps 10
2021-07-11 17:09:05.159104: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-07-11 17:09:05.159147: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
Traceback (most recent call last):
  File "run.py", line 11, in <module>
    import easy_tf_log
  File "/home/azureuser/.local/share/virtualenvs/learning-from-human-preferences-iMJwDyH4/lib/python3.6/site-packages/easy_tf_log.py", line 5, in <module>
    import tensorflow as tf
  File "/home/azureuser/.local/share/virtualenvs/learning-from-human-preferences-iMJwDyH4/lib/python3.6/site-packages/tensorflow/__init__.py", line 41, in <module>
    from tensorflow.python.tools import module_util as _module_util
  File "/home/azureuser/.local/share/virtualenvs/learning-from-human-preferences-iMJwDyH4/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 40, in <module>
    from tensorflow.python.eager import context
  File "/home/azureuser/.local/share/virtualenvs/learning-from-human-preferences-iMJwDyH4/lib/python3.6/site-packages/tensorflow/python/eager/context.py", line 37, in <module>
    from tensorflow.python.client import pywrap_tf_session
  File "/home/azureuser/.local/share/virtualenvs/learning-from-human-preferences-iMJwDyH4/lib/python3.6/site-packages/tensorflow/python/client/pywrap_tf_session.py", line 23, in <module>
    from tensorflow.python._pywrap_tf_session import *
ImportError: SystemError: <built-in method __contains__ of dict object at 0x7f1fcb3b7120> returned a result with an error set

egg-at-reify commented 3 years ago

Wish I could be of help! It's been over two years since I played with it. That sounds like your numpy either isn't installed or isn't up-to-date. Have you tried doing pip install numpy --upgrade as the linked answer suggests?
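
One way to see which numpy (if any) the virtual environment would actually import is to ask Python where it would load the module from. This is only a diagnostic sketch; the helper name module_origin is hypothetical, and it should be run with the virtualenv's own interpreter (for example via pipenv run python):

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be imported from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or fails to import.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # The second name is the module reported missing in the traceback above.
    for name in ("numpy", "numpy.core._multiarray_umath"):
        print(name, "->", module_origin(name))
```

A result of None for the second name while the first resolves would suggest a numpy install that is incomplete or mismatched for this interpreter.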

ForrestTrepte commented 3 years ago

I did try that, but an error occurred while running the command. (Whoops, I should have saved the output.) Now I get a similar but different error when running: module 'tensorflow.python.pywrap_tensorflow' has no attribute 'EventsWriter'.

azureuser@preferences-test-vm:~/learning-from-human-preferences$  . /home/azureuser/.local/share/virtualenvs/learning-from-human-preferences-iMJwDyH4/bin/activate
(learning-from-human-preferences) azureuser@preferences-test-vm:~/learning-from-human-preferences$ pip install numpy --upgrade
Requirement already satisfied: numpy in /home/azureuser/.local/share/virtualenvs/learning-from-human-preferences-iMJwDyH4/lib/python3.6/site-packages (1.19.5)
(learning-from-human-preferences) azureuser@preferences-test-vm:~/learning-from-human-preferences$ python3 run.py train_policy_with_original_rewards PongNoFrameskip-v4 --n_envs 16 --million_timesteps 10
2021-07-12 16:42:05.404052: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-07-12 16:42:05.404111: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "run.py", line 394, in <module>
    main()
  File "run.py", line 37, in main
    rew_pred_training_params)
  File "run.py", line 135, in run
    a2c_params=a2c_params)
  File "run.py", line 253, in start_policy_training
    configure_a2c_logger(log_dir)
  File "run.py", line 215, in configure_a2c_logger
    tb = logger.TensorBoardOutputFormat(a2c_dir)
  File "/home/azureuser/learning-from-human-preferences/a2c/logger.py", line 108, in __init__
    self.writer = pywrap_tensorflow.EventsWriter(compat.as_bytes(path))
AttributeError: module 'tensorflow.python.pywrap_tensorflow' has no attribute 'EventsWriter'

eggsyntax commented 3 years ago

If you got an error, chances are it didn't actually (or didn't fully) update. You might want to try running that again and post the error you get, happy to check whether it rings any bells for me :)

mrahtz commented 3 years ago

@ForrestTrepte Do you definitely have TensorFlow 1 installed, rather than TensorFlow 2?

ForrestTrepte commented 3 years ago

It appears that I have TensorFlow 2:

>>> print(tf.__version__)
2.5.0

Sorry, I'm not very experienced with Python environment configuration. Perhaps the repository was originally developed with TensorFlow 1, and since then TensorFlow has moved on to version 2, which isn't compatible, so the step above (pipenv run pip install tensorflow) now installs TensorFlow 2.
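
The same check can be scripted so it is easy to rerun inside the virtual environment. A minimal sketch, where tensorflow_major is a hypothetical helper and the version strings are assumed to start with a numeric major version, as 2.5.0 does above:

```python
def tensorflow_major(version):
    """Return the major version number from a string such as '1.15.0' or '2.5.0'."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow could not be imported in this environment")
    else:
        major = tensorflow_major(tf.__version__)
        label = "TensorFlow 1.x" if major == 1 else "not TensorFlow 1.x"
        print(tf.__version__, "->", label)
```

Running it with pipenv run python checks the interpreter that run.py will actually use.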

ForrestTrepte commented 3 years ago

Doing this seems to have solved my problem. Thanks, @mrahtz!

pipenv run pip uninstall tensorflow
pipenv run pip install tensorflow==1.15

jgocm commented 3 years ago

Hi @ForrestTrepte, after these steps were you able to run any of the examples from the repo?

ForrestTrepte commented 3 years ago

I was able to run python3 run.py train_policy_with_original_rewards PongNoFrameskip-v4 --n_envs 16 --million_timesteps 10. It seems to be working:

Saved policy checkpoint to 'runs/1626371063_002ac98/policy_checkpoints/policy.ckpt-13200'
Trained policy for 1064000 time steps
Saved policy checkpoint to 'runs/1626371063_002ac98/policy_checkpoints/policy.ckpt-13300'
Trained policy for 1072000 time steps
Saved policy checkpoint to 'runs/1626371063_002ac98/policy_checkpoints/policy.ckpt-13400'
Trained policy for 1080000 time steps
Saved policy checkpoint to 'runs/1626371063_002ac98/policy_checkpoints/policy.ckpt-13500'

But I'm new to learning-from-human-preferences so I don't yet understand what I'm doing. That's the next step now that I seem to have the setup working.

jgocm commented 3 years ago

Could you please share your pip list and which specific python version you are using?

I am new to it as well and did all the setup from the repository, but when I try to run the examples it gives me errors, so I wanted to try using same package versions as someone who gets it to work.

ForrestTrepte commented 3 years ago

Sure. I'm attaching a file containing the pip list as well as the output from running run.py. I do see a lot of warnings when running, but based on the "Trained policy for xx000 time steps" output I think it is working, although I haven't actually dug into what it's doing yet. (I am learning about this cool human preferences project as a hobby on the side and only have a little time for it here and there.) pip-list-and-run.txt

ForrestTrepte commented 3 years ago

There is also the Pipfile.lock file in the repository. I think this pins the package versions used by the original author, although I don't know the details of how that works in pipenv.

mrahtz commented 3 years ago

@ForrestTrepte Oh snap, I'd forgotten about the TensorFlow instructions in the README! I've fixed this now, thanks!

And indeed, pipenv should install the specific package versions from Pipfile.lock when you run pipenv install.
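
To see what a lock file pins without installing anything, the JSON can be read directly. This is a rough sketch: pinned_versions is a hypothetical helper, and the sample below is an invented, heavily trimmed lock file (only the matplotlib==2.2.2 pin is taken from the error message earlier in this thread):

```python
import json


def pinned_versions(lock_text):
    """Map package name -> pinned version specifier from Pipfile.lock contents."""
    lock = json.loads(lock_text)
    pins = {}
    for section in ("default", "develop"):
        for name, entry in lock.get(section, {}).items():
            # Entries installed from git or a local path carry no "version" key.
            if "version" in entry:
                pins[name] = entry["version"]
    return pins


# Hypothetical, trimmed-down lock file used only for illustration.
SAMPLE_LOCK = """
{
  "_meta": {"requires": {"python_version": "3"}},
  "default": {"matplotlib": {"version": "==2.2.2"}},
  "develop": {}
}
"""

print(pinned_versions(SAMPLE_LOCK))
```

For the real file, pass the contents of Pipfile.lock from the repository root instead of SAMPLE_LOCK.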

jgocm commented 3 years ago

Thanks for the help! We are finally able to run it on Ubuntu.

In addition to the TensorFlow instructions, I think it might be helpful to point out that tensorflow<=1.15 requires python<3.8, while the Pipfile/Pipfile.lock only requires python==3.X.X. This can cause problems if you are using a Python version that is not compatible with TensorFlow 1.

This PyPI release page shows the compatible Python versions for the latest TensorFlow 1 release.
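
A small guard along these lines could catch the mismatch before pipenv install is run. It is a sketch only: python_ok_for_tf1 is a hypothetical helper, the python<3.8 bound is the one stated above, and restricting it to Python 3 is an added assumption based on the python3 commands used in this thread:

```python
import sys


def python_ok_for_tf1(version_info):
    """True if (major, minor) is a Python 3 release older than 3.8."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor < 8


if __name__ == "__main__":
    current = "%d.%d" % (sys.version_info[0], sys.version_info[1])
    if python_ok_for_tf1(sys.version_info):
        print("Python", current, "is below 3.8")
    else:
        print("Python", current, "is outside the range stated for TensorFlow 1")
```

The function accepts any (major, minor, ...) sequence, so it works with sys.version_info directly.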

mrahtz commented 2 years ago

I've added this to the README. Thanks!