Hi @yarncraft, this is really cool!
Looks like all the python tests are failing, so it's probably the case that the Python API (pybind11) is not building properly.
We've seen this a few times. It's usually been one of these:
python3-dev is not installed.
I've never used Docker. Can you tell me what OS you are using and its version?
Haha, clicked on literally the only link in your message and saw Ubuntu 18.04. Perfect. It's been fairly straightforward to install OpenSpiel on Ubuntu 18.04. My guess is that you don't have python3-dev installed. Did you run install.sh to get all the dependencies?
LOL I just saw now the second line of your Dockerfile installs python3-dev. Looking at the file, seems fine.
Hmm... do you know if you can import pyspiel from python3 in the Docker instance? Can you try the manual build and instead of running tests, do:
python3
Python 3.7.6 (default, Feb 4 2020, 17:04:58)
[Clang 11.0.0 (clang-1100.0.33.16)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyspiel
>>>
Does this cause an error?
Before doing this, you'll need to set the PYTHONPATH environment variables, see Step 4 of https://github.com/deepmind/open_spiel/blob/master/docs/install.md
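To tell "pyspiel was never built" apart from "pyspiel is built but not on the path", a small check like the one below could be run inside the container. This is only a sketch: the helper name module_status is made up for illustration, and it relies solely on the standard library's importlib.util.find_spec, which looks a module up on sys.path without importing it.

```python
import importlib.util
import sys


def module_status(name):
    """Return (found, location) for a top-level module without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return False, None
    return True, spec.origin


if __name__ == "__main__":
    found, location = module_status("pyspiel")
    if found:
        print(f"pyspiel found at {location}")
    else:
        # Listing the search path shows whether the build directory is missing.
        print("pyspiel not found; directories searched:")
        for entry in sys.path:
            print(f"  {entry}")
```

If the module is reported as missing and the build directory is absent from the printed list, the problem is more likely the path than the build itself.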
According to step 4 in the docs I read the following:
To be able to import the Python code (both the C++ binding pyspiel and the rest) from any location, you will need to add to your PYTHONPATH the root directory and the openspiel directory.
I think this is not needed when you put the framework in the container as you would use Docker containers exactly to avoid undertaking such steps. (Since you would just run your python scripts in the containerized environment instead).
However there is a note in step 2 stating the following
Install pip deps as your user. Do not use the system's pip.
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3 get-pip.py --user
pip3 install --upgrade pip --user
pip3 install --upgrade setuptools testresources --user
So maybe I can try to install pip in this way and check if this resolves the problem. Anyway it seems quite unlikely :)
Right, but currently the Python tests are failing within the environment and when you're using build_and_run_tests.sh, which sets up the Python paths for you as necessary. I was asking you to try import pyspiel from within the container to see if it was built properly. If the tests are failing, something is going wrong in the container, so we have to diagnose it. I mainly want to know whether the problem is how we're running the Python tests from CMake or whether you can't load pyspiel at all (from within the container).
I've never used Docker so I have no idea how to debug these things.
The note in step 2 about installing the pip deps as your user is mostly a recommendation so that users don't mess up their system pip configuration. I doubt this will be it, but it's one more thing to try.
OK, I'll check whether I can run a Python script that imports pyspiel in my containerized environment!
Note that not all tests fail (in fact 57% pass). However, the tests that do fail seem to be quite random, so it is indeed hard to debug what exactly goes wrong. I'll do a full report on what could be the cause sometime in the next 24 hours.
Right, looks like the C++ tests are passing, so OpenSpiel is building and running properly. From your screenshot it seems to be only the Python tests, and we've seen this before, so I think it's just that the Python API is not getting built or linked properly (or the CMake tests are not building the Python tests properly).
Thanks for taking the initiative to do this! I needed something similar a few months ago. I am no docker expert, so this is certainly not optimal, but the following worked for me. This was part of a larger effort; I've removed some things I know to be irrelevant, but other unneeded items may still be there.
FROM ubuntu:20.04
RUN apt update
RUN dpkg --add-architecture i386 && apt update
RUN apt-get -y install \
clang \
curl \
cmake \
git \
python3 \
python3-dev \
python3-pip \
python3-setuptools \
python3-wheel \
sudo
RUN git clone -b 'master' --single-branch --depth 15 https://github.com/deepmind/open_spiel.git open_spiel
WORKDIR open_spiel
RUN ./install.sh
RUN mkdir -p build && \
cd build && \
cmake -DPython_TARGET_VERSION=${PYVERSION} -DCMAKE_CXX_COMPILER=`which clang++` ../open_spiel && \
make -j4
RUN pip3 install absl-py scipy
COPY . build
CMD /open_spiel/build/run.sh
@elkhrt Thanks for sharing! To make your Dockerfile future-proof you might consider adding a Python upgrade step. I notice that you don't use a virtualenv; that might indeed not be needed when you containerize.
I will give this Dockerfile a try in a minute!
@elkhrt It seems like the Dockerfile is working correctly for the TicTacToe example. It does not run the tests explicitly, however. Are they checked internally when the last CMD executes, or is that something that needs to be added as well?
Yours is far better / future proof / general. And actually runs the tests, unlike mine. I was just sharing it in case it helped you find what was missing in yours.
Ok I will make sure to enhance the Dockerfile with some extra obligatory steps. I think we can close this issue, I will come back with an update once I get everything up and running with the enhanced version!
Thanks @lanctot @elkhrt for the help!
56% tests passed, 58 tests failed out of 133
Total Test time (real) = 269.80 sec
The following tests FAILED:
66 - python_api_test (Failed)
67 - python_playthrough_test (Failed)
68 - python_action_value_test (Failed)
69 - python_action_value_vs_best_response_test (Failed)
70 - python_best_response_test (Failed)
71 - python_cfr_br_test (Failed)
72 - python_cfr_test (Failed)
73 - python_deep_cfr_test (Failed)
74 - python_discounted_cfr_test (Failed)
75 - python_dqn_test (Failed)
76 - python_eva_test (Failed)
77 - python_evaluate_bots_test (Failed)
78 - python_expected_game_score_test (Failed)
79 - python_exploitability_descent_test (Failed)
80 - python_exploitability_test (Failed)
81 - python_fictitious_play_test (Failed)
82 - python_generate_playthrough_test (Failed)
83 - python_get_all_states_test (Failed)
84 - python_rl_losses_test (Failed)
85 - python_lp_solver_test (Failed)
86 - python_masked_softmax_test (Failed)
87 - python_mcts_test (Failed)
88 - python_minimax_test (Failed)
89 - python_neurd_test (Failed)
90 - python_nfsp_test (Failed)
91 - python_outcome_sampling_mccfr_test (Failed)
92 - python_policy_aggregator_joint_test (Failed)
93 - python_policy_aggregator_test (Failed)
94 - python_policy_gradient_test (Failed)
95 - python_projected_replicator_dynamics_test (Failed)
96 - python_generalized_psro_test (Failed)
97 - python_rectified_nash_response_test (Failed)
98 - python_random_agent_test (Failed)
99 - python_rcfr_test (Failed)
100 - python_sequence_form_lp_test (Failed)
101 - python_value_iteration_test (Failed)
102 - python_bluechip_bridge_uncontested_bidding_test (Failed)
103 - python_uniform_random_test (Failed)
104 - python_alpharank_test (Failed)
105 - python_alpharank_visualizer_test (Failed)
106 - python_dynamics_test (Failed)
107 - python_heuristic_payoff_table_test (Failed)
108 - python_utils_test (Failed)
109 - python_visualization_test (Failed)
110 - python_catch_test (Failed)
111 - python_cliff_walking_test (Failed)
112 - python_data_test (Failed)
113 - python_tic_tac_toe_test (Failed)
114 - python_bot_test (Failed)
115 - python_games_sim_test (Failed)
116 - python_matrix_game_utils_test (Failed)
117 - python_policy_test (Failed)
118 - python_pyspiel_test (Failed)
119 - python_rl_environment_test (Failed)
120 - python_tensor_game_utils_test (Failed)
121 - python_file_logger_test (Failed)
122 - python_lru_cache_test (Failed)
123 - python_examples_bridge_supervised_learning (Failed)
Errors while running CTest
The command '/bin/sh -c mkdir -p build && cd build && cmake -DPython_TARGET_VERSION=${PYVERSION} -DCMAKE_CXX_COMPILER=`which clang++` ../open_spiel && make -j4 && ctest -j4'
@elkhrt, I added tests by running the ctest -j4 command after the build, as explained in the installation docs. It seems like your Dockerfile runs into the same trouble! So the Python build fails both through pip and through a manual compilation, it seems.
So the dockerfile I'm using now is
FROM ubuntu:20.04
RUN apt update
RUN dpkg --add-architecture i386 && apt update
RUN apt-get -y install \
clang \
curl \
cmake \
git \
python3 \
python3-dev \
python3-pip \
python3-setuptools \
python3-wheel \
sudo
# clone repository and install
RUN git clone -b 'master' --single-branch --depth 15 https://github.com/deepmind/open_spiel.git open_spiel
WORKDIR open_spiel
RUN ./install.sh
# build and test
RUN mkdir -p build && \
cd build && \
cmake -DPython_TARGET_VERSION=${PYVERSION} -DCMAKE_CXX_COMPILER=`which clang++` ../open_spiel && \
make -j4 && \
ctest -j4
COPY . build
WORKDIR /open_spiel/build
CMD run.sh
When skipping the test command, you can indeed run the examples. However, as explained above, this Dockerfile approach also experiences the same troubles as did mine when it comes down to the Python tests.
PYVERSION is something our script defines here: https://github.com/deepmind/open_spiel/blob/b19852be38e65de2db20dc9be6659e522a72e83d/open_spiel/scripts/build_and_run_tests.sh#L89. It's not defined by default.
So you either need to run the same command (in addition to the one that defines PYBIN) before running the test or hard-code the version number. (Ubuntu 18.04 comes with Python 3.6, but I see you do an upgrade so you might have 3.7; you can find out which version you have using dpkg --list | grep python or from within the Python interpreter using import sys; print(sys.version).)
Edit: noticed you moved to Ubuntu 20.04 which has Python version 3.8.
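Because nothing in the Dockerfile sets PYVERSION, the shell expands ${PYVERSION} to an empty string in the cmake line. One way to get a value to pass instead is to ask the interpreter itself, as in the sketch below. The function name python_target_version is hypothetical, and whether CMake expects the full version or only major.minor is an assumption to check against the linked script, which remains the authoritative definition.

```python
import sys


def python_target_version(version_string=sys.version):
    """Return the leading version token of a sys.version-style string."""
    # sys.version starts with the version number, followed by build details.
    return version_string.split()[0]


if __name__ == "__main__":
    # Prints the running interpreter's version, for use as a cmake argument.
    print(python_target_version())
```

The printed value could then be supplied by hand, for example as -DPython_TARGET_VERSION followed by that version, instead of relying on an undefined variable.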
Also looks like your new file is missing the step that installs the required pip packages..?
If it helps, my go-to minimal manual install is the one on page 6 of the paper: https://arxiv.org/abs/1908.09453
I am now working with the Dockerfile provided by Lockhart. It uses Python 3.8.2 and pip 18.1.
Cool. But you still need to install OpenSpiel's python dependencies via pip3 install --upgrade -r requirements.txt
FROM ubuntu:20.04
RUN apt update
RUN dpkg --add-architecture i386 && apt update
RUN apt-get -y install \
clang \
curl \
cmake \
git \
python3 \
python3-dev \
python3-pip \
python3-setuptools \
python3-wheel \
sudo
# clone repository and install
RUN git clone -b 'master' --single-branch --depth 15 https://github.com/deepmind/open_spiel.git open_spiel
WORKDIR open_spiel
RUN ./install.sh
RUN pip3 install --upgrade pip
RUN pip3 install --upgrade setuptools testresources
RUN pip3 install --upgrade -r requirements.txt
# build and test
RUN mkdir -p build && \
cd build && \
cmake -DPython_TARGET_VERSION=${PYVERSION} -DCMAKE_CXX_COMPILER=`which clang++` ../open_spiel && \
make -j4 && \
ctest -j4
COPY . build
WORKDIR /open_spiel/build
CMD run.sh
results in
Step 10/14 : RUN pip3 install --upgrade -r requirements.txt
---> Running in 512b2a97f247
Requirement already satisfied: pip>=20.0.2 in /usr/local/lib/python3.8/dist-packages (from -r requirements.txt (line 2)) (20.0.2)
Collecting absl-py==0.9.0
Downloading absl-py-0.9.0.tar.gz (104 kB)
ERROR: Could not find a version that satisfies the requirement tensorflow<2.0,>=1.15.1 (from -r requirements.txt (line 4)) (from versions: none)
ERROR: No matching distribution found for tensorflow<2.0,>=1.15.1 (from -r requirements.txt (line 4))
The command '/bin/sh -c sudo pip3 install --upgrade -r requirements.txt' returned a non-zero code: 1
Ah man, not this again. This error was a pain to fix a few months back when we upgraded to TF 1.15. Maybe it is back to haunt us in Ubuntu 20.04.
It might not work with the system's pip. I wonder if the user-based pip is necessary now.
I will test an install independently on an Ubuntu 20.04 machine and report back. Might be a few days.
FROM ubuntu:18.04
RUN apt update
RUN dpkg --add-architecture i386 && apt update
RUN apt-get -y install \
clang \
curl \
git \
python3 \
python3-dev \
python3-pip \
python3-setuptools \
python3-wheel \
sudo
RUN sudo pip3 install --upgrade pip
RUN sudo pip3 install matplotlib
# clone repository and install
RUN git clone -b 'master' --single-branch --depth 15 https://github.com/deepmind/open_spiel.git open_spiel
WORKDIR open_spiel
RUN ./install.sh
RUN pip3 install --upgrade setuptools testresources
RUN pip3 install --upgrade -r requirements.txt
RUN pip3 install --upgrade cmake
# build and test
RUN mkdir -p build && \
cd build && \
cmake -DPython_TARGET_VERSION=${PYVERSION} -DCMAKE_CXX_COMPILER=`which clang++` ../open_spiel && \
make -j4 && \
ctest -j4
COPY . build
RUN python3 ./open_spiel/python/examples/matrix_game_example.py
WORKDIR /open_spiel/build
CMD run.sh
I managed to resolve the dependency issue by switching back to 18.04 and installing cmake through pip (since that downloads the latest version, unlike apt-get). I am now running the tests again and checking whether I can import pyspiel in a Python script.
OK, I expect that last python3 command to fail (and a manual import pyspiel from the interpreter too) unless you set the PYTHONPATH environment variables. The tests should be OK because the paths are set within the CMakeLists.txt, IIRC.
100% tests passed, 0 tests failed out of 133
Total Test time (real) = 1053.32 sec
OK so the tests are ok, the script indeed still fails, so I'm adding the PYTHONPATH and everything should work properly in that case
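For a one-off script, the same effect as setting PYTHONPATH can be had by extending sys.path before the import. The sketch below assumes the repository was cloned to /open_spiel and built in /open_spiel/build, as in the Dockerfiles in this thread; the helper add_import_paths is a made-up name for illustration.

```python
import os
import sys


def add_import_paths(paths):
    """Prepend existing directories to sys.path; return those that were added."""
    added = []
    for path in paths:
        # Skip directories that do not exist or are already searched.
        if os.path.isdir(path) and path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added


if __name__ == "__main__":
    # Assumed locations: repository root and the CMake build's python directory.
    added = add_import_paths(["/open_spiel", "/open_spiel/build/python"])
    print("added to sys.path:", added)
```

An empty result means neither directory exists at the assumed location, which is itself a useful hint that the build output ended up somewhere else.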
I haven't spent too much time going through the conversation above but we've successfully dockerized OpenSpiel and hence I'm pasting the relevant docker files here in case they are of any help.
1. ~800MB
2. pyspiel, image size ~120MB
You can see how the container from 2. is built within a container from 1. in the circleci config. Two docker files are used to obtain a much smaller final container. Also, the containers are based on debian:buster-slim, so I'm not sure whether this will be completely relevant for your problems; feel free to ignore it in that case. It's safe to ignore non-open-spiel related things within the project.
Ok thanks for sharing, I just got it working with the following setup:
FROM ubuntu:18.04
RUN apt update
RUN dpkg --add-architecture i386 && apt update
RUN apt-get -y install \
clang \
curl \
git \
python3 \
python3-dev \
python3-pip \
python3-setuptools \
python3-wheel \
sudo
RUN sudo pip3 install --upgrade pip
RUN sudo pip3 install matplotlib
# clone repository and install
RUN git clone -b 'master' --single-branch --depth 15 https://github.com/deepmind/open_spiel.git open_spiel
WORKDIR open_spiel
RUN ./install.sh
# install Python dependencies
RUN pip3 install --upgrade setuptools testresources
RUN pip3 install --upgrade -r requirements.txt
RUN pip3 install --upgrade cmake
# build and test
RUN mkdir -p build && \
cd build && \
cmake -DPython_TARGET_VERSION=${PYVERSION} -DCMAKE_CXX_COMPILER=`which clang++` ../open_spiel && \
make -j4 && \
ctest -j4
COPY . build
ENV PYTHONPATH=${PYTHONPATH}:/open_spiel/
ENV PYTHONPATH=${PYTHONPATH}:/open_spiel/build/python
WORKDIR /open_spiel/build
Amazing! Thanks for working on this.
Please submit a PR. Many people would benefit from this, so we should have it somewhere!
Indeed! I am working on it!
PR submitted! Being able to easily run Reinforcement Learning projects in the cloud will most certainly be beneficial!
@lanctot The PR still needs to be merged into master. Is there still a problem?
No problem, I just got busy with work yesterday so couldn't do it yet.
We have a weekly update cycle: we import the PR internally and then it gets merged in our weekly push back to github (on Mondays).
So I will import it in the next few days and it will be merged on March 9th.
I am wondering if you could try something for me though.
My server provider doesn't have Ubuntu 20.04 yet, but I would like us to fix the problem you faced before its release. I suspect that if we bump the TF requirement to 2.0 (in requirements.txt), it will work.
Is it easy for you to just test that out for me? Or share the entire Ubuntu 20.04 file too so I can try it?
Thanks for the update! I will check if the system still works on Ubuntu 20.04 with TF 2.0 as soon as possible.
I tried internally and the upgrade to TF 2.0 breaks ~10 tests so if that's the solution we'll have to update some code first. When I can get an Ubuntu 20.04 machine then I'll look for a shorter-term solution (unless we've fixed those tests by then).
OK, as soon as you've updated the code I will try to run it on Ubuntu 20.04. You should give Docker a try as well; it's very easy to install on whatever OS you're using! In that case you can just use the Dockerfile I provided and switch from 18.04 to 20.04. (If you can write shell scripts, you can write Dockerfiles.)
@lanctot, I'm glad I could contribute, it seems like we're about to have a presentation from you for our Machine Learning project at the University of Leuven.
Indeed, I am looking forward to it. I was wondering if you were in that class. Is that how you found out about OpenSpiel?
Yes, that is indeed how I found out about OpenSpiel. Many people in the course struggled with getting it up and running on various operating systems, so that's why I built a Dockerfile asap. I already mentioned the new way of installation on our forum.
Just an FYI. I tried installing openspiel using the docker container from the version 0.1.0 package. It came up with this error. It seems like this one failed test causes the docker container to not be created. Since it passes 99% of tests, I just deleted the testing part of the dockerfile and have successfully created the docker container.
I'm assuming that, since Docker is supposed to take care of dependencies and stuff (though it's my first time using it), this is not an issue on my end (I'm using the latest version of Docker).
99% tests passed, 1 tests failed out of 151
Total Test time (real) = 1504.72 sec
The following tests FAILED:
140 - python_examples_bridge_supervised_learning (Failed)
Errors while running CTest
The command '/bin/sh -c ctest -j12' returned a non-zero code: 8
Thanks, we will look into it. Re-opening as a reminder. Tagging @elkhrt @yarncraft just so they know.
If all the other tests pass then I guess it will still work. That is currently the only use of Jax, it is possible the dependencies did not work out.
This is indeed normal behavior and not an error on your side. The testing part can be commented out for faster builds; most of the framework will behave correctly given the 99% pass rate. @lanctot, is the Jax dependency not part of the OpenSpiel installation? If this error is due to a missing dependency I can quickly fix it by adding it to the container spec.
I've been working on putting OpenSpiel in a Docker container (since it always greatly boosts framework adoption). So far I can get the tests to run, but some tests are still prone to failure. Is there anyone else with some Docker experience who can spot the problem in the Dockerfile?
Dockerfile Reference: https://github.com/yarncraft/DockerizedOpenSpiel
Thanks in advance!