Open szhaovas opened 5 months ago
Hi @szhaovas,
have you tried replacing gui=True with gui=False in the environment initialization?
Hi @jr-b-reiterer,
Thank you for the reply. Yes, I replaced gui=True with gui=False.
The test.py file in the forked repo I shared above contains the test script I was running.
When I test with your image, the behaviour is different: I get past the lines from your screenshot, but then the reset fails. The warning from gym I get there suggests that your version of gym, 0.26, is too new. The current version of robo-gym is compatible with gym only up to 0.21 because of the gym API change. (An upgrade of robo-gym is in the works internally.)
Back to your observation: I am using Docker 20.10.21 on Ubuntu 20.04. I am not sure if any difference here could cause the problem. You could test whether it behaves differently when you run your test script not in a tmux pane but in a separate terminal attached to your running container:
docker exec -it <container name> bash
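To illustrate the API change mentioned above: in gym 0.26, reset() returns an (observation, info) pair and step() returns five values, while gym 0.21 returns a bare observation from reset() and four values from step(). The helper names below (unpack_reset, unpack_step) are hypothetical and not part of gym or robo-gym; this is only a sketch of the difference, not a way to make robo-gym work with gym 0.26.

```python
from typing import Any, Dict, Tuple


def unpack_reset(result: Any) -> Tuple[Any, Dict]:
    """Return (observation, info) for either reset() convention."""
    # gym 0.26 style: reset() returns a 2-tuple (observation, info_dict).
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0], result[1]
    # gym 0.21 style: reset() returns the observation only.
    return result, {}


def unpack_step(result: Tuple) -> Tuple[Any, float, bool, Dict]:
    """Return (observation, reward, done, info) for either step() convention."""
    if len(result) == 5:
        # gym 0.26 style: (obs, reward, terminated, truncated, info).
        obs, reward, terminated, truncated, info = result
        return obs, reward, bool(terminated or truncated), info
    # gym 0.21 style: (obs, reward, done, info).
    obs, reward, done, info = result
    return obs, reward, done, info
```

Code written against the 0.21 convention typically breaks at reset() under 0.26 because it receives a tuple where it expects an observation.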
Hi @jr-b-reiterer,
I downgraded gym to 0.21, and I am now getting the same error as you. I tried both running docker exec -it <container name> bash and running the test script in a tmux pane, and in both cases, I am no longer stuck at "Starting new robot server", but get an error at reset.
Do you know how I might fix the reset error? Thanks!
I am not sure it will fix your issue, but apparently your downgrade of gym was not successful. The passive env checker that outputs the warning in your screenshot does not exist in Gym v0.21, see https://github.com/openai/gym/blob/v0.21.0/gym/utils/passive_env_checker.py vs https://github.com/openai/gym/blob/0.26.0/gym/utils/passive_env_checker.py
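One way to confirm which gym version the interpreter inside the container actually sees is to query the package metadata. This is a minimal sketch: the function name gym_is_compatible is hypothetical, and the (0, 21) limit is taken from the compatibility statement earlier in this thread.

```python
from importlib import metadata


def gym_is_compatible(version: str, max_supported=(0, 21)) -> bool:
    """Check whether a version string is at or below the supported major.minor."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) <= max_supported


if __name__ == "__main__":
    try:
        # Reads the version of the installed distribution named "gym".
        installed = metadata.version("gym")
    except metadata.PackageNotFoundError:
        print("gym is not installed in this environment")
    else:
        print(f"gym {installed}, compatible: {gym_is_compatible(installed)}")
```

Running this with the same python3 that runs test.py rules out the case where the downgrade went into a different environment.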
Update: got it to work with 2 fixes!
--network host
- (Hacky. Please let me know if anyone has a cleaner solution) macOS didn't support host network mode, so instead I had to map ports specifically for the robot server and server manager. What worked for me was docker run --rm -it -p 47000-47100:47000-47100 -p 50100-50200:50100-50200 <image>.
- Within the container, find the 3 instances of find_free_port() within <robogym_server_modules>/server_manager/server.py (should be on L69, L75, L78), and give each of these a lower_bound and upper_bound within the range of mapped ports. Make sure the ranges don't overlap; in my case, I used find_free_port(47000, 47030), find_free_port(47030, 47060), and find_free_port(47060, 47100).
Now the robogym training script on the host machine can communicate with docker.
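For anyone unsure what restricting the port search to a range means in practice, here is a rough sketch. It is not robo-gym's find_free_port(); pick_port_in_range is a hypothetical stand-in, and treating the upper bound as exclusive is an assumption made here so that adjacent ranges cannot share a port.

```python
import socket


def pick_port_in_range(lower_bound: int, upper_bound: int) -> int:
    """Return the first port in [lower_bound, upper_bound) that can be bound."""
    for port in range(lower_bound, upper_bound):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                # Binding succeeds only if no other process holds the port.
                sock.bind(("", port))
            except OSError:
                continue
            return port
    raise RuntimeError(f"no free port in [{lower_bound}, {upper_bound})")


# Three non-overlapping sub-ranges of the mapped 47000-47100 block.
RANGES = [(47000, 47030), (47030, 47060), (47060, 47100)]

if __name__ == "__main__":
    for low, high in RANGES:
        print(pick_port_in_range(low, high))
```

The point of the fix is that every port the server picks must fall inside a range published with -p, since ports outside the mapping are not reachable from the host.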
Hello developers, thank you for maintaining robo-gym!
I have been having trouble running robo-gym inside a docker container. My goal is to run the robo-gym server side from within the container, and run the robo-gym training script on my host machine. However, I cannot seem to launch the robot server inside the container, and the application always stalls on the step "Starting Robot Server...".
I initially thought it was a docker port problem, but even when I launched both the server and the training script within the same docker container, I still could not launch the server, as shown (test.py in the right pane is simply the Random Agent MiR100 Simulation Environment example in the README):
Steps to reproduce
- docker pull szhaovas/robogym_test. The image is built from noetic.Dockerfile at my fork of the robo-gym-robot-servers repo. This dockerfile is similar to your original, except it also installs robo-gym for running the training script.
- docker run --rm -it szhaovas/robogym_test.
- start-server-manager && attach-to-server-manager.
- python3 test.py.
My setup
Additional info
- self.tmux_srv.new_session line inside ServerManager.new_session().
- kill-all-robot-servers returns the error message: error connecting to /tmp/tmux-0/ServerManager (No such file or directory).
Thanks in advance!