jr-robotics / robo-gym

An open source toolkit for Distributed Deep Reinforcement Learning on real and simulated robots.
https://sites.google.com/view/robo-gym
MIT License
390 stars 74 forks

Unable to launch robot server within docker container. #81

Open szhaovas opened 3 weeks ago

szhaovas commented 3 weeks ago

Hello developers, thank you for maintaining robo-gym!

I have been having trouble running robo-gym inside a Docker container. My goal is to run the robo-gym server side inside the container and the robo-gym training script on my host machine. However, I cannot launch the robot server inside the container: the application always stalls at the step Starting Robot Server....

I initially thought it was a Docker port problem, but even when I launched both the server and the training script within the same Docker container, I still could not launch the server, as shown (test.py in the right pane is simply the Random Agent MiR100 Simulation Environment example from the README):

Screenshot 2024-06-07 at 15 24 19
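
One way to test the port hypothesis separately is to check, from wherever the training script runs, whether a TCP connection to the server manager can be opened at all. The following is a minimal sketch, not part of robo-gym: `port_is_open` is a hypothetical helper, and port 50100 is an assumption that should be replaced with whatever port the container actually publishes.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed values for illustration only; adjust to your own setup.
    host, port = "127.0.0.1", 50100
    state = "reachable" if port_is_open(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

A successful connection only shows that something is listening on that port. It does not show that the robot server itself started correctly.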

Steps to reproduce

My setup

Additional info

Thanks in advance!

jr-b-reiterer commented 4 days ago

Hi @szhaovas,

Have you tried replacing gui=True with gui=False in the environment initialization?

szhaovas commented 1 day ago

Hi @jr-b-reiterer,

Thank you for the reply. Yes, I replaced gui=True with gui=False. The test.py file in the forked repo I shared above contains the test script I was running.

jr-b-reiterer commented 1 day ago

When I test with your image, the behaviour is different: I get past the lines from your screenshot, but then the reset fails. The warning from gym that I get at that point suggests that your gym version, 0.26, is too new. Because of the API change in gym, the present version of robo-gym is compatible only with gym up to 0.21. (An upgrade of robo-gym is in the works internally.)
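
To confirm which gym version is installed inside the container, a quick check along the following lines may help. This is only a sketch: `major_minor`, `is_supported`, and `installed_gym_version` are hypothetical helper names, and the 0.21 limit is taken from the statement above rather than from any robo-gym API.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Upper bound taken from this thread: gym up to 0.21.
MAX_SUPPORTED: Tuple[int, int] = (0, 21)


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.26.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version: str) -> bool:
    """Return True if the version is at or below the assumed 0.21 limit."""
    return major_minor(version) <= MAX_SUPPORTED


def installed_gym_version() -> Optional[str]:
    """Return the installed gym version, or None if gym is not installed."""
    try:
        return metadata.version("gym")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_gym_version()
    if version is None:
        print("gym is not installed in this environment")
    elif is_supported(version):
        print(f"gym {version} is within the assumed supported range")
    else:
        print(f"gym {version} is newer than 0.21; consider downgrading")
```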

Back to your observation: I am using Docker 20.10.21 on Ubuntu 20.04. I am not sure whether a difference here could cause the problem. You could check whether the behaviour changes when you run your test script not in a tmux pane but in a separate terminal connected to your running container: docker exec -it <container name> bash

szhaovas commented 22 hours ago

Hi @jr-b-reiterer,

I downgraded gym to 0.21, and I am now getting the same error as you. I tried both running docker exec -it <container name> bash and running the test script in a tmux pane. In both cases, I am no longer stuck at "Starting new robot server", but I get an error at reset. Do you know how I might fix the reset error? Thanks!

Screenshot 2024-07-02 at 15 55 56