Open jajawunderbar opened 5 months ago
Hi thanks for reaching out! Which commands are you using and what are the error messages that you get?
docker run -it --rm -p 8000:8000 arpllab/learning_to_fly
Waiting for Tensorboard
TensorFlow installation not found - running with reduced feature set.
NOTE: Using experimental fast data loading logic. To disable, pass "--load_fast=false" and report issues on GitHub. More details: https://github.com/tensorflow/tensorboard/issues/4784
TensorBoard 2.12.2 at http://b1d7ad5cd59a:6006/ (Press CTRL+C to quit)
Running command:
Note: This executable should be executed in the context (working directory) of the main repo e.g. ./build/src/rl_environments_multirotor_ui 0.0.0.0 8000
Web interface coming up at: http://0.0.0.0:8000
but no web interface after that.
This
TensorFlow installation not found - running with reduced feature set.
is no problem and expected. Tensorboard (which is only used to visualize the training logs) tries to load TensorFlow but it is not required. It is just part of the Docker image but not used in this example.
When you go to http://0.0.0.0:8000/ with your web-browser after it shows:
Web interface coming up at: http://0.0.0.0:8000/
is the simulator UI showing up?
If you want to try Tensorboard you can use
docker run -it --rm -p 6006:6006 arpllab/learning_to_fly training_headless
as described in the readme.
You should be able to view Tensorboard at http://0.0.0.0:6006
yes, site not reachable.
I just updated my last message with more details. But it is quite weird that it is not reachable for you 🤔
Can you try this to see if it is a problem with the docker setup:
docker run -it --rm -p 8000:8000 python python -m http.server
docker run -it --rm -p 6006:6006 arpllab/learning_to_fly training_headless
With this, it's running:
Step: 0
Saving actor checkpoint "checkpoints/multirotor_td3/2024_02_10_21_52_15_d+o+a+r+h+c+f+w+e+_000/actor_000000000000000.h5"
Saving checkpoint at: "checkpoints/multirotor_td3/2024_02_10_21_52_15_d+o+a+r+h+c+f+w+e+_000/actor_000000000000000.h"
Step: 0 (mean return: -163.123, mean episode length: 41.089)
Step: 10000
Step: 10000 (mean return: -163.071, mean episode length: 41.447)
Step: 20000
Step: 20000 (mean return: -163.396, mean episode length: 41.333)
Step: 30000
Step: 30000 (mean return: -162.777, mean episode length: 41.496)
Step: 40000
Step: 40000 (mean return: -216.718, mean episode length: 39.383)
Step: 50000
Step: 50000 (mean return: -200.716, mean episode length: 42.801)
...
docker run -it --rm -p 8000:8000 python python -m http.server
Unable to find image 'python:latest' locally
latest: Pulling from library/python
6a299ae9cfd9: Pull complete
e08e8703b2fb: Pull complete
68e92d11b04e: Pull complete
5b9fe7fef9be: Pull complete
09864a904dd0: Pull complete
a21b4eeffed1: Pull complete
14fa1d442750: Pull complete
4c1733c93f94: Pull complete
Digest: sha256:5eda1a3e78a90e7542c221cf525233f4b958a5778e61bcc350cc1e0d2bcf7ecf
Status: Downloaded newer image for python:latest
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
but no web page
If you can't see a webpage even with docker run -it --rm -p 8000:8000 python python -m http.server
(as a minimal test case) I think there is some issue with your Docker setup (maybe a firewall or something like that?).
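To narrow this down without a browser, a small TCP check can show whether the published port accepts connections from the host at all. The sketch below is only an illustration: the helper name `port_is_reachable` is made up here, and it assumes the container was started with `-p 8000:8000` as in the commands above. It also tries `127.0.0.1` and `localhost` rather than `0.0.0.0`, because `0.0.0.0` is the address the server binds to inside the container and some systems and browsers will not connect to it as a destination; whether that is the cause in this case is not confirmed.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; not part of learning_to_fly.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: the container publishes port 8000 on the local machine.
    for host in ("127.0.0.1", "localhost"):
        status = "reachable" if port_is_reachable(host, 8000) else "not reachable"
        print(f"{host}:8000 is {status}")
```

If the port is reported as reachable here but the page still does not load at http://0.0.0.0:8000/, trying http://localhost:8000/ in the browser may be worth a shot. If it is not reachable even with the minimal `python -m http.server` container, the problem is more likely in the Docker networking or firewall setup.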
Thank you very much for your help! But no luck so far. I checked the firewall ... I don't know what I'm doing wrong ... The headless run prints this (see below) and ended with no errors and a completion message, but there is no checkpoint directory after that?!
...
Step: 80000 (mean return: -179.525, mean episode length: 53.071)
Step: 90000
Step: 90000 (mean return: -197.249, mean episode length: 58.506)
Step: 100000
Saving actor checkpoint "checkpoints/multirotor_td3/2024_02_11_19_15_38_d+o+a+r+h+c+f+w+e+_000/actor_000000000100000.h5"
Saving checkpoint at: "checkpoints/multirotor_td3/2024_02_11_19_15_38_d+o+a+r+h+c+f+w+e+_000/actor_000000000100000.h"
Step: 100000 (mean return: -208.098, mean episode length: 59.25)
Step: 110000
Step: 110000 (mean return: -182.684, mean episode length: 58.585)
...
For that you should mount a local checkpoint directory into the container using e.g. -v $(pwd)/checkpoints:/learning_to_fly/checkpoints.
mkdir checkpoints
docker run -it --rm -p 6006:6006 -v $(pwd)/checkpoints:/learning_to_fly/checkpoints arpllab/learning_to_fly training_headless
This creates checkpoints in the local checkpoint dir for me and I can also observe the Tensorboard logs at http://0.0.0.0:6006
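Once the directory is mounted, the newest actor checkpoint can be picked out on the host with a few lines of Python. This is a rough sketch, not part of the project: the function name `latest_actor_checkpoint` is made up, and it assumes the layout seen in the training log above, i.e. one timestamped run directory per training run containing files named `actor_<step>.h5`.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: checkpoint files are named like actor_000000000100000.h5,
# as printed in the training log.
CHECKPOINT_RE = re.compile(r"actor_(\d+)\.h5$")


def latest_actor_checkpoint(root: Path) -> Optional[Path]:
    """Return the actor checkpoint from the newest run with the highest step.

    Run directories start with a timestamp (e.g. 2024_02_11_19_15_38_...),
    so sorting by directory name orders the runs chronologically.
    """
    candidates = []
    for path in root.rglob("actor_*.h5"):
        match = CHECKPOINT_RE.search(path.name)
        if match:
            # Sort key: (run directory name, training step).
            candidates.append((path.parent.name, int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: (item[0], item[1]))[2]


if __name__ == "__main__":
    # Assumption: run from the directory that contains ./checkpoints.
    print(latest_actor_checkpoint(Path("checkpoints")))
```

If this prints `None` after a training run, the volume mount is probably not pointing at `/learning_to_fly/checkpoints` inside the container.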
Thanks! Now I have a firmware.
It's not possible to use it in a simulation like Webots, right?
Why do I get this when using the docker command? It's installed via pip, and I tried Docker as well. Thanks. Mathias