arplaboratory / learning-to-fly

Training transferable end-to-end quadrotor control policies on a laptop in 18 seconds.
MIT License

Tensorflow not found #5

Open jajawunderbar opened 5 months ago

jajawunderbar commented 5 months ago

Why do I get this when using the docker command? It is installed via pip, and I tried Docker as well. Thanks. Mathias

jonas-eschmann commented 5 months ago

Hi, thanks for reaching out! Which commands are you using, and what error messages do you get?

jajawunderbar commented 5 months ago

docker run -it --rm -p 8000:8000 arpllab/learning_to_fly

Waiting for Tensorboard
TensorFlow installation not found - running with reduced feature set.

NOTE: Using experimental fast data loading logic. To disable, pass "--load_fast=false" and report issues on GitHub. More details: https://github.com/tensorflow/tensorboard/issues/4784

TensorBoard 2.12.2 at http://b1d7ad5cd59a:6006/ (Press CTRL+C to quit)
Running command:
Note: This executable should be executed in the context (working directory) of the main repo e.g. ./build/src/rl_environments_multirotor_ui 0.0.0.0 8000
Web interface coming up at: http://0.0.0.0:8000

But no web interface comes up after that.

jonas-eschmann commented 5 months ago

This

TensorFlow installation not found - running with reduced feature set.

is not a problem and is expected. Tensorboard (which is only used to visualize the training logs) tries to load TensorFlow, but TensorFlow is not required. It is just part of the Docker image but not used in this example.

When you go to http://0.0.0.0:8000/ with your web browser after it shows:

Web interface coming up at: http://0.0.0.0:8000/

is the simulator UI showing up?

If you want to try Tensorboard you can use

docker run -it --rm -p 6006:6006 arpllab/learning_to_fly training_headless

as described in the README.

You should be able to view Tensorboard at http://0.0.0.0:6006

jajawunderbar commented 5 months ago

Yes, the site is not reachable.

jonas-eschmann commented 5 months ago

I just updated my last message with more details. But it is quite weird that it is not reachable for you 🤔

Can you try this to see if it is a problem with the docker setup:

docker run -it --rm -p 8000:8000 python python -m http.server

jajawunderbar commented 5 months ago

With this command it is running:

docker run -it --rm -p 6006:6006 arpllab/learning_to_fly training_headless

Step: 0
Saving actor checkpoint "checkpoints/multirotor_td3/2024_02_10_21_52_15_d+o+a+r+h+c+f+w+e+_000/actor_000000000000000.h5"
Saving checkpoint at: "checkpoints/multirotor_td3/2024_02_10_21_52_15_d+o+a+r+h+c+f+w+e+_000/actor_000000000000000.h"
Step: 0 (mean return: -163.123, mean episode length: 41.089)
Step: 10000
Step: 10000 (mean return: -163.071, mean episode length: 41.447)
Step: 20000
Step: 20000 (mean return: -163.396, mean episode length: 41.333)
Step: 30000
Step: 30000 (mean return: -162.777, mean episode length: 41.496)
Step: 40000
Step: 40000 (mean return: -216.718, mean episode length: 39.383)
Step: 50000
Step: 50000 (mean return: -200.716, mean episode length: 42.801)
...

docker run -it --rm -p 8000:8000 python python -m http.server

Unable to find image 'python:latest' locally
latest: Pulling from library/python
6a299ae9cfd9: Pull complete
e08e8703b2fb: Pull complete
68e92d11b04e: Pull complete
5b9fe7fef9be: Pull complete
09864a904dd0: Pull complete
a21b4eeffed1: Pull complete
14fa1d442750: Pull complete
4c1733c93f94: Pull complete
Digest: sha256:5eda1a3e78a90e7542c221cf525233f4b958a5778e61bcc350cc1e0d2bcf7ecf
Status: Downloaded newer image for python:latest
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...

But no web page shows up.

jonas-eschmann commented 5 months ago

If you can't see a webpage even with docker run -it --rm -p 8000:8000 python python -m http.server (as a minimal test case), I think there is some issue with your Docker setup (maybe a firewall or something like that?).
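To separate a browser problem from a Docker or firewall problem, the published port can also be probed directly from the host while the container is running. The following is a minimal sketch, not part of the repository; the helper name port_is_open is hypothetical. It tries 127.0.0.1 and localhost instead of 0.0.0.0, on the assumption (not confirmed in this thread) that the browser or operating system may not accept 0.0.0.0 as a destination address, since it is primarily the "all interfaces" address a server binds to.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Port 8000 matches the -p 8000:8000 mapping used in this thread.
    for host in ("127.0.0.1", "localhost"):
        state = "reachable" if port_is_open(host, 8000) else "NOT reachable"
        print(f"{host}:8000 is {state}")
```

If the port is reported as reachable here but the page still does not load at http://0.0.0.0:8000, opening http://localhost:8000 in the browser is a cheap next check. If it is not reachable at all, the port mapping or a firewall is the more likely suspect.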

jajawunderbar commented 5 months ago

Thank you very much for your help! But no luck so far. I checked the firewall ... I don't know what I'm doing wrong ... The headless run prints this (see below) and ended with no errors and a completion message, but there is no checkpoint directory after that?!

...
Step: 80000 (mean return: -179.525, mean episode length: 53.071)
Step: 90000
Step: 90000 (mean return: -197.249, mean episode length: 58.506)
Step: 100000
Saving actor checkpoint "checkpoints/multirotor_td3/2024_02_11_19_15_38_d+o+a+r+h+c+f+w+e+_000/actor_000000000100000.h5"
Saving checkpoint at: "checkpoints/multirotor_td3/2024_02_11_19_15_38_d+o+a+r+h+c+f+w+e+_000/actor_000000000100000.h"
Step: 100000 (mean return: -208.098, mean episode length: 59.25)
Step: 110000
Step: 110000 (mean return: -182.684, mean episode length: 58.585)
...
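The progress lines in output like this have a regular shape, so they can be turned into numbers to see whether the mean return and episode length are moving. This is a small sketch under the assumption that every progress line looks exactly like the ones quoted in this thread; the function name parse_progress is hypothetical and not part of the repository.

```python
import re

# Assumed line format, taken from the log excerpts quoted in this thread.
PROGRESS_RE = re.compile(
    r"Step: (\d+) \(mean return: (-?\d+(?:\.\d+)?), "
    r"mean episode length: (\d+(?:\.\d+)?)\)"
)


def parse_progress(log_text):
    """Return a list of (step, mean_return, mean_episode_length) tuples."""
    return [
        (int(step), float(ret), float(length))
        for step, ret, length in PROGRESS_RE.findall(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "Step: 0 (mean return: -163.123, mean episode length: 41.089)\n"
        "Step: 10000\n"
        "Step: 10000 (mean return: -163.071, mean episode length: 41.447)\n"
    )
    for step, ret, length in parse_progress(sample):
        print(step, ret, length)
```

Bare "Step: N" lines without the parenthesized statistics are skipped, so the result contains one entry per evaluation.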

jonas-eschmann commented 5 months ago

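For that, you should mount a local checkpoint directory into the container, e.g. with -v $(pwd)/checkpoints:/learning_to_fly/checkpoints. Otherwise the checkpoints are only written inside the container, and --rm removes the container together with its filesystem when it exits.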

mkdir checkpoints
docker run -it --rm -p 6006:6006 -v $(pwd)/checkpoints:/learning_to_fly/checkpoints arpllab/learning_to_fly training_headless

This creates checkpoints in the local checkpoint directory for me, and I can also observe the Tensorboard logs at http://0.0.0.0:6006.
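To confirm that the mount worked, the local directory can be scanned for actor checkpoints after (or during) a run. The sketch below assumes the file naming seen in the log output in this thread (actor_<step>.h5 somewhere below checkpoints/); the helper name list_actor_checkpoints is hypothetical and not part of the repository.

```python
import re
from pathlib import Path

# Assumed file-name pattern, inferred from the log lines in this thread.
ACTOR_RE = re.compile(r"actor_(\d+)\.h5$")


def list_actor_checkpoints(root):
    """Return (step, path) pairs for every actor_*.h5 file below root, sorted by step."""
    root = Path(root)
    if not root.is_dir():
        return []
    found = []
    for path in root.rglob("actor_*.h5"):
        match = ACTOR_RE.search(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found, key=lambda item: (item[0], str(item[1])))


if __name__ == "__main__":
    # Run from the directory in which "mkdir checkpoints" was executed.
    for step, path in list_actor_checkpoints("checkpoints"):
        print(f"step {step:>9}: {path}")
```

An empty result after a completed run would suggest that the volume was not mounted at the path the training writes to.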

jajawunderbar commented 5 months ago

Thanks! Now I have a firmware.

It's not possible to use it in a simulation like Webots, right?