Open valentasgruzauskas opened 2 years ago
What was the concrete error message for the URL could not be found when you accessed http://localhost:6006/ inside the Docker container?
No, this message is from the browser. When I used curl inside the container, directly against tensorboard, the error was similar.
Thanks for the information! I haven't used the tensorboard under the Tune tab, but I'm also using Docker as runtime environment, and I've come across a few tensorboard issues before. If I were you, I might consider checking the following cases:
Has the tensorboard service already been started? Inside the Docker container, run one of the following commands:
ss -tupln | grep 6006
ps -ef | grep tensorboard
curl -v "http://localhost:6006"
and check the reason for the failure. If it is Connection refused, the service is most likely not running.
If tensorboard is running, which local address is it listening on? Run
ss -tupln | grep 6006
(to see the local address of the listening socket) and
ping localhost
(to see what localhost resolves to).
If the Local Address is neither a wildcard address nor the address that localhost resolves to, tensorboard cannot be reached through localhost.
If tensorboard is not running, I'd suggest starting it manually first with the --bind_all option.
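The first check above (is anything accepting connections on port 6006?) can also be done without ss or curl. The following is a minimal sketch using only Python's standard library; the function name port_is_open is made up for illustration, and 127.0.0.1:6006 is assumed because it is the address and port tested in this thread.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        # (for example ECONNREFUSED when nothing is listening).
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Assumption: tensorboard is expected on its default port 6006.
    print("127.0.0.1:6006 reachable:", port_is_open("127.0.0.1", 6006))
```

Running it once inside the container and once on the host shows on which side the connection is refused.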
Good day,
I started the container with docker-compose run --rm train bash. The -d mode did not work for tensorboard, so I entered the same container again using docker ps and docker exec -it docker-id bash.
Then I ran the following commands, as you suggested:
ss -tupln | grep 6006:
returned nothing (port is not used)
ps -ef | grep tensorboard:
root 369 7 0 12:46 pts/0 00:00:00 grep --color=auto tensorboard
curl -v "http://localhost:6006"
Rebuilt URL to: http://localhost:6006/
Trying 127.0.0.1...
connect to 127.0.0.1 port 6006 failed: Connection refused
Trying ::1...
Immediate connect fail for ::1: Cannot assign requested address
Trying ::1...
Immediate connect fail for ::1: Cannot assign requested address
Failed to connect to localhost port 6006: Connection refused
Closing connection 0
curl: (7) Failed to connect to localhost port 6006: Connection refused
tensorboard --bind_all --port 6006 --logdir ./ray_results/:
TensorFlow installation not found - running with reduced feature set.
NOTE: Using experimental fast data loading logic. To disable, pass "--load_fast=false" and report issues on GitHub. More details: https://github.com/tensorflow/tensorboard/issues/4784
TensorBoard 2.9.1 at http://fcc2c67e5325:6006/ (Press CTRL+C to quit)
curl -v "http://localhost:6006":
returns the HTML page
tensorboard --bind_all --port 60006 --logdir ./ray_results/:
TensorFlow installation not found - running with reduced feature set.
NOTE: Using experimental fast data loading logic. To disable, pass "--load_fast=false" and report issues on GitHub. More details: tensorflow/tensorboard#4784
TensorBoard 2.9.1 at http://fcc2c67e5325:60006/ (Press CTRL+C to quit)
curl -v "http://localhost:6006"
Rebuilt URL to: http://localhost:6006/
Trying 127.0.0.1...
connect to 127.0.0.1 port 6006 failed: Connection refused
Trying ::1...
Immediate connect fail for ::1: Cannot assign requested address
Trying ::1...
Immediate connect fail for ::1: Cannot assign requested address
Failed to connect to localhost port 6006: Connection refused
Closing connection 0
curl: (7) Failed to connect to localhost port 6006: Connection refused
Thanks for the detailed reply.
It seems you started tensorboard on port 60006 but then tested port 6006, so the two did not match. Could you please try tensorboard --bind_all --port 6006 --logdir ./ray_results/ instead, and then use curl or the browser to see whether tensorboard is reachable?
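One way to avoid this kind of mismatch is to derive both the launch command and the test URL from a single port value. The sketch below only builds the strings and does not start anything; the helper names tensorboard_command and probe_url are hypothetical, while the flags --bind_all, --port and --logdir are the ones already used in this thread.

```python
import shlex
from typing import List

PORT = 6006  # single source of truth for the port used in this thread


def tensorboard_command(logdir: str, port: int = PORT) -> List[str]:
    """Build the tensorboard launch command as an argument list."""
    return ["tensorboard", "--bind_all", "--port", str(port), "--logdir", logdir]


def probe_url(port: int = PORT) -> str:
    """Build the URL to test with curl or a browser for the same port."""
    return f"http://localhost:{port}/"


if __name__ == "__main__":
    # Print both commands so they can be copied into the container shell.
    print(shlex.join(tensorboard_command("./ray_results/")))
    print("curl -v", shlex.quote(probe_url()))
```

The argument list returned by tensorboard_command could also be passed to subprocess.Popen, assuming tensorboard is on the PATH inside the container.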
Thank you, I fixed the port. Now curl returns the HTML page inside the container. However, I still cannot reach the page from my host computer, and the Tune dashboard cannot reach it either; the error message has changed to "localhost refused to connect.".
Could you please try curl -v "http://localhost:6006/" on your host machine and share the result?
Result on host:
Result inside docker:
Thank you. Were the Docker container and its tensorboard process still running when you tested it on your host machine?
I did not close the docker container during the testing, so I think it was running. I am still using tune at the moment and did not stop the container. I tried to use the ray tune dashboard with tensorboard, maybe previous processes are related to those actions?
I used the following command to check active tensorboard processes:
ps -ef|grep tensorboard
Good day. Has this problem been solved yet? If not, could you run sudo ps -ef|grep tensorboard on the host machine (outside the Docker container)?
Good day,
Apologies for the late reply, I missed your comment. I launched tensorboard from Docker and ran your command on the host machine.
Output:
When I run the curl command from the host machine, I still receive an error.
@valentasgruzauskas is this issue fixed?
In Ray 2.0, we made the previous experimental dashboard the new default dashboard. Currently, we don't support the tune tab or Tensorboard in the new dashboard. We may also deprecate the existing dashboard in the near future to reduce maintenance burden.
Our assumption is that there are already good OSS tools for experiment tracking and visualization, such as TensorBoard and MLflow, and that it should be easy for you to host one yourself.
Please feel free to disagree and let us know what you think. We'll add them to the backlog.
What happened + What you expected to happen
I am running a gradient boosting algorithm with tune, TuneSearchCV.
I have initialized the Ray dashboard with:
I can access the Tune tab, however, when I provide the path to the results log, e.g. ./ray_results/_Trainable_2022-07-06_12-24-58/ and select tensorboard, I receive the following error:
In docker-compose.yml I have indicated the port mapping:
For the ray dashboard, I have dashboard_host="0.0.0.0", thus I can access the dashboard outside docker. I tried to access tensorboard inside docker, however, the URL could not be found.
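The reason dashboard_host="0.0.0.0" matters is that a published Docker port forwards traffic to the container's network interface, so a server bound only to the loopback address inside the container is not reachable from the host. As a rough illustration, the sketch below classifies a bind address with Python's standard ipaddress module; the function name describe_bind_address and the returned labels are made up for this example.

```python
import ipaddress


def describe_bind_address(host: str) -> str:
    """Classify an IP address a server is bound to."""
    addr = ipaddress.ip_address(host)
    if addr.is_unspecified:
        # 0.0.0.0 or :: means the server listens on every interface.
        return "all interfaces"
    if addr.is_loopback:
        # 127.0.0.1 or ::1 is only reachable from inside the container.
        return "loopback only"
    return "single interface"


if __name__ == "__main__":
    for host in ("0.0.0.0", "127.0.0.1"):
        print(host, "->", describe_bind_address(host))
```

The Local Address column printed by ss -tupln can be checked the same way for tensorboard.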
Versions / Dependencies
Operating system: Arch Linux
Dependencies: ray[default]==1.12.1 gpustat tensorboard tabulate
Reproduction script
Working on it
Issue Severity
No response