aws-deepracer-community / deepracer-for-cloud

Creates an AWS DeepRacing training environment which can be deployed in the cloud, or locally on Ubuntu Linux, Windows or Mac.
MIT No Attribution

Could not connect to the endpoint URL #108

Closed HasarinduPerera closed 1 year ago

HasarinduPerera commented 1 year ago

I'm getting the following error:

fatal error: Could not connect to the endpoint URL: "http://localhost:9000/bucket?list-type=2&prefix=custom_files%2F&encoding-type=url" when running dr-upload-custom-files

and

botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://localhost:9000/bucket/custom_files/reward_function.py"
Creating Robomaker configuration in s3://bucket/rl-deepracer-sagemaker/training_params.yaml
Updating service deepracer-0_rl_coach (id: kjx3z9p2qxdzzddudkr6ul05q)
Updating service deepracer-0_robomaker (id: w3vkg5xmo4wgcd5idljlo1y9y)
Waiting up to 15 seconds for Sagemaker to start up...
Sagemaker is not running.

when running dr-start-training

I already tried changing the MinIO image in docker-compose-local.yml to minio/minio:RELEASE.2022-05-08T23-50-31Z.

TIA.

larsll commented 1 year ago

There can be several issues causing this. What does docker ps say?
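Alongside docker ps, it can help to check directly whether anything is listening on the endpoint named in the error message. The following is a minimal sketch, not part of deepracer-for-cloud: the helper names `endpoint_host_port` and `is_reachable` are hypothetical, and the URL is simply the one copied from the error above. It only tests whether a TCP connection can be opened; it does not verify that the service behind the port is a working S3-compatible store.

```python
import socket
from urllib.parse import urlsplit


def endpoint_host_port(url):
    """Extract (host, port) from an endpoint URL, falling back to the scheme's default port."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # URL copied from the error message in this issue (assumed local endpoint).
    url = "http://localhost:9000/bucket/custom_files/reward_function.py"
    host, port = endpoint_host_port(url)
    status = "reachable" if is_reachable(host, port) else "not reachable"
    print(f"{host}:{port} is {status}")
```

If the port is not reachable and docker ps lists no storage container, the error is most likely just a symptom of that container not running, rather than a problem with the upload or training commands themselves.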

AbhilashBharadwaj commented 1 year ago

I have the same issue connecting to Azure. docker ps shows that the container is running; however, errors are thrown, as shown in the image below.

[image]

HasarinduPerera commented 1 year ago

> There can be several issues causing this. What does docker ps say?

It doesn't show any running containers. 😕

larsll commented 1 year ago

I suggest you put your questions forward in the Slack group: https://aws-ml-community.slack.com/ssb/redirect

TParkersJJ commented 1 year ago

Hi larsll, I am running my instance on AWS. I have the same issue as [AbhilashBharadwaj]. May I know how to resolve this issue?

ubuntu@ip-10-0-13-77:~$ docker logs -f bc16bd801a3a
02/04/2023 13:24:09 passing arg to libvncserver: -rfbport
02/04/2023 13:24:09 passing arg to libvncserver: 5900
02/04/2023 13:24:09 x11vnc version: 0.9.13 lastmod: 2011-08-10 pid: 62
02/04/2023 13:24:09
02/04/2023 13:24:09 wait_for_client: WAIT:0
02/04/2023 13:24:09
02/04/2023 13:24:09 initialize_screen: fb_depth/fb_bpp/fb_Bpl 24/32/2560
02/04/2023 13:24:09
02/04/2023 13:24:09 Listening for VNC connections on TCP port 5900
02/04/2023 13:24:09 Listening for VNC connections on TCP6 port 5900
02/04/2023 13:24:09 listen6: bind: Address already in use
02/04/2023 13:24:09 Not listening on IPv6 interface.
02/04/2023 13:24:09

The VNC desktop is: bc16bd801a3a:0
PORT=5900
JWM: warning: /etc/jwm/system.jwmrc[6]: invalid include: /etc/jwm/debian-menu
IP: 10.0.0.4 172.18.0.3 10.0.1.6 (bc16bd801a3a)
01:24:11 INFO:[DeepRacerNodeMonitor]: NodeMonitor started running
01:24:13 INFO:[DeepRacerNodeMonitor]: Running nodes are {'/download_params_and_roslaunch_agent_node'}
s3 failed, retry after 1.2500052825265975 seconds. Re-try count: 1/5: Could not connect to the endpoint URL: "https://deepracer-letsssssssgo-model-and-result.s3.amazonaws.com/rl-deepracer-sagemaker/training_params.yaml"
s3 failed, retry after 4.352610828342877 seconds. Re-try count: 2/5: Could not connect to the endpoint URL: "https://deepracer-letsssssssgo-model-and-result.s3.amazonaws.com/rl-deepracer-sagemaker/training_params.yaml"

larsll commented 1 year ago

@TParkersJJ - I suggest connecting via Slack at https://aws-ml-community.slack.com/ssb/redirect to get support! The channel #dr-local-training is filled with community members who are happy to help get your instance running!