WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GNU General Public License v3.0

Error while training model on server - wandb error #275

Closed akashAD98 closed 1 year ago

akashAD98 commented 1 year ago

I am not able to train my custom object detection model on the server because wandb cannot connect. Is there any way to solve this issue?

https://github.com/wandb/local/issues/83

YOLOR 🚀 v0.1-43-g8b72ac7 torch 1.9.0+cu111 CUDA:0 (A100-SXM4-40GB, 40537.1875MB)

Retry attempt failed:
Traceback (most recent call last):
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connection.py", line 358, in connect
    self.sock = conn = self._new_conn()
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f93b64ab520>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.wandb.ai', port=443): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f93b64ab520>: Failed to establish a new connection: [Errno -2] Name or service not known'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/sdk/lib/retry.py", line 108, in __call__
    result = self._call_fn(*args, **kwargs)
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/sdk/internal/internal_api.py", line 158, in execute
    return self.client.execute(*args, **kwargs)
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 52, in execute
    result = self._get_result(document, *args, **kwargs)
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 60, in _get_result
    return self.transport.execute(document, *args, **kwargs)
  File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/transport/requests.py", line 38, in execute
    request = requests.post(self.url, **post_args)
  File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/api.py", line 115, in post
    return request("post", url, data=data, json=json, **kwargs)
  File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/adapters.py", line 565, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.wandb.ai', port=443): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f93b64ab520>: Failed to establish a new connection: [Errno -2] Name or service not known'))
wandb: Network error (ConnectionError), entering retry loop.
wandb: W&B API key is configured. Use wandb login --relogin to force relogin
wandb: Network error (ConnectionError), entering retry loop.
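
The root exception in the chain is `socket.gaierror: [Errno -2] Name or service not known`, raised from `socket.getaddrinfo`, which suggests the server cannot resolve `api.wandb.ai` at all (for example, a compute node without direct internet access). A minimal sketch for checking this from the same environment is below; the helper name `can_resolve` and its `resolver` parameter are made up for illustration and are not part of wandb or YOLOv7.

```python
import socket


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address.

    `resolver` is a hypothetical hook (defaulting to socket.getaddrinfo)
    so the check can be exercised without real network access.
    """
    try:
        results = resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Same failure class as in the traceback above.
        return False
    return len(results) > 0


# Example call on the training server:
#   print(can_resolve("api.wandb.ai"))
```

If this returns `False` on the server but `True` on a machine with normal internet access, the problem is likely the network setup (DNS or proxy) rather than the training code.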

akashAD98 commented 1 year ago

Solved: use the proxy address below for the wandb connection. Add these lines inside your .sh job script:

export http_proxy=http://dgx-proxy-mn.mgmt.siddhi.param:9090/

export ftp_proxy=http://dgx-proxy-mn.mgmt.siddhi.param:9090/

export https_proxy=http://dgx-proxy-mn.mgmt.siddhi.param:9090/
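
If editing the job script is not convenient, the same variables can probably be set from Python before wandb makes its first request, since the `requests` library (which appears in the traceback) reads `http_proxy` and `https_proxy` from the environment by default. This is a sketch under that assumption; `apply_proxy` is a hypothetical helper, and the proxy URL is specific to the cluster mentioned above.

```python
import os
from typing import MutableMapping

# Variable names taken from the export lines above.
PROXY_VARS = ("http_proxy", "https_proxy", "ftp_proxy")


def apply_proxy(proxy_url: str, env: MutableMapping[str, str] = os.environ) -> None:
    """Point every proxy variable in PROXY_VARS at `proxy_url`.

    `env` defaults to the real process environment; pass a plain dict
    to try the helper without touching os.environ.
    """
    for name in PROXY_VARS:
        env[name] = proxy_url


# Example (cluster-specific URL from this thread), run before training starts:
#   apply_proxy("http://dgx-proxy-mn.mgmt.siddhi.param:9090/")
```

Variables set this way only affect the current Python process and its children, so the call has to happen at the top of the training entry point, before wandb is initialized.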