blaylockbk / Herbie

Download numerical weather prediction datasets (HRRR, RAP, GFS, IFS, etc.) from NOMADS, NODD partners (Amazon, Google, Microsoft), ECMWF open data, and the University of Utah Pando Archive System.
https://herbie.readthedocs.io/
MIT License

Connection reset by peer #350

Open jerrylin96 opened 1 month ago

jerrylin96 commented 1 month ago

Hi Brian,

Thanks for making this super helpful package! Sometimes I'll get this connection reset error that stops the download arbitrarily. Do you know what might be causing it and how to avoid running into this issue?

Best regards,

Jerry

    return request("head", url, **kwargs)
  File "/global/homes/j/jerrylin/.conda/envs/myfolder/lib/python3.9/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/global/homes/j/jerrylin/.conda/envs/myfolder/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/global/homes/j/jerrylin/.conda/envs/myfolder/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/global/homes/j/jerrylin/.conda/envs/myfolder/lib/python3.9/site-packages/requests/adapters.py", line 682, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

blaylockbk commented 1 month ago

Hi @jerrylin96, Glad you like Herbie.

As far as I can remember, I haven't run into this error before. Can you give more details about how you are using Herbie (what commands you are running, and on what platform)?

This error looks like it's happening in the requests library. ChatGPT suggests it could be related to an unstable network connection, a firewall policy, or high load on the remote host. It's hard to diagnose.
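
If the cause really is transient (a flaky network or a busy remote host), one common workaround is to retry the call with exponential backoff. The following is only a minimal sketch, not something Herbie provides: the `retry` helper and its parameters are hypothetical names made up for illustration. It relies on the fact that `requests.exceptions.ConnectionError` and `SSLError` both derive from `OSError`.

```python
import time


def retry(func, attempts=4, base_delay=1.0, exceptions=(OSError,), sleep=time.sleep):
    """Call func(), retrying with exponential backoff on transient errors.

    Hypothetical helper (not part of Herbie or requests). The default
    `exceptions` tuple catches OSError, which is a base class of
    ConnectionResetError and of requests' ConnectionError and SSLError.
    """
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2**attempt)


# Possible usage with Herbie (an assumption; adjust to your own call):
# from herbie import Herbie
# retry(lambda: Herbie("2023-07-28 22:00", model="hrrr", product="sfc", fxx=24).download())
```

Retrying only helps if the failure is intermittent; if every attempt fails, the last error is re-raised so it is not silently hidden.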

jerrylin96 commented 1 month ago

Hi Brian,

I'm running it on Linux on Perlmutter. I think it might be caused by downloading many files in parallel, but sometimes I also get an error when I try to download one file at a time interactively:

---------------------------------------------------------------------------
SSLZeroReturnError                        Traceback (most recent call last)
File ~/.conda/envs/vayuh/lib/python3.9/site-packages/urllib3/connectionpool.py:715, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    714 # Make the request on the httplib connection object.
--> 715 httplib_response = self._make_request(
    716     conn,
    717     method,
    718     url,
    719     timeout=timeout_obj,
    720     body=body,
    721     headers=headers,
    722     chunked=chunked,
    723 )
    725 # If we're going to release the connection in ``finally:``, then
    726 # the response doesn't need to know about the connection. Otherwise
    727 # it will also try to release it and we'll have a double-release
    728 # mess.

File ~/.conda/envs/vayuh/lib/python3.9/site-packages/urllib3/connectionpool.py:404, in HTTPConnectionPool._make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    403 try:
--> 404     self._validate_conn(conn)
    405 except (SocketTimeout, BaseSSLError) as e:
    406     # Py2 raises this as a BaseSSLError, Py3 raises it as socket timeout.

File ~/.conda/envs/vayuh/lib/python3.9/site-packages/urllib3/connectionpool.py:1060, in HTTPSConnectionPool._validate_conn(self, conn)
...
--> 698         raise SSLError(e, request=request)
    700     raise ConnectionError(e, request=request)
    702 except ClosedPoolError as e:

SSLError: HTTPSConnectionPool(host='pando-rgw01.chpc.utah.edu', port=443): Max retries exceeded with url: /hrrr/sfc/20230728/hrrr.t22z.wrfsfcf24.grib2.idx (Caused by SSLError(SSLZeroReturnError(6, 'TLS/SSL connection has been closed (EOF) (_ssl.c:1133)')))

jerrylin96 commented 1 month ago

This particular error happened when trying to download data from July 28, 2023 with fxx = 24.
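
If many parallel downloads are part of the trigger, as suspected earlier in the thread, one thing to try is capping the number of simultaneous requests and collecting failures instead of stopping at the first one. This is a rough sketch under that assumption: `run_throttled` and the `worker` callable are hypothetical names, not Herbie functions, and nothing here confirms that throttling resolves the error.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_throttled(jobs, worker, max_workers=2):
    """Run worker(job) for each job with at most max_workers in flight.

    Hypothetical helper. Jobs must be hashable (e.g. forecast hours).
    Returns (results, failures), two dicts keyed by job.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                results[job] = future.result()
            except OSError as err:
                # Connection resets and SSL errors from requests are OSErrors;
                # record them so the remaining jobs still run.
                failures[job] = err
    return results, failures
```

The jobs left in `failures` can then be retried one at a time, which also helps show whether the errors depend on concurrency or on specific files.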