tkrajina / srtm.py

Geo elevation data parser for "The Shuttle Radar Topography Mission" data
Apache License 2.0

SRTM data endpoint has moved #48

Closed perllaghu closed 3 years ago

perllaghu commented 3 years ago

Related to #45 - The data endpoint has moved at usgs.gov

http://dds.cr.usgs.gov/srtm/version2_1/SRTM3/Eurasia/N49W007.hgt.zip is returning a "301 - permanent move" code, and has moved to https://dds.cr.usgs.gov/srtm/version2_1/SRTM3/Eurasia/N49W007.hgt.zip

This is related to the whole web slowly moving to https connections.

The problem is the urllib3 library doesn't handle redirects.
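One way to sidestep the redirect entirely is to request the https address directly. The snippet below is only a minimal sketch of that idea, using the standard library; `force_https` is a hypothetical helper name and is not part of srtm.py.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url: str) -> str:
    """Return `url` with an http scheme swapped for https.

    URLs that already use another scheme are returned unchanged.
    Hypothetical helper for illustration, not part of srtm.py.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


old_url = "http://dds.cr.usgs.gov/srtm/version2_1/SRTM3/Eurasia/N49W007.hgt.zip"
print(force_https(old_url))
```

Rewriting the scheme up front avoids depending on how the HTTP client treats the redirect, at the cost of assuming the host serves the same path over https.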

tkrajina commented 3 years ago

Pushed a fix in master now.

perllaghu commented 3 years ago

Possibly still failing - works from the console, fails in a Jupyter notebook

(cross-posted to https://github.com/jupyter/notebook/issues/5632)

Tested in a clean virtual environment, a local notebook, and a jupyter/minimal-notebook Docker image - only the CLI code works

Here's my test code:

pip install git+git://github.com/tkrajina/srtm.py.git#egg=srtm.py
pip install ridge_map
python3 -c 'from ridge_map import FontManager, RidgeMap; font = FontManager("https://github.com/google/fonts/blob/master/ofl/uncialantiqua/UncialAntiqua-Regular.ttf?raw=True"); rm = RidgeMap((-122.087116,36.945365,-121.999226,37.023250), font=font.prop); values = rm.get_elevation_data(num_lines=220, elevation_pts=550)'

Runs fine from the command line, fails in a jupyter notebook

Error report is:

---------------------------------------------------------------------------
WantReadError                             Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname)
    484             try:
--> 485                 cnx.do_handshake()
    486             except OpenSSL.SSL.WantReadError:

/opt/conda/lib/python3.7/site-packages/OpenSSL/SSL.py in do_handshake(self)
   1933         result = _lib.SSL_do_handshake(self._ssl)
-> 1934         self._raise_ssl_error(self._ssl, result)
   1935 

/opt/conda/lib/python3.7/site-packages/OpenSSL/SSL.py in _raise_ssl_error(self, ssl, result)
   1645         if error == _lib.SSL_ERROR_WANT_READ:
-> 1646             raise WantReadError()
   1647         elif error == _lib.SSL_ERROR_WANT_WRITE:

WantReadError: 

During handling of the above exception, another exception occurred:

timeout                                   Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    375         try:
--> 376             self._validate_conn(conn)
    377         except (SocketTimeout, BaseSSLError) as e:

/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py in _validate_conn(self, conn)
    993         if not getattr(conn, "sock", None):  # AppEngine might not have  `.sock`
--> 994             conn.connect()
    995 

/opt/conda/lib/python3.7/site-packages/urllib3/connection.py in connect(self)
    393             server_hostname=server_hostname,
--> 394             ssl_context=context,
    395         )

/opt/conda/lib/python3.7/site-packages/urllib3/util/ssl_.py in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password)
    369         if HAS_SNI and server_hostname is not None:
--> 370             return context.wrap_socket(sock, server_hostname=server_hostname)
    371 

/opt/conda/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname)
    487                 if not util.wait_for_read(sock, sock.gettimeout()):
--> 488                     raise timeout("select timed out")
    489                 continue

timeout: select timed out

During handling of the above exception, another exception occurred:

ReadTimeoutError                          Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    448                     retries=self.max_retries,
--> 449                     timeout=timeout
    450                 )

/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    719             retries = retries.increment(
--> 720                 method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    721             )

/opt/conda/lib/python3.7/site-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    399             if read is False or not self._is_method_retryable(method):
--> 400                 raise six.reraise(type(error), error, _stacktrace)
    401             elif read is not None:

/opt/conda/lib/python3.7/site-packages/urllib3/packages/six.py in reraise(tp, value, tb)
    734                 raise value.with_traceback(tb)
--> 735             raise value
    736         finally:

/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    671                 headers=headers,
--> 672                 chunked=chunked,
    673             )

/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    378             # Py2 raises this as a BaseSSLError, Py3 raises it as socket timeout.
--> 379             self._raise_timeout(err=e, url=url, timeout_value=conn.timeout)
    380             raise

/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py in _raise_timeout(self, err, url, timeout_value)
    330             raise ReadTimeoutError(
--> 331                 self, url, "Read timed out. (read timeout=%s)" % timeout_value
    332             )

ReadTimeoutError: HTTPSConnectionPool(host='dds.cr.usgs.gov', port=443): Read timed out. (read timeout=5)

During handling of the above exception, another exception occurred:

ReadTimeout                               Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/srtm/data.py in retrieve_or_load_file_data(self, file_name)
    134         try:
--> 135             r = mod_requests.get(url, timeout=5)
    136         except mod_requests.exceptions.Timeout:

/opt/conda/lib/python3.7/site-packages/requests/api.py in get(url, params, **kwargs)
     74     kwargs.setdefault('allow_redirects', True)
---> 75     return request('get', url, params=params, **kwargs)
     76 

/opt/conda/lib/python3.7/site-packages/requests/api.py in request(method, url, **kwargs)
     59     with sessions.Session() as session:
---> 60         return session.request(method=method, url=url, **kwargs)
     61 

/opt/conda/lib/python3.7/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    532         send_kwargs.update(settings)
--> 533         resp = self.send(prep, **send_kwargs)
    534 

/opt/conda/lib/python3.7/site-packages/requests/sessions.py in send(self, request, **kwargs)
    645         # Send the request
--> 646         r = adapter.send(request, **kwargs)
    647 

/opt/conda/lib/python3.7/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    528             elif isinstance(e, ReadTimeoutError):
--> 529                 raise ReadTimeout(e, request=request)
    530             else:

ReadTimeout: HTTPSConnectionPool(host='dds.cr.usgs.gov', port=443): Read timed out. (read timeout=5)

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
<ipython-input-5-57d1607d6f55> in <module>
      5 # Get the elevation values for each data point within the polygon
      6 rm = RidgeMap(polygon, font=font.prop)
----> 7 values = rm.get_elevation_data(num_lines=220, elevation_pts=550)
      8 
      9 # Specify the colormap segment for this example - range from deep green to darkbrown

/opt/conda/lib/python3.7/site-packages/ridge_map/ridge_map.py in get_elevation_data(self, num_lines, elevation_pts, viewpoint)
    108             num_lines, elevation_pts = elevation_pts, num_lines
    109         values = self._srtm_data.get_image(
--> 110             (elevation_pts, num_lines), self.lats, self.longs, 5280, mode="array"
    111         )
    112 

/opt/conda/lib/python3.7/site-packages/srtm/data.py in get_image(self, size, latitude_interval, longitude_interval, max_elevation, min_elevation, unknown_color, zero_color, min_color, max_color, mode)
    206                     latitude  = latitude_from  + float(row) / height * (latitude_to  - latitude_from)
    207                     longitude = longitude_from + float(column) / width * (longitude_to - longitude_from)
--> 208                     elevation = self.get_elevation(latitude, longitude)
    209                     array[row,column] = elevation
    210 

/opt/conda/lib/python3.7/site-packages/srtm/data.py in get_elevation(self, latitude, longitude, approximate)
     49 
     50     def get_elevation(self, latitude: float, longitude: float, approximate: bool=False) -> Optional[float]:
---> 51         geo_elevation_file = self.get_file(float(latitude), float(longitude))
     52 
     53         #mod_logging.debug('File for ({0}, {1}) -> {2}'.format(

/opt/conda/lib/python3.7/site-packages/srtm/data.py in get_file(self, latitude, longitude)
     96             return self.files[file_name]
     97         else:
---> 98             data = self.retrieve_or_load_file_data(file_name)
     99             if not data:
    100                 return None

/opt/conda/lib/python3.7/site-packages/srtm/data.py in retrieve_or_load_file_data(self, file_name)
    135             r = mod_requests.get(url, timeout=5)
    136         except mod_requests.exceptions.Timeout:
--> 137             raise Exception('Connection to %s failed (timeout)' % url)
    138         if r.status_code < 200 or 300 <= r.status_code:
    139             raise Exception('Cannot retrieve %s' % url)

Exception: Connection to https://dds.cr.usgs.gov/srtm/version2_1/SRTM3//Eurasia//N49W007.hgt.zip failed (timeout)

[python is 3.7.6, OS is Ubuntu 20.04]

tkrajina commented 3 years ago

But this now looks like a completely different exception:

ReadTimeout: HTTPSConnectionPool(host='dds.cr.usgs.gov', port=443): Read timed out. (read timeout=5)

Are you sure this isn't a temporary network issue?

perllaghu commented 3 years ago

could be... but why does it work from the command line & not in a notebook?

Can you repeat my tests on a different network?


mario947 commented 3 years ago

I've just got the same issue and see same errors in log:

HTTPSConnectionPool(host='dds.cr.usgs.gov', port=443): Read timed out. (read timeout=5)

perllaghu commented 3 years ago

Phew....

So I don't know if the problem is tied up in Jupyter notebook or somewhere else... however it's only arisen in the last few days..


mario947 commented 3 years ago

@perllaghu I don't think it's related to Jupyter, as I don't use it. I run my script from the console and see the error whenever it tries to query new data from "dds.cr.usgs.gov". It works with already cached files though. Maybe that is the reason why it works for you in the console.

mario947 commented 3 years ago

It seems the "dds.cr.usgs.gov" servers have started to respond really slowly, and that exception is the same as in https://github.com/tkrajina/srtm.py/issues/45. I've increased the timeout and everything started to work again.
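For anyone who cannot patch the library, the same idea can be applied around any download call: retry with a progressively longer timeout. This is a rough sketch, not what srtm.py does; `fetch_with_growing_timeout` and `fake_slow_fetch` are hypothetical names, and the timeout values are arbitrary.

```python
from typing import Callable, Sequence, Tuple, Type, TypeVar

T = TypeVar("T")


def fetch_with_growing_timeout(
    fetch: Callable[[str, float], T],
    url: str,
    timeouts: Sequence[float] = (5, 15, 30),
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
) -> T:
    """Call fetch(url, timeout) with each timeout in turn until one succeeds."""
    last_error = None
    for timeout in timeouts:
        try:
            return fetch(url, timeout)
        except retry_on as error:
            # Remember the failure and try again with the next, longer timeout.
            last_error = error
    raise Exception("Connection to %s failed (timeout)" % url) from last_error


def fake_slow_fetch(url: str, timeout: float) -> str:
    """Stand-in for a slow server: only answers if given at least 10 seconds."""
    if timeout < 10:
        raise TimeoutError("read timeout=%s" % timeout)
    return "data from %s" % url


print(fetch_with_growing_timeout(fake_slow_fetch, "https://example.invalid/tile.hgt.zip"))
```

With requests, the callable could be `lambda u, t: requests.get(u, timeout=t)` together with `retry_on=(requests.exceptions.Timeout,)`.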

perllaghu commented 3 years ago

Interesting....

I think the problem is, as you say, network related.... I'm coming in from Scotland.

Tracepath connection using straight broadband connection:

1?: [LOCALHOST]                      pmtu 1500
 1:  SkyRouter.Home                                       22.891ms 
 1:  SkyRouter.Home                                       22.528ms 
 2:  no reply
 3:  be365.pr2.hobir.isp.sky.com                          33.363ms asymm  4 
 4:  no reply
 5:  no reply
 6:  xe-11-1-5.BR2.NYC4.ALTER.NET                         98.615ms asymm  9 
 7:  0.ae1.GW7.MSP3.ALTER.NET                            125.258ms asymm 13 
 8:  usgs-gw.customer.alter.net                          133.584ms asymm 13 
 9:  152.61.101.59                                       132.672ms asymm 14 
10:  no reply
11:  no reply
12:  no reply
13:  no reply
14:  no reply
15:  no reply
16:  no reply
17:  no reply
18:  no reply
19:  no reply
20:  no reply
21:  no reply
22:  no reply
23:  no reply
24:  no reply
25:  no reply
26:  no reply
27:  no reply
28:  no reply
29:  no reply
30:  no reply
     Too many hops: pmtu 1500
     Resume: pmtu 1500 

..... and the same via a VPN through my University:

 1?: [LOCALHOST]                      pmtu 1500
 1:  _gateway                                             34.356ms 
 1:  _gateway                                             35.285ms 
 2:  129.215.128.253                                      64.555ms 
 3:  no reply
 4:  no reply
 5:  10.66.4.5                                            43.359ms 
 6:  192.41.103.211                                       61.928ms 
 7:  ae1.edinat-rbr1.aj.net                               63.054ms asymm  8 
 8:  ae29.glasss-sbr1.ja.net                              64.068ms asymm  9 
 9:  ae31.manckh-sbr2.ja.net                              71.894ms asymm 10 
10:  ae29.erdiss-sbr2.ja.net                              69.977ms asymm 11 
11:  ae31.londpg-sbr2.ja.net                              74.090ms asymm 12 
12:  ae29.londhx-sbr1.ja.net                              74.810ms asymm 13 
13:  janet.mx1.lon.uk.geant.net                           74.656ms asymm 14 
14:  internet2-gw.mx1.lon.uk.geant.net                   159.114ms asymm 15 
15:  ae-0.4079.rtsw2.ashb.net.internet2.edu              138.059ms asymm 16 
16:  ae-2.4079.rtsw.ashb.net.internet2.edu               133.818ms asymm 17 
17:  ae-20.4079.rtsw.clev.net.internet2.edu              138.145ms asymm 18 
18:  ae-3.4079.rtsw3.eqch.net.internet2.edu              179.370ms asymm 19 
19:  ae-5.4079.rtsw.eqch.net.internet2.edu               148.930ms asymm 20 
20:  ae-0.4079.rtsw.star.net.internet2.edu               169.228ms asymm 21 
21:  152.61.101.185                                      138.671ms asymm 22 
22:  152.61.101.129                                      151.938ms asymm 23 
23:  no reply
24:  no reply
25:  no reply
26:  no reply
27:  no reply
28:  no reply
29:  no reply
30:  no reply
     Too many hops: pmtu 1500
     Resume: pmtu 1500

Both seem to die hitting 152.61.101.?? - which whois reports as "United States Geological Survey - EROS Data Center (EDC-14)" .... so, outwith your control.

Do we know if the data's replicated elsewhere? Would there be a reasonable mod to the codebase to have SRTM1_URL & SRTM3_URL as settable variables (and if so, give a documented example?) [I realise this may be impractical, if different systems provide data in different files/APIs/etc]
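To make the suggestion concrete, here is a sketch of what a settable SRTM3_URL could look like if it were read from an environment variable. This is an assumption about one possible design, not existing srtm.py behaviour; `srtm3_tile_url` and `DEFAULT_SRTM3_URL` are hypothetical names, and the default is simply the base URL from the error message above.

```python
import os
from typing import Mapping

# Base URL taken from the failing request in this thread.
DEFAULT_SRTM3_URL = "https://dds.cr.usgs.gov/srtm/version2_1/SRTM3"


def srtm3_tile_url(region: str, tile: str, environ: Mapping[str, str] = os.environ) -> str:
    """Build a tile URL, letting an SRTM3_URL environment variable override the base.

    Hypothetical helper: it assumes a mirror uses the same region/tile layout.
    """
    base = environ.get("SRTM3_URL", DEFAULT_SRTM3_URL)
    # Strip stray slashes so the parts join with exactly one "/" each.
    parts = [base.rstrip("/"), region.strip("/"), tile.strip("/")]
    return "/".join(parts)


print(srtm3_tile_url("Eurasia", "N49W007.hgt.zip"))
```

A mirror could then be selected with something like `SRTM3_URL=https://mirror.example/srtm3 python script.py`, where the mirror address is purely a placeholder.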

..... either way - thank you for the time & effort in looking into this problem

tkrajina commented 3 years ago

Thank you both for helping figure out the problem. I have now added a configurable timeout in master and also raised the default timeout to 15s.

As for configurable SRTM URLs: it's certainly possible, but it's not at the top of my priorities for now. (Of course, feel free to send pull requests.)