geocompx / geocompy

Geocomputation with Python: an open source book
https://py.geocompx.org/

Section 7.2 invalid download link #272

Closed: zehuiyin closed this issue 4 weeks ago

zehuiyin commented 1 month ago

In section 7.2 (https://py.geocompx.org/07-read-write#sec-retrieving-open-data), using the link shown in the book results in HTTP Error 500: Internal Server Error.

Screenshot 2024-11-01 222337

I believe the download link for the world airports layer zip file has changed to https://naciscdn.org/naturalearth/10m/cultural/ne_10m_airports.zip.

Additionally, I am curious why this section sets a User-Agent header. I downloaded the zip file from the new link without configuring any headers. The data can be downloaded with the following code:

# %%
import urllib.request
import zipfile
# %%
filename = "output/ne_10m_airports.zip"

urllib.request.urlretrieve("https://naciscdn.org/naturalearth/10m/cultural/ne_10m_airports.zip",
                           filename)

f = zipfile.ZipFile(filename, 'r')
f.extractall('output')
f.close()

Robinlovelace commented 4 weeks ago

Thanks for reporting this @zehuiyin. Do you know of an alternative URL with the same data?

zehuiyin commented 4 weeks ago

Thanks for the quick response! Yes, the link https://naciscdn.org/naturalearth/10m/cultural/ne_10m_airports.zip downloads the same data, as shown in the code snippet in my previous comment. Downloading from this alternative URL also requires no header configuration.
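
For context on the header question, here is a minimal sketch of what header configuration can look like with `urllib.request`, in case a server turns away the default `Python-urllib` agent string. The helper names `build_request` and `download` and the default agent value are hypothetical, chosen for illustration; this is not necessarily how the book did it.

```python
import shutil
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Hypothetical helper: wrap a URL in a Request carrying a User-Agent header."""
    # Without this, urllib identifies itself as "Python-urllib/<version>"
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url, filename, user_agent="Mozilla/5.0"):
    """Hypothetical helper: stream the response body to a local file."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
```

With the naciscdn.org link, plain `urllib.request.urlretrieve` is reported to work in this thread, so a helper like this would only matter for servers that do check the agent string.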

Nowosad commented 4 weeks ago

The URL for this data on the https://www.naturalearthdata.com/downloads/10m-cultural-vectors/airports/ page is still https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_airports.zip, which is strange.

zehuiyin commented 4 weeks ago

If you copy and paste the link https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_airports.zip into the browser, it shows an error message, as below.

Screenshot 2024-11-02 102929

However, if you click the download link on that page (https://www.naturalearthdata.com/downloads/10m-cultural-vectors/airports/), the data downloads successfully. I checked the download source in Chrome's download history, and the file actually comes from https://naciscdn.org/naturalearth/10m/cultural/ne_10m_airports.zip instead.

Screenshot 2024-11-02 103334
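
Since the link published on the site and the link that actually serves the file can differ, one defensive option is to try several candidate URLs in order. The sketch below is only an illustration under that assumption; `download_first_available` is a hypothetical name, not something from the book or from Natural Earth.

```python
import shutil
import urllib.error
import urllib.request


def download_first_available(urls, filename):
    """Try each URL in order and return the first one that downloads successfully."""
    last_error = None
    for url in urls:
        try:
            # The output file is only opened once the server has answered
            with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
                shutil.copyfileobj(response, out)
            return url
        except urllib.error.URLError as err:
            # HTTPError (e.g. a 500 response) is a subclass of URLError
            last_error = err
    raise RuntimeError(f"no URL could be downloaded: {last_error}")
```

Called with the naturalearthdata.com link first and the naciscdn.org link second, it would fall through to the second one whenever the first raises an HTTP error, and the return value tells you which link was used.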

michaeldorman commented 4 weeks ago

Thanks @zehuiyin, @Nowosad, and @Robinlovelace!

This alternative code (as suggested by @zehuiyin) works:

import urllib.request
import zipfile

# Set URL+filename
url = 'https://naciscdn.org/naturalearth/10m/cultural/ne_10m_airports.zip'
filename = 'output/ne_10m_airports.zip'
# Download
urllib.request.urlretrieve(url, filename)
# Extract
f = zipfile.ZipFile(filename, 'r')
f.extractall('output')
f.close()

I suggest we make the change in the book.
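
As a possible variation on the snippet above (a sketch, not what was committed to the book), the extraction step can use a context manager so the archive is closed even if extraction fails, and the output directory can be created up front, since `urlretrieve` does not create missing directories. The helper names `prepare_output` and `extract_zip` are hypothetical.

```python
import zipfile
from pathlib import Path


def prepare_output(filename):
    """Hypothetical helper: create the parent directory of the download target."""
    Path(filename).parent.mkdir(parents=True, exist_ok=True)


def extract_zip(filename, out_dir="output"):
    """Hypothetical helper: extract an archive and return its member names."""
    # The with-block closes the archive automatically, replacing f.close()
    with zipfile.ZipFile(filename, "r") as f:
        f.extractall(out_dir)
        return f.namelist()
```

In the workflow above, `prepare_output(filename)` would run before `urllib.request.urlretrieve(url, filename)`, and `extract_zip(filename)` would replace the three `ZipFile` lines.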

michaeldorman commented 4 weeks ago

https://github.com/geocompx/geocompy/commit/12a256653757fbe31bd3d3a75f7f8b06b9410f04

Nowosad commented 4 weeks ago

Thanks @zehuiyin for letting us know, and @michaeldorman for updating the book.