SciTools / cartopy

Cartopy - a cartographic python library with matplotlib support
https://scitools.org.uk/cartopy/docs/latest
BSD 3-Clause "New" or "Revised" License

Clarifying the 'under the hood' data download (and other GIS stuff) #1325

Open jypeter opened 5 years ago

jypeter commented 5 years ago

I have just used ax_plot.add_feature(cfeature.LAND) for the first time, which triggered an external data download

/home/share/unix_files/cdat/miniconda3/envs/cdatm_py2/lib/python2.7/site-packages/cartopy/io/__init__.py:260: DownloadWarning: Downloading: http://naciscdn.org/naturalearth/110m/physical/ne_110m_land.zip
  warnings.warn('Downloading: {}'.format(url), DownloadWarning)

This went quite well, but it reminded me that such a download recently failed when using cartopy on a supercomputer with no outside world access. I was in a hurry and did not write down the error message, so I can't say if it was friendly and informative or not (I figured out what it was about, and a workaround, but you want to make this easy for all users)

Anyway, I think it would be nice to add a page somewhere about:

And you could then add a link to this new page from The cartopy Feature interface page and other relevant places

joernu76 commented 5 years ago

I second this. We plan to use cartopy for a flight planning tool used in remote locations with bad internet connection and want to automatically install it and the necessary data fully before going abroad.

cpatrizio88 commented 5 years ago

@jypeter I just encountered the same issue working on a supercomputer. Did you happen to figure out a workaround for this?

trexfeathers commented 5 years ago

+1 for this. I recently tried to guide a supercomputer user through this issue by debugging and picking apart the various functions. I missed some steps, confusing the situation

So a thorough guide would be very helpful

jypeter commented 5 years ago

I had completely forgotten about this question until you asked about it yesterday. I also had to plot a map during a conference, which triggered the download on my laptop (WiFi was fortunately working). I have just tried to find out what was downloaded:

(cdatm_py2) jypeter@lsce4078:~$ ls -ltrRh ~/.local/share/cartopy/shapefiles
/home/jypeter/.local/share/cartopy/shapefiles:
total 0
drwxrwxrwx 1 jypeter jypeter 512 Jul 29 16:09 natural_earth

/home/jypeter/.local/share/cartopy/shapefiles/natural_earth:
total 0
drwxrwxrwx 1 jypeter jypeter 512 Jul 29 16:10 physical

/home/jypeter/.local/share/cartopy/shapefiles/natural_earth/physical:
total 96K
-rw-rw-rw- 1 jypeter jypeter  88K Jul 29 16:10 ne_110m_coastline.shp
-rw-rw-rw- 1 jypeter jypeter 3.7K Jul 29 16:10 ne_110m_coastline.dbf
-rw-rw-rw- 1 jypeter jypeter 1.2K Jul 29 16:10 ne_110m_coastline.shx

So the workaround would probably be to copy those files from a computer with network access to the .local directory of all the users who need them on a supercomputer. Can somebody give this a try?
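The copy step could be sketched roughly as follows. This is only an illustration using the Python standard library; the function names `pack_cartopy_data` and `unpack_cartopy_data` are made up for this example and are not part of cartopy. It assumes the data lives under a `shapefiles` folder inside the data directory, as in the listing above.

```python
import shutil
from pathlib import Path


def pack_cartopy_data(data_dir, archive_base):
    """Zip up data_dir/shapefiles so it can be copied to an offline machine."""
    data_dir = Path(data_dir).expanduser()
    if not (data_dir / "shapefiles").is_dir():
        raise FileNotFoundError(f"no shapefiles directory under {data_dir}")
    # root_dir/base_dir keep the paths inside the archive relative to
    # data_dir, so they start with "shapefiles/natural_earth/...".
    return shutil.make_archive(str(archive_base), "zip",
                               root_dir=str(data_dir), base_dir="shapefiles")


def unpack_cartopy_data(archive_path, data_dir):
    """Extract an archive made by pack_cartopy_data into data_dir."""
    data_dir = Path(data_dir).expanduser()
    data_dir.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive_path), str(data_dir))
    return data_dir / "shapefiles"
```

On the networked machine one would call `pack_cartopy_data("~/.local/share/cartopy", "cartopy_data")`, transfer `cartopy_data.zip`, and call `unpack_cartopy_data` on the offline side for each user who needs the files.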

I'd rather have a solution where cartopy first checks if the data files are available in a 'conda installation' centralized location, and then checks the ~/.local directory

@bjlittle any ideas here?

cpatrizio88 commented 5 years ago

@jypeter I have all of the files in ~/.local/share/cartopy/shapefiles/natural_earth/physical

and still no luck unfortunately. I've verified that my cartopy.config['data_dir'] is pointing to that directory as well.

cpatrizio88 commented 5 years ago

#1072 is relevant here too. Looks like they had some success with the above method, but it's not working for me...

jypeter commented 5 years ago

@cpatrizio88 maybe you can try the download tool mentioned in Location of stored offline data for cartopy

We probably have to play with this, and possibly use pre_existing_data_dir for having multiple users point to the same data location.

However, when installing cartopy with conda, I'm not too sure how to initialize cleanly pre_existing_data_dir for everybody using this conda install without overwriting source code installed by conda, even after reading the cartopy.config documentation

I still think there should be a documentation page on the cartopy site connecting all the dots for this. Or maybe it is there and I have not found it yet

cpatrizio88 commented 5 years ago

@jypeter thanks for your suggestion. I got this working using the following steps:

  1. Make sure cartopy.config['data_dir'] = '~/.local/share/cartopy'

  2. Place the following files in ~/.local/share/cartopy/shapefiles/natural_earth/physical/:

ne_110m_coastline.dbf ne_110m_coastline.shp ne_110m_coastline.shx ne_110m_land.dbf ne_110m_land.shp ne_110m_land.shx

And that's all! This will allow m.add_feature(cart.feature.LAND) and m.add_feature(cart.feature.COASTLINE) to work offline. I'm sure the steps are similar for other cartopy features.

Just a note that setting the cartopy.config['data_dir'] to exactly where the shape files are being stored did not work for me. This made the problem more difficult than necessary I think.

Your point about having multiple users point to the same data directory is well taken though. I was just happy to get it working for myself for now.
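A quick way to check the two steps above before going offline might look like this. It is a sketch only: `missing_natural_earth_files` is a hypothetical helper, not a cartopy function, and it assumes the `shapefiles/natural_earth/<category>` layout and the three file extensions listed in the steps.

```python
from pathlib import Path

# A shapefile layer is stored as several sibling files; these are the three
# extensions listed in the steps above.
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_natural_earth_files(data_dir, layers, category="physical"):
    """Return the expected files that are absent under data_dir."""
    folder = (Path(data_dir).expanduser()
              / "shapefiles" / "natural_earth" / category)
    missing = []
    for layer in layers:
        for suffix in REQUIRED_SUFFIXES:
            candidate = folder / f"{layer}{suffix}"
            if not candidate.is_file():
                missing.append(candidate)
    return missing
```

For example, `missing_natural_earth_files(cartopy.config['data_dir'], ['ne_110m_land', 'ne_110m_coastline'])` should return an empty list when everything is in place. Note that `data_dir` is the parent folder (the one containing `shapefiles`), not the folder holding the .shp files. It is probably also safer to pass an already expanded path (for instance via `os.path.expanduser`), on the assumption that cartopy does not expand `~` itself.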

alpha-beta-soup commented 5 years ago

Just here to say that setting cartopy.config['data_dir'] does work fine for me. I'm doing this with a Docker build, only downloading the 10m coastlines. Here's a snippet of my Dockerfile:

# Download some NaturalEarth data for cartopy
ENV CARTOPY_DIR=/usr/local/cartopy-data
ENV NE_PHYSICAL=${CARTOPY_DIR}/shapefiles/natural_earth/physical
RUN mkdir -p ${NE_PHYSICAL}
RUN wget https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_coastline.zip -P ${CARTOPY_DIR}
RUN apt-get -yq install unzip
RUN unzip ${CARTOPY_DIR}/ne_10m_coastline.zip -d  ${NE_PHYSICAL}
RUN rm ${CARTOPY_DIR}/*.zip

And then in Python:

import os
import cartopy
cartopy.config['data_dir'] = os.getenv('CARTOPY_DIR', cartopy.config.get('data_dir'))

And then the repeated download is avoided (except for the initial one when the image is built).
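For setups without Docker, the unzip step of the Dockerfile could be approximated in plain Python. This is a sketch under the same layout assumption (`shapefiles/natural_earth/physical` below the data directory); `install_natural_earth_zip` is a hypothetical name, and the zip file is assumed to have been downloaded beforehand on a machine with network access.

```python
import zipfile
from pathlib import Path


def install_natural_earth_zip(zip_path, data_dir, category="physical"):
    """Extract a Natural Earth zip into data_dir/shapefiles/natural_earth/<category>."""
    target = Path(data_dir) / "shapefiles" / "natural_earth" / category
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Return the extracted file names so the caller can verify the result.
    return sorted(p.name for p in target.iterdir())
```

After calling it with the same directory that `cartopy.config['data_dir']` points to, the layer should be found locally, as in the Docker setup.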

SpacelySpaceSprockets commented 5 years ago

The above instructions haven't worked for me and this is a problem. I have several big customers who absolutely can't connect to the internet and need mapping features.

They have the maps in their .local/share/cartopy/... directory, I've verified that. I've tried setting the pre_existing_data_dir and that didn't seem to do anything.

And I second jypeter's suggestion that the documentation on this is lacking.

Here's their stack trace. There might be some formatting strangeness, since I had to convert a PDF of a scan to text...

/opt/anaconda/5.3.0/lib/python3.6/site-packages/cartopy/io/__init__.py:260: DownloadWarning: Downloading: http://naciscdn.org/naturalearth/110m/physical/ne_110m_ocean.zip
  warnings.warn('Downloading: {}'.format(url), DownloadWarning)
Traceback (most recent call last):
  File "/opt/anaconda/5.3.0/lib/python3.6/urllib/request.py", line 1318, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/opt/anaconda/5.3.0/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/opt/anaconda/5.3.0/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/opt/anaconda/5.3.0/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/opt/anaconda/5.3.0/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/opt/anaconda/5.3.0/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/opt/anaconda/5.3.0/lib/python3.6/http/client.py", line 936, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/opt/anaconda/5.3.0/lib/python3.6/socket.py", line 704, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/opt/anaconda/5.3.0/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/myWidget.py", line 240, in update_detections
    self.clear_plot()
  File "/myWidget.py", line 226, in clear_plot
    self.fig.canvas.draw()
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 437, in draw
    self.figure.draw(self.renderer)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/matplotlib/figure.py", line 1493, in draw
    renderer, self, artists, self.suppressComposite)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/matplotlib/image.py", line 141, in _draw_list_compositing_images
    a.draw(renderer)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/cartopy/mpl/geoaxes.py", line 385, in draw
    inframe=inframe)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/matplotlib/axes/_base.py", line 2635, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/matplotlib/image.py", line 141, in _draw_list_compositing_images
    a.draw(renderer)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/cartopy/mpl/feature_artist.py", line 137, in draw
    geoms = self._feature.intersecting_geometries(extent)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/cartopy/feature.py", line 120, in intersecting_geometries
    return (geom for geom in self.geometries() if
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/cartopy/feature.py", line 191, in geometries
    name=self.name)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/cartopy/io/shapereader.py", line 265, in natural_earth
    return ne_downloader.path(format_dict)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/cartopy/io/__init__.py", line 222, in path
    result_path = self.acquire_resource(target_path, format_dict)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/cartopy/io/shapereader.py", line 320, in acquire_resource
    shapefile_online = self._urlopen(url)
  File "/opt/anaconda/5.3.0/lib/python3.6/site-packages/cartopy/io/__init__.py", line 261, in _urlopen
    return urlopen(url)
  File "/opt/anaconda/5.3.0/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/anaconda/5.3.0/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/opt/anaconda/5.3.0/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/opt/anaconda/5.3.0/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/opt/anaconda/5.3.0/lib/python3.6/urllib/request.py", line 1346, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/opt/anaconda/5.3.0/lib/python3.6/urllib/request.py", line 1320, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>
Abort (core dumped)

pzsamscore commented 4 years ago

Note that if the 'pre_existing_data_dir' config key is set in ../cartopy/__init__.py, that is the first location cartopy looks in. If you set that key on a non-networked machine, it should prevent any download attempt, as long as the shapefiles cartopy is looking for have been copied to that directory. You can of course also set it inline in each script with cartopy.config['pre_existing_data_dir'] = path. I personally had issues getting this to work with the conda-forge build on Windows, documented in #1435. Using Christoph Gohlke's build for Windows, it works fine. I don't have access to a non-Windows machine to test, but given my recent difficulties with this same issue on Windows, something tells me there's a conda recipe issue somewhere along the way.
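The lookup order described here can be illustrated with a small stand-alone function. This is not cartopy's own code: `locate_shapefile` is a hypothetical helper, and it assumes that files under 'pre_existing_data_dir' use the same `shapefiles/natural_earth/<category>/ne_<resolution>_<name>.shp` layout seen under 'data_dir' earlier in this thread.

```python
from pathlib import Path


def locate_shapefile(config, category, resolution, name):
    """Return the first existing path, or None if a download would be needed."""
    relative = (Path("shapefiles") / "natural_earth" / category
                / f"ne_{resolution}_{name}.shp")
    # Order follows the comment above: pre-existing data first, then the
    # user-writable data_dir.
    for key in ("pre_existing_data_dir", "data_dir"):
        base = config.get(key)
        if not base:
            continue  # key unset or empty
        candidate = Path(base) / relative
        if candidate.is_file():
            return candidate
    return None
```

Running `locate_shapefile(cartopy.config, 'physical', '110m', 'land')` on the offline machine gives a rough idea of whether a download would be attempted. A `None` result suggests the files are not where they are expected.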

Anderson3 commented 4 years ago

Hello guys, today I ran into the same problem while deploying an application that uses Cartopy on a server. I tried downloading the .zip file and unzipping it in the respective directory (as I had seen in slacks), but without success. As another attempt, I took the files present in the same directory on my own machine (which runs Windows) and copied them to the server directory, and oddly enough that worked for me.

"ne_110m_coastline.dbf, ne_110m_coastline, ne_110m_coastline.shx, ne_110m_land.dbf, ne_110m_land, ne_110m_land.shx, ne_110m_ocean.dbf, ne_110m_ocean and ne_110m_ocean.shx."

So, if you want to test, open the directory in Windows and copy the files to the respective application folder you want, to gain access to the files offline.

kserradell commented 4 years ago

Hello, I had the same issue on a cluster with no outside network connection. The best solution for me was to download the data using the tools/feature_download.py script, copy it into a folder (lib/python3.7/site-packages/Cartopy-0.18.0-py3.7-linux-x86_64.egg/cartopy/data in my case) and then modify __init__.py to add that folder:

config = {'pre_existing_data_dir': 'YOUR_FOLDER',
          'data_dir': _data_dir,
          'repo_data_dir': os.path.join(os.path.dirname(__file__), 'data'),
          'downloaders': {},
          }

Then all users of the Cartopy package can plot their maps without problems.

georgemccabe commented 3 years ago

We have developed a workaround for supplying the map files to systems that cannot download them. However, it looks like the website that serves these files has been down all day: https://naciscdn.org It would be nice if cartopy could be obtained with these files included, so we don't have to rely on an internet connection and on this website being up to get everything we need to run.

Update: I learned of the script that is provided with the cartopy source code to obtain the maps. I also found the new location of the files that are automatically downloaded by cartopy: https://naturalearth.s3.amazonaws.com i.e. https://naturalearth.s3.amazonaws.com/110m_cultural/110m_cultural.zip

Will a patch be issued to cartopy to update these URLs?

acarapetis commented 3 years ago

@georgemccabe Not sure whether this S3 bucket is meant to replace the existing CDN - I think it's just an alternate source. That being said, my bet is always that S3 will be more reliable than just about any other source, so I'm updating my build systems to drop in this custom cartopy config:

_SOURCE_TEMPLATE = 'https://naturalearth.s3.amazonaws.com/{resolution}_{category}/ne_{resolution}_{name}.zip'

def update_config(config):
    """Configures cartopy to download NaturalEarth shapefiles from S3 instead
    of naciscdn."""
    from cartopy.io.shapereader import NEShpDownloader
    target_path_template = NEShpDownloader.default_downloader().target_path_template
    downloader = NEShpDownloader(url_template=_SOURCE_TEMPLATE,
                                 target_path_template=target_path_template)
    config['downloaders'][('shapefiles', 'natural_earth')] = downloader
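To see which URL the template above produces for a given layer, it can be filled in by hand. This sketch only uses `str.format` on the same template string; `natural_earth_url` is a hypothetical helper name, and whether a given file actually exists in the bucket is not checked.

```python
SOURCE_TEMPLATE = ('https://naturalearth.s3.amazonaws.com/'
                   '{resolution}_{category}/ne_{resolution}_{name}.zip')


def natural_earth_url(resolution, category, name):
    """Fill in the S3 URL template for one Natural Earth layer."""
    return SOURCE_TEMPLATE.format(resolution=resolution,
                                  category=category, name=name)


print(natural_earth_url('110m', 'physical', 'land'))
# https://naturalearth.s3.amazonaws.com/110m_physical/ne_110m_land.zip
```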

My deployment method:

usersitedir=$(python -c 'from __future__ import print_function; import site; print(site.getusersitepackages())')
mkdir -p "$usersitedir"/cartopy_userconfig
cp THE_PYTHON_SCRIPT_ABOVE.py "$usersitedir"/cartopy_userconfig/__init__.py

greglucas commented 3 years ago

@acarapetis, reading through this issue, I believe it is meant to replace it: https://github.com/nvkelso/natural-earth-vector/issues/445
It looks like it has been approved and merged on the AWS side: https://github.com/awslabs/open-data-registry/pull/853
It would be great if someone wants to submit a PR to update the download scripts to point to the new URL.

ritviksahajpal commented 2 years ago

Thanks for an excellent library! I had a question w.r.t. downloading the ne_shaded vhigh raster. When I run this command python cartopy_feature_download.py physical --output ., I only seem to download shapefiles and not any rasters. How do I get those? They do not seem to be downloaded automatically.

zmoon commented 2 years ago

Related to this issue, the feature download script help is wrong about the default location:

https://github.com/SciTools/cartopy/blob/c184eadc1dfc458fd102e668084667dd9c0efa59/tools/cartopy_feature_download.py#L115-L117

It's the user data dir, not the user cache dir. And even then, it's only the user data dir according to XDG, not (necessarily) according to the OS. That is ok, but an alternative could be to use platformdirs.

In any case, as the OP suggests, it could be useful to document somewhere what cartopy.config['data_dir'] defaults to, so people don't have to look through the code to find out.
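As a rough illustration of what an XDG-style default looks like, the sketch below computes the user data directory with a `cartopy` subfolder. It is an assumption about the convention, not a copy of cartopy's logic (which may, for example, fall back to another location when the directory is not writable); `xdg_cartopy_data_dir` is a hypothetical name.

```python
import os
from pathlib import Path


def xdg_cartopy_data_dir(environ=None, home=None):
    """Guess the default data_dir following the XDG base-directory convention."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    # XDG: use $XDG_DATA_HOME when set and non-empty, else ~/.local/share.
    base = environ.get("XDG_DATA_HOME") or str(home / ".local" / "share")
    return Path(base) / "cartopy"
```

Comparing `xdg_cartopy_data_dir()` with the actual value of `cartopy.config['data_dir']` on a given system is an easy way to see whether this guess holds there.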

ricardobarroslourenco commented 1 year ago

I have a general question (hopefully related to this thread). Is it possible to use the conda-forge offline data package ( https://anaconda.org/conda-forge/cartopy_offlinedata ) to provide the offline data? If so, how can I load it later? I'm an HPC user building a Docker container to be used as an Apptainer source, which will then be run on a cluster with no internet access...