Open P-Tse opened 3 years ago
I can replicate this. It is only an issue with the BasicReader, so the simplest workaround is to install Fiona with pip or conda:
pip install fiona
I'm guessing this probably came in with https://github.com/SciTools/cartopy/pull/1653, and the yields may be coming from the same generator somewhere within pyshp. Using `zip(shp.records(), shp.geometries(), shp.records())` produces a third of the results, as expected in that case.
pinging @karimbahgat because you may have a better idea on the proper fix for this and whether it requires something on pyshp's side or Cartopy's.
I'll just toss this out here (I don't have a chance to dig in at the moment): that would also imply that incorrect tuples are coming out of that loop as well, `(record[i], state[i+1])` rather than the desired `(record[i], state[i])`.
I took a quick look and I see what's happening here. First I downloaded the 50m NE states file and checked that the external pyshp lib is reading it correctly, with .records, .shapes, .iterRecords, .iterShapes, and .iterShapeRecords all retrieving the correct 100 entries.
Previously, cartopy's `.records()` retrieved both attributes and geometries using pyshp's `shapeRecord()` one at a time, while `.geometries()` retrieved only geometries using pyshp's `shape()` one at a time. What happened in #1653 is that we switched from loading these one at a time to using pyshp's streaming versions, `iterShapeRecords` and `iterShapes`. Since both `iterShapeRecords` and `iterShapes` yield from the .shp file and the user is iterating them at the same time inside the `zip`, the .shp file is consumed twice as fast. This is also why `zip(shp.records(), shp.geometries(), shp.records())` results in only a third of the results.
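As a rough illustration of the mechanism described above, here is a toy model in plain Python. `FakeShapefile` and its methods are hypothetical stand-ins, not pyshp or cartopy code; the only assumption is that every iterator seeks to the start when it first runs and then reads from one shared cursor.

```python
class FakeShapefile:
    """Toy stand-in for a reader whose iterators share one file cursor."""

    def __init__(self, n_entries):
        self._entries = list(range(n_entries))
        self._pos = 0  # shared cursor, like a single open .shp file handle

    def _stream(self, label):
        # Assumption: each iterator seeks to the start when it first runs,
        # then trusts the shared cursor for every later read.
        self._pos = 0
        while self._pos < len(self._entries):
            value = self._entries[self._pos]
            self._pos += 1
            yield (label, value)

    def records(self):
        return self._stream("record")

    def geometries(self):
        return self._stream("geometry")


shp = FakeShapefile(100)

# Two interleaved iterators advance the same cursor, so zip ends early.
pairs = list(zip(shp.records(), shp.geometries()))
print(len(pairs))
print(pairs[:3])

# Three interleaved iterators drain the cursor roughly three times as fast.
triples = list(zip(shp.records(), shp.geometries(), shp.records()))
print(len(triples))
```

In this model the first pair lines up, because the second iterator rewinds the cursor when it starts, but every later pair is shifted by one entry. That is consistent with the `(record[i], state[i+1])` mismatch raised earlier in the thread.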
This points to the fact that the issue is a result of incorrect usage, since zipping cartopy's `.records()` and `.geometries()` is redundant. The correct usage is to call only `.records()`, since this yields Record objects that already contain the geometry as well as the attributes; `.geometries()` is only for cases where geometries but not attributes are needed. Perhaps in a future version `.records()` could be renamed to `.features()` to avoid the confusion? So in practice this shouldn't be a very common problem, but it's definitely unexpected behavior and it would be best to cover such edge cases.
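A minimal sketch of the single-pass usage suggested above. The helper `attributes_and_geometries`, `FakeRecord`, and `FakeReader` are hypothetical names introduced so the example runs without cartopy or a shapefile download; the only assumption about the real reader is that each record exposes `.attributes` and `.geometry`.

```python
from collections import namedtuple


def attributes_and_geometries(reader):
    """Yield (attributes, geometry) pairs from one pass over reader.records()."""
    for record in reader.records():
        # Both pieces come from the same record, so they cannot drift apart.
        yield record.attributes, record.geometry


# Hypothetical stand-ins for illustration only.
FakeRecord = namedtuple("FakeRecord", ["attributes", "geometry"])


class FakeReader:
    def records(self):
        for i in range(3):
            yield FakeRecord({"name": f"state-{i}"}, f"geometry-{i}")


for attrs, geom in attributes_and_geometries(FakeReader()):
    print(attrs["name"], geom)
```

With cartopy, the object passed in would presumably be a `cartopy.io.shapereader.Reader` opened on the shapefile, with no call to `.geometries()` needed at all.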
In terms of a fix, I'll look into improving pyshp's iterators so that they remember their current progress rather than assume it hasn't been tampered with during the iteration. Until then, on cartopy's side the simplest way would probably be to switch back to the step-wise `shapeRecord()` and `shape()` calls, which should be fine. Let me know what y'all think, and I can submit the PR for this?
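One way to picture an iterator that "remembers its current progress" is sketched below. This is not pyshp's implementation; the fixed-size record layout and the names `RECORD`, `make_file`, and `iter_entries` are assumptions made purely for illustration. Each iterator keeps its own offset and seeks to it before every read, so another reader moving the shared file cursor does no harm.

```python
import io
import struct

# Hypothetical fixed-size entry: one little-endian 32-bit integer.
RECORD = struct.Struct("<i")


def make_file(n_entries):
    """Build an in-memory file holding n_entries consecutive integers."""
    return io.BytesIO(b"".join(RECORD.pack(i) for i in range(n_entries)))


def iter_entries(fileobj):
    """Iterate entries, re-seeking to this iterator's own offset each time."""
    offset = 0  # progress lives in the iterator, not in the shared cursor
    while True:
        fileobj.seek(offset)
        chunk = fileobj.read(RECORD.size)
        if len(chunk) < RECORD.size:
            return  # end of file reached
        offset = fileobj.tell()
        yield RECORD.unpack(chunk)[0]


shared = make_file(100)

# Both iterators read from the same file object, yet stay aligned.
pairs = list(zip(iter_entries(shared), iter_entries(shared)))
print(len(pairs))
print(pairs[:3])
```

The extra `seek` per entry costs a little, which is the trade-off against trusting the cursor.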
@dopplershift I'm not sure why there would be an offset of 1 for the geometries, could you expand? Does it have to do with the example code's `i` starting at 1 instead of the usual 0?
Description
When I run the following code, only half of the entries are iterated through in the loop (in this case `i` prints up to and including 50, although `len(list(shp.records()))` is 100). The behaviour is similar with other shapefiles, terminating at `(len(list(shp.records()))+1)//2`.
If I add list() around either shp.records() or shp.geometries(), then the loop will print to i=100.
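The effect of the `list()` workaround can be sketched with a small stand-in. `make_reader` is a hypothetical helper, not cartopy code, and it assumes the two iterators share a cursor that each one rewinds when it first runs. Materialising one side first means the two iterators never run interleaved.

```python
def make_reader(n_entries):
    """Return two iterator factories that share a single cursor."""
    state = {"pos": 0}  # stands in for the one open file handle

    def stream():
        state["pos"] = 0  # assumption: every iterator rewinds when it starts
        while state["pos"] < n_entries:
            idx = state["pos"]
            state["pos"] += 1
            yield idx

    return stream, stream


records, geometries = make_reader(100)

# Lazy on both sides: the iterators interleave and the loop ends early.
lazy = list(zip(records(), geometries()))

# list() drains the first iterator completely before the second one starts.
eager = list(zip(list(records()), geometries()))

print(len(lazy), len(eager))
```

The cost is that one full pass is held in memory, which may matter for large shapefiles.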
The bug is not present in v0.18.0. When I run the code below, without modifications, it prints to i = 100.
Code to reproduce
Traceback
Full environment definition
### Operating system Windows 10 Pro Build 19042 Python 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)] Anaconda distribution, Spyder ### Cartopy version 0.19.0.post1 ### conda list ``` # packages in environment at C:\ProgramData\Anaconda3: # # Name Version Build Channel _ipyw_jlab_nb_ext_conf 0.1.0 py38_0 alabaster 0.7.12 py_0 anaconda 2020.11 py38_0 anaconda-client 1.7.2 py38_0 anaconda-navigator 2.0.4 py38_0 anaconda-project 0.8.4 py_0 argh 0.26.2 py38_0 argon2-cffi 20.1.0 py38he774522_1 asn1crypto 1.4.0 py_0 astroid 2.4.2 py38_0 astropy 4.0.2 py38he774522_0 async_generator 1.10 py_0 atomicwrites 1.4.0 py_0 attrs 20.3.0 pyhd3eb1b0_0 autopep8 1.5.4 py_0 babel 2.8.1 pyhd3eb1b0_0 backcall 0.2.0 py_0 backports 1.0 py_2 backports.functools_lru_cache 1.6.4 pyhd3eb1b0_0 backports.shutil_get_terminal_size 1.0.0 py38_2 backports.tempfile 1.0 pyhd3eb1b0_1 backports.weakref 1.0.post1 py_1 bcrypt 3.2.0 py38he774522_0 beautifulsoup4 4.9.3 pyhb0f4dca_0 bitarray 1.6.1 py38h2bbff1b_0 bkcharts 0.2 py38_0 blas 1.0 mkl bleach 3.2.1 py_0 blosc 1.20.1 h7bd577a_0 bokeh 2.2.3 py38_0 boto 2.49.0 py38_0 bottleneck 1.3.2 py38h2a96729_1 brotlipy 0.7.0 py38he774522_1000 bzip2 1.0.8 he774522_0 ca-certificates 2020.10.14 0 cartopy 0.19.0.post1 py38hd4bff75_0 conda-forge certifi 2020.6.20 pyhd3eb1b0_3 cffi 1.14.3 py38h7a1dbc1_0 chardet 3.0.4 py38_1003 click 7.1.2 py_0 cloudpickle 1.6.0 py_0 clyent 1.2.2 py38_1 colorama 0.4.4 py_0 comtypes 1.1.7 py38_1001 conda 4.10.3 py38haa244fe_0 conda-forge conda-build 3.20.5 py38_1 conda-content-trust 0.1.1 pyhd3eb1b0_0 conda-env 2.6.0 1 conda-package-handling 1.7.3 py38h8cc25b3_1 conda-repo-cli 1.0.4 pyhd3eb1b0_0 conda-token 0.3.0 pyhd3eb1b0_0 conda-verify 3.4.2 py_1 console_shortcut 0.1.1 4 contextlib2 0.6.0.post1 py_0 cryptography 3.1.1 py38h7a1dbc1_0 curl 7.71.1 h2a8f88b_1 cycler 0.10.0 py38_0 cython 0.29.21 py38ha925a31_0 cytoolz 0.11.0 py38he774522_0 dask 2.30.0 py_0 dask-core 2.30.0 py_0 decorator 4.4.2 py_0 defusedxml 0.6.0 py_0 
descartes 1.1.0 py_4 diff-match-patch 20200713 py_0 distributed 2.30.1 py38haa95532_0 docutils 0.16 py38_1 entrypoints 0.3 py38_0 et_xmlfile 1.0.1 py_1001 fastcache 1.1.0 py38he774522_0 filelock 3.0.12 py_0 flake8 3.8.4 py_0 flask 1.1.2 py_0 freetype 2.10.4 hd328e21_0 fsspec 0.8.3 py_0 future 0.18.2 py38_1 geos 3.9.1 h39d44d4_2 conda-forge get_terminal_size 1.0.0 h38e98db_0 gevent 20.9.0 py38he774522_0 glob2 0.7 py_0 greenlet 0.4.17 py38he774522_0 h5py 2.10.0 py38h5e291fa_0 hdf5 1.10.4 h7ebc959_0 heapdict 1.0.1 py_0 html5lib 1.1 py_0 icc_rt 2019.0.0 h0cc432a_1 icu 58.2 ha925a31_3 idna 2.10 py_0 imageio 2.9.0 py_0 imagesize 1.2.0 py_0 importlib-metadata 2.0.0 py_1 importlib_metadata 2.0.0 1 iniconfig 1.1.1 py_0 intel-openmp 2020.2 254 intervaltree 3.1.0 py_0 ipykernel 5.3.4 py38h5ca1d4c_0 ipython 7.19.0 py38hd4e2768_0 ipython_genutils 0.2.0 py38_0 ipywidgets 7.5.1 py_1 isort 5.6.4 py_0 itsdangerous 1.1.0 py_0 jdcal 1.4.1 py_0 jedi 0.17.1 py38_0 jinja2 2.11.2 py_0 joblib 0.17.0 py_0 jpeg 9b hb83a4c4_2 json5 0.9.5 py_0 jsonschema 3.2.0 py_2 jupyter 1.0.0 py38_7 jupyter_client 6.1.7 py_0 jupyter_console 6.2.0 py_0 jupyter_core 4.6.3 py38_0 jupyterlab 2.2.6 py_0 jupyterlab_pygments 0.1.2 py_0 jupyterlab_server 1.2.0 py_0 keyring 21.4.0 py38_1 kiwisolver 1.3.0 py38hd77b12b_0 krb5 1.18.2 hc04afaa_0 lazy-object-proxy 1.4.3 py38he774522_0 libarchive 3.4.2 h5e25573_0 libcurl 7.71.1 h2a8f88b_1 libiconv 1.15 h1df5818_7 liblief 0.10.1 ha925a31_0 libpng 1.6.37 h2a8f88b_0 libsodium 1.0.18 h62dcd97_0 libspatialindex 1.9.3 h33f27b4_0 libssh2 1.9.0 h7a1dbc1_1 libtiff 4.1.0 h56a325e_1 libxml2 2.9.10 hb89e7f3_3 libxslt 1.1.34 he774522_0 llvmlite 0.34.0 py38h1a82afc_4 locket 0.2.0 py38_1 lxml 4.6.1 py38h1350720_0 lz4-c 1.9.2 hf4a77e7_3 lzo 2.10 he774522_2 m2w64-gcc-libgfortran 5.3.0 6 m2w64-gcc-libs 5.3.0 7 m2w64-gcc-libs-core 5.3.0 7 m2w64-gmp 6.1.0 2 m2w64-libwinpthread-git 5.0.0.4634.697f757 2 markupsafe 1.1.1 py38he774522_0 matplotlib 3.3.2 0 matplotlib-base 3.3.2 py38hba9282a_0 
mccabe 0.6.1 py38_1 menuinst 1.4.16 py38he774522_1 mistune 0.8.4 py38he774522_1000 mkl 2020.2 256 mkl-service 2.3.0 py38hb782905_0 mkl_fft 1.2.0 py38h45dec08_0 mkl_random 1.1.1 py38h47e9c7a_0 mock 4.0.2 py_0 more-itertools 8.6.0 pyhd3eb1b0_0 mpmath 1.1.0 py38_0 msgpack-python 1.0.0 py38h74a9793_1 msys2-conda-epoch 20160418 1 multipledispatch 0.6.0 py38_0 navigator-updater 0.2.1 py38_0 nbclient 0.5.1 py_0 nbconvert 6.0.7 py38_0 nbformat 5.0.8 py_0 nest-asyncio 1.4.2 pyhd3eb1b0_0 networkx 2.5 py_0 nltk 3.5 py_0 nose 1.3.7 py38_2 notebook 6.1.4 py38_0 numba 0.51.2 py38hf9181ef_1 numexpr 2.7.1 py38h25d0782_0 numpy 1.19.2 py38hadc3359_0 numpy-base 1.19.2 py38ha3acd2a_0 numpydoc 1.1.0 pyhd3eb1b0_1 olefile 0.46 py_0 openpyxl 3.0.5 py_0 openssl 1.1.1h he774522_0 packaging 20.4 py_0 pandas 1.1.3 py38ha925a31_0 pandoc 2.11 h9490d1a_0 pandocfilters 1.4.3 py38haa95532_1 paramiko 2.7.2 py_0 parso 0.7.0 py_0 partd 1.1.0 py_0 path 15.0.0 py38_0 path.py 12.5.0 0 pathlib2 2.3.5 py38_0 pathtools 0.1.2 py_1 patsy 0.5.1 py38_0 pep8 1.7.1 py38_0 pexpect 4.8.0 py38_0 pickleshare 0.7.5 py38_1000 pillow 8.0.1 py38h4fa10fc_0 pip 20.2.4 py38haa95532_0 pkginfo 1.6.1 py38haa95532_0 pluggy 0.13.1 py38_0 ply 3.11 py38_0 portaudio 19.6.0 he774522_4 powershell_shortcut 0.0.1 3 proj 7.2.0 h3e70539_0 conda-forge prometheus_client 0.8.0 py_0 prompt-toolkit 3.0.8 py_0 prompt_toolkit 3.0.8 0 psutil 5.7.2 py38he774522_0 py 1.9.0 py_0 py-lief 0.10.1 py38ha925a31_0 pycodestyle 2.6.0 py_0 pycosat 0.6.3 py38he774522_0 pycparser 2.20 py_2 pycurl 7.43.0.6 py38h7a1dbc1_0 pydocstyle 5.1.1 py_0 pyflakes 2.2.0 py_0 pygments 2.7.2 pyhd3eb1b0_0 pylint 2.6.0 py38_0 pynacl 1.4.0 py38h62dcd97_1 pyodbc 4.0.30 py38ha925a31_0 pyopenssl 19.1.0 py_1 pyparsing 2.4.7 py_0 pyqt 5.9.2 py38ha925a31_4 pyreadline 2.1 py38_1 pyrsistent 0.17.3 py38he774522_0 pyshp 2.1.3 pyh44b312d_0 conda-forge pysocks 1.7.1 py38_0 pytables 3.6.1 py38ha5be198_0 pytest 6.1.1 py38_0 python 3.8.5 h5fd99cc_1 python-dateutil 2.8.1 py_0 
python-jsonrpc-server 0.4.0 py_0 python-language-server 0.35.1 py_0 python-libarchive-c 2.9 py_0 python-sounddevice 0.4.1 pyh9f0ad1d_0 conda-forge python_abi 3.8 2_cp38 conda-forge pytz 2020.1 py_0 pywavelets 1.1.1 py38he774522_2 pywin32 227 py38he774522_1 pywin32-ctypes 0.2.0 py38_1000 pywinpty 0.5.7 py38_0 pyyaml 5.3.1 py38he774522_1 pyzmq 19.0.2 py38ha925a31_1 qdarkstyle 2.8.1 py_0 qt 5.9.7 vc14h73c81de_0 qtawesome 1.0.1 py_0 qtconsole 4.7.7 py_0 qtpy 1.9.0 py_0 regex 2020.10.15 py38he774522_0 requests 2.24.0 py_0 rope 0.18.0 py_0 rtree 0.9.4 py38h21ff451_1 ruamel_yaml 0.15.87 py38he774522_1 scikit-image 0.17.2 py38h1e1f486_0 scikit-learn 0.23.2 py38h47e9c7a_0 scipy 1.5.2 py38h14eb087_0 seaborn 0.11.0 py_0 send2trash 1.5.0 py38_0 setuptools 50.3.1 py38haa95532_1 shapely 1.7.1 py38h13ff51f_5 conda-forge simplegeneric 0.8.1 py38_2 singledispatch 3.4.0.3 py_1001 sip 4.19.13 py38ha925a31_0 six 1.15.0 py38haa95532_0 snowballstemmer 2.0.0 py_0 sortedcollections 1.2.1 py_0 sortedcontainers 2.2.2 py_0 soupsieve 2.0.1 py_0 sphinx 3.2.1 py_0 sphinxcontrib 1.0 py38_1 sphinxcontrib-applehelp 1.0.2 py_0 sphinxcontrib-devhelp 1.0.2 py_0 sphinxcontrib-htmlhelp 1.0.3 py_0 sphinxcontrib-jsmath 1.0.1 py_0 sphinxcontrib-qthelp 1.0.3 py_0 sphinxcontrib-serializinghtml 1.1.4 py_0 sphinxcontrib-websupport 1.2.4 py_0 spyder 4.1.5 py38_0 spyder-kernels 1.9.4 py38_0 sqlalchemy 1.3.20 py38h2bbff1b_0 sqlite 3.33.0 h2a8f88b_0 statsmodels 0.12.0 py38he774522_0 sympy 1.6.2 py38haa95532_1 tblib 1.7.0 py_0 terminado 0.9.1 py38_0 testpath 0.4.4 py_0 threadpoolctl 2.1.0 pyh5ca1d4c_0 tifffile 2020.10.1 py38h8c2d366_2 tk 8.6.10 he774522_0 toml 0.10.1 py_0 toolz 0.11.1 py_0 tornado 6.0.4 py38he774522_1 tqdm 4.50.2 py_0 traitlets 5.0.5 py_0 typing_extensions 3.7.4.3 py_0 ujson 4.0.1 py38ha925a31_0 uk-covid19 1.2.2 pypi_0 pypi unicodecsv 0.14.1 py38_0 urllib3 1.25.11 py_0 vc 14.1 h0510ff6_4 vs2015_runtime 14.16.27012 hf0eaf9b_3 watchdog 0.10.3 py38_0 wcwidth 0.2.5 py_0 webencodings 0.5.1 py38_1 
werkzeug 1.0.1 py_0 wheel 0.35.1 py_0 widgetsnbextension 3.5.1 py38_0 win_inet_pton 1.1.0 py38_0 win_unicode_console 0.5 py38_0 wincertstore 0.2 py38_0 winpty 0.4.3 4 wrapt 1.11.2 py38he774522_0 xlrd 1.2.0 py_0 xlsxwriter 1.3.7 py_0 xlwings 0.20.8 py38_0 xlwt 1.3.0 py38_0 xmltodict 0.12.0 py_0 xz 5.2.5 h62dcd97_0 yaml 0.2.5 he774522_0 yapf 0.30.0 py_0 zeromq 4.3.2 ha925a31_3 zict 2.0.0 py_0 zipp 3.4.0 pyhd3eb1b0_0 zlib 1.2.11 h62dcd97_4 zope 1.0 py38_1 zope.event 4.5.0 py38_0 zope.interface 5.1.2 py38he774522_0 zstd 1.4.5 h04227a9_0 ``` ### pip list ``` ```