GeospatialPython / pyshp

This library reads and writes ESRI Shapefiles in pure Python.
MIT License

pyshp v. 1.2.11 reads non-string field values as None #108

Closed martinburch closed 7 years ago

martinburch commented 7 years ago

This is a regression from 1.2.10; see https://gis.stackexchange.com/questions/252104/python-pyshp-reads-some-field-values-as-none?noredirect=1#comment396899_252104

Reading the records of AL082017_pts.shp, a recent hurricane path file, with pyshp returns None for many of the field values.

Here's the first record:

    import shapefile

    sf = shapefile.Reader('AL082017_pts.shp')
    print(sf.record(0))

Output:

    ['GENESIS014', None, None, '08', None, '0600', None, 'al', None, 'DB', None, None, None, None]

The correct field values, as can be seen in the dbf, are:

GENESIS014 2017080206 2017 08 2 0600 1012 al 8 DB 20 0 9.5 -13

karimbahgat commented 7 years ago

Indeed, the problem was with numeric fields, but the cause seems to be twofold.

On the one hand, pyshp usually reads numeric fields correctly, but in this case it failed because the driver that wrote the shapefile wrote int fields "incorrectly" as floats, that is, with ".0" at the end. The numeric fields that fail are all defined as ints (decimal=0), so pyshp tries to convert the string to int, but calling int on a float-formatted string (for example int("2.0")) raises a ValueError instead of returning a number.
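The failure mode described above can be reproduced in plain Python without pyshp. This is only an illustration: the raw string below is a made-up example of how such a field might be stored, and the fallback to None imitates the reported symptom rather than pyshp's actual code.

```python
# Hypothetical raw text of a numeric field declared with decimal=0,
# but written by the driver with a trailing ".0".
raw = "1012.0"

# Strict conversion: int() rejects a float-formatted string.
try:
    strict_value = int(raw)
except ValueError:
    strict_value = None  # imitates the None seen in the reported output

# Going through float first recovers the intended integer.
lenient_value = int(float(raw))

print(strict_value, lenient_value)
```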

On the other hand, this was caused by an attempt in v1.2.11 to root out errors in the reading of field types and make it more robust. In doing so, the reader became much stricter about the values it accepts, in effect assuming perfectly written shapefiles.

This seems to be the inverse problem of #99, where users had trouble writing ints or strings to float fields etc.

I think the best solution now is to make it backwards compatible with the more lenient approach of v1.2.10, and handle minor data type errors in shapefiles by adding forced conversion. This has now been fixed in the 1.2.x branch and the master branch on GitHub, and will be included in release 1.2.12 shortly and the upcoming major version 2.0.0.
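A minimal sketch of what such a forced conversion could look like is given here. The helper name parse_numeric and its decimal parameter are hypothetical and chosen for illustration; this is not the code that went into pyshp 1.2.12.

```python
def parse_numeric(raw, decimal=0):
    """Leniently convert the raw text of a numeric field (hypothetical helper).

    raw     -- field text as read from the dbf, possibly padded with spaces
    decimal -- declared number of decimal places; 0 means an integer field
    """
    text = raw.strip()
    if not text:
        return None  # empty field: no value
    try:
        if decimal == 0:
            try:
                return int(text)
            except ValueError:
                # Tolerate ints written as floats, e.g. "1012.0".
                return int(float(text))
        return float(text)
    except (ValueError, OverflowError):
        # Unparseable content: give up on this value only.
        return None


print(parse_numeric("1012.0"))           # integer field written as a float
print(parse_numeric(" 9.5", decimal=1))  # ordinary float field
```

Trying int first keeps well-formed integer fields on the exact path and only falls back to the float round trip when the text is not a plain integer.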

karimbahgat commented 7 years ago

Version 1.2.12, which fixes this issue, is now up on PyPI.