GeospatialPython / pyshp

This library reads and writes ESRI Shapefiles in pure Python.
MIT License

Parsing of N fields #136

Closed by desalema70 6 years ago

desalema70 commented 6 years ago

Hi!

I'm using your library to extract data from shapefiles and have run into the following problem: many "N" field values in the DBF portion of my data are NUL-terminated and followed by what looks like garbage up to the maximum field length. For instance, a field defined as N 5.2 (8 chars) holds the value "05.11\0PA": the last two characters are obviously garbage, and they trip up the parser into returning None as the field's value.

I am not too familiar with the DBF format, but looking at the code I see that there is some support for NUL-terminated strings, where the NUL is simply removed (shapefile.py line 501). Would it be possible to clip the string at the first NUL instead? That would make the library more tolerant of malformed data. The same data, by the way, is readable by a number of other (non-Python) libraries, so I don't think this is likely to cause a compatibility problem. So, instead of doing this:

value = value.replace(b('\0'), b('')).strip()

I propose this:

value = value.split('\0')[0]
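To make the difference concrete, here is a minimal standalone sketch comparing the two approaches on the example value from above. The function names `parse_numeric_strip` and `parse_numeric_clip` are hypothetical and are not part of pyshp; the sketch uses Python 3 bytes literals rather than the library's `b()` helper, and it simplifies the numeric conversion to a plain `float()` call, which is an assumption about how the parsed text is ultimately interpreted.

```python
def parse_numeric_strip(raw: bytes):
    """Remove every NUL byte, then strip whitespace (the behaviour described above)."""
    value = raw.replace(b"\0", b"").strip()
    try:
        return float(value.decode("ascii", "replace"))
    except ValueError:
        # Garbage after the NUL is now glued onto the number, so parsing fails.
        return None


def parse_numeric_clip(raw: bytes):
    """Keep only the bytes before the first NUL, then strip whitespace (the proposal)."""
    value = raw.split(b"\0")[0].strip()
    try:
        return float(value.decode("ascii", "replace"))
    except ValueError:
        return None


raw = b"05.11\0PA"  # 8-byte field: number, NUL terminator, two garbage bytes
print(parse_numeric_strip(raw))  # None
print(parse_numeric_clip(raw))   # 5.11
```

Note that on Python 3 the field content is bytes, so the separator passed to `split` would need to be bytes as well (`b"\0"`, or `b('\0')` with the library's helper).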

Would this be acceptable? I've tried it locally and it works as expected (for my use-case).

BTW I use version 1.2.11, but it seems to be the same in the latest release.

karimbahgat commented 6 years ago

This looks like a great solution, thanks 👍! Implemented in 4064f5ce8de13d93bd9fce3230e16c1f22111a68.