GeospatialPython / pyshp

This library reads and writes ESRI Shapefiles in pure Python.
MIT License

Reading multiple records fails #204

Closed by SpatialDigger 3 years ago

SpatialDigger commented 4 years ago

The following does not read the records.

records = sf.records()

returns

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Users\garyn\PycharmProjects\shp_viewer\venv\lib\site-packages\shapefile.py", line 1249, in records
    r = self.__record(oid=i)
  File "C:\Users\garyn\PycharmProjects\shp_viewer\venv\lib\site-packages\shapefile.py", line 1224, in __record
    value = u(value, self.encoding, self.encodingErrors)
  File "C:\Users\garyn\PycharmProjects\shp_viewer\venv\lib\site-packages\shapefile.py", line 104, in u
    return v.decode(encoding, encodingErrors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 0: invalid continuation byte

Reading a single record works: sf.record(3)

utf-8 issue?

karimbahgat commented 4 years ago

Yes, that probably means your file isn't encoded in UTF-8. Most likely it uses a Latin encoding instead, unless you're dealing with a non-English language. You can set the expected file encoding when creating the reader:

import shapefile
# path is the location of your shapefile
sf = shapefile.Reader(path, encoding='latin')
records = sf.records()

Alternatively, if that doesn't work, you can bypass encoding errors by ignoring them, e.g. by passing encodingErrors='ignore' to the reader. Note that 'ignore' silently drops any bytes that cannot be decoded.
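To see why the encoding matters, here is a minimal sketch using only Python's built-in bytes.decode, with no shapefile involved. The sample value b"\xc5land" and the helper try_decode are made up for illustration: they assume a text field that was written as Latin-1, which is consistent with the 0xc5 byte in the traceback but is not confirmed for the file in question.

```python
# Hypothetical field value: "Åland" stored as Latin-1 bytes (0xC5 is "Å").
# This is an assumed example, not data taken from the reporter's file.
raw = b"\xc5land"


def try_decode(data, encoding, errors="strict"):
    """Return the decoded text, or the error reason if decoding fails."""
    try:
        return data.decode(encoding, errors)
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


# Strict UTF-8 fails, as in the traceback above.
as_utf8 = try_decode(raw, "utf-8")

# Decoding with the Latin-1 codec ('latin' is an alias) recovers the text.
as_latin = try_decode(raw, "latin")

# 'ignore' drops the undecodable byte; 'replace' substitutes U+FFFD.
as_ignored = try_decode(raw, "utf-8", "ignore")
as_replaced = try_decode(raw, "utf-8", "replace")

for label, value in [
    ("utf-8 strict", as_utf8),
    ("latin", as_latin),
    ("utf-8 ignore", as_ignored),
    ("utf-8 replace", as_replaced),
]:
    print(label, "->", repr(value))
```

If the real file turns out to use a different codepage, the same check with another codec name should show whether the decoded text looks right before passing that name as encoding to the reader.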