zhikrullah / pyshp

Automatically exported from code.google.com/p/pyshp
MIT License

Shapefile with other encoding than UTF-8 fails #36

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Calling sf.shapeRecords() raises an error instead of loading the shapefile:
Traceback (most recent call last):
  File "/[path]/import_shapefile.py", line 115, in <module>
    loadShapefile()
  File "/[path]/import_shapefile.py", line 19, in loadShapefile
    records = sf.shapeRecords()
  File "/[path]/shapefile.py", line 430, in shapeRecords
    for rec in zip(self.shapes(), self.records())]
  File "/[path]/shapefile.py", line 413, in records
    r = self.__record()
  File "/[path]/shapefile.py", line 389, in __record
    value = u(value)
  File "/[path]/shapefile.py", line 53, in u
    return v.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 1: invalid continuation byte

Using:
pyshp version: 1.1.4
Ubuntu 12.04
Python 3.2.3 from Ubuntu 12.04 repos

For me it works using the following patch:
==============
@@ -50,7 +50,7 @@
     if PYTHON3:
         if isinstance(v, bytes):
             # For python 3 decode bytes to str.
-            return v.decode('utf-8')
+            return v.decode('iso-8859-1')
         elif isinstance(v, str):
             # Already str.
             return v
==============

A better solution, I think, would be to let the user specify a custom encoding 
through an argument when loading the shapefile.
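
A minimal sketch of that idea, assuming Python 3 only. The name `decode_field` and its `encoding` and `errors` parameters are hypothetical and not part of pyshp 1.1.4; the function just mirrors the `u()` helper from the patch with the encoding passed in rather than hard-coded. The sample bytes are made up to reproduce the same error as in the traceback.

```python
def decode_field(value, encoding="utf-8", errors="strict"):
    """Hypothetical variant of u() that takes the encoding as an argument."""
    if isinstance(value, bytes):
        # Decode the raw bytes with the caller's encoding instead of a fixed one.
        return value.decode(encoding, errors)
    if isinstance(value, str):
        # Already str.
        return value
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


raw = b"S\xc6by"  # made-up field value containing byte 0xc6

try:
    decode_field(raw)  # the UTF-8 default fails on byte 0xc6
except UnicodeDecodeError as exc:
    print(exc)

# With the encoding supplied by the caller, the same bytes decode cleanly.
print(decode_field(raw, encoding="iso-8859-1"))
```

The default stays at UTF-8 so existing callers would behave as before; only users with differently encoded files would need to pass the argument.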

Original issue reported on code.google.com by ove.ande...@gmail.com on 29 Jun 2012 at 8:04

GoogleCodeExporter commented 8 years ago

Original comment by jlawh...@geospatialpython.com on 2 May 2013 at 5:01

GoogleCodeExporter commented 8 years ago
The Natural Earth shapefiles from http://www.naturalearthdata.com/downloads/ 
have all their text stored in the CP-1252 character set (aka Win-1252).
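
This matters for the ISO-8859-1 patch above: the two encodings agree on most bytes but differ in the range 0x80 to 0x9F, where CP-1252 has printable characters such as curly quotes and ISO-8859-1 has control characters. A small illustration, using a made-up byte string (not taken from the Natural Earth data):

```python
raw = b"C\xf4te d\x92Ivoire"  # made-up sample; 0x92 is a curly apostrophe in CP-1252

as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("iso-8859-1")

# Byte 0xf4 decodes to the same character in both encodings,
# but byte 0x92 becomes U+2019 in CP-1252 and the control character U+0092 in ISO-8859-1.
print(repr(as_cp1252))
print(repr(as_latin1))
print(as_cp1252 == as_latin1)
```

So hard-coding ISO-8859-1 would avoid the exception for such files but could silently produce wrong characters, which is another argument for letting the user pick the encoding.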

Original comment by teppe...@gmail.com on 15 Jan 2014 at 9:48