matplotlib / basemap

Plot on map projections (with coastlines and political boundaries) using matplotlib
MIT License

“UnicodeDecodeError” when using the function “readshapefile” #449

Closed JingwangLi closed 5 years ago

JingwangLi commented 5 years ago

When I read a shapefile with "readshapefile", I get a "UnicodeDecodeError":

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte

I realized that the shapefile is not encoded as UTF-8, so I read the Basemap documentation at https://matplotlib.org/basemap/api/basemap_api.html, which gives this signature:

readshapefile(shapefile, name, drawbounds=True, zorder=None, linewidth=0.5, color='k', antialiased=1, ax=None, default_encoding='utf-8')

I found the parameter default_encoding='utf-8' and changed it to default_encoding='ANSI'. The error did not change. It is still:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte

So this is the problem: I already changed the parameter. If the shapefile were not actually "ANSI" encoded, the error should name the 'mbcs' codec ("UnicodeDecodeError: 'mbcs' codec can't decode bytes in position 0-1"), not 'utf-8'. Is the parameter fixed at default_encoding='utf-8'? Is there no way to change it?
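For readers hitting the same message, here is a minimal standard-library sketch of what the error means and how one might probe for a workable codec. The sample bytes, the candidate list, and the helper name first_working_encoding are all made up for illustration; nothing here touches Basemap or the actual shapefile.

```python
def first_working_encoding(raw, candidates):
    """Return the first codec name in `candidates` that decodes `raw`, else None."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# Hypothetical attribute bytes that start with 0xc8, the byte named in the error.
sample = b"\xc8\xcb"

try:
    sample.decode("utf-8")
except UnicodeDecodeError as exc:
    # 0xc8 announces a two-byte UTF-8 sequence, but 0xcb is not a continuation byte.
    print(exc.encoding, exc.start, exc.reason)

# Candidate codecs are guesses; substitute the ones plausible for your data.
print(first_working_encoding(sample, ["utf-8", "gbk", "latin-1"]))
```

Note that latin-1 maps every byte to some character, so it never raises. A successful decode only shows that a codec is possible, not that it is the right one, so the decoded text still needs a visual check.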

tomerburg commented 5 years ago

I have a similar error to this, only when I attempt to plot counties, which can also be traced back to shapefile.py. This started when I reinstalled Anaconda last night (using python 3.6, matplotlib 3.0.2 and basemap 1.2.0). I tried changing to basemap 1.1.0 and it still didn't work.

  File "plot_all.py", line 164, in <module>
    m.drawcounties(linewidth=0.1,color='#222222',zorder=10)
  File "C:\Users\Tomer\Anaconda3\lib\site-packages\mpl_toolkits\basemap\__init__.py", line 1981, in drawcounties
    default_encoding='latin-1',drawbounds=drawbounds)
  File "C:\Users\Tomer\Anaconda3\lib\site-packages\mpl_toolkits\basemap\__init__.py", line 2148, in readshapefile
    for shprec in shf.shapeRecords():
  File "C:\Users\Tomer\Anaconda3\lib\site-packages\shapefile.py", line 999, in shapeRecords
    for rec in zip(self.shapes(), self.records())]
  File "C:\Users\Tomer\Anaconda3\lib\site-packages\shapefile.py", line 971, in records
    r = self.__record(oid=i)
  File "C:\Users\Tomer\Anaconda3\lib\site-packages\shapefile.py", line 946, in __record
    value = u(value, self.encoding, self.encodingErrors)
  File "C:\Users\Tomer\Anaconda3\lib\site-packages\shapefile.py", line 104, in u
    return v.decode(encoding, encodingErrors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 2: invalid continuation byte

JingwangLi commented 5 years ago

Yes, the "default_encoding" parameter doesn't work, so it can only read shapefiles that are encoded as UTF-8. I have already emailed the author. Maybe he is fixing it.

WeatherGod commented 5 years ago

pyshp apparently changed how they accept the encoding parameter, so it is now broken. I hope to get a fix out later today.
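To illustrate how a change like this can leave a keyword silently ineffective, here is a toy sketch. ToyReader and both wrapper functions are hypothetical stand-ins, not the real pyshp or basemap code, and the sample bytes are invented to mirror the earlier traceback (byte 0xf1 at position 2). It only shows the general pattern: if a wrapper accepts an encoding but does not hand it to the reader in the form the reader expects, the reader falls back to its own UTF-8 default, and the error keeps naming 'utf-8'.

```python
class ToyReader:
    """Hypothetical stand-in for a shapefile reader (not pyshp's real class)."""

    def __init__(self, raw, encoding="utf-8"):
        self.raw = raw
        self.encoding = encoding

    def text(self):
        return self.raw.decode(self.encoding)


def read_ignoring_encoding(raw, default_encoding="utf-8"):
    # Bug pattern: the keyword is accepted but never reaches the reader.
    return ToyReader(raw).text()


def read_forwarding_encoding(raw, default_encoding="utf-8"):
    # Fixed pattern: the keyword is forwarded explicitly.
    return ToyReader(raw, encoding=default_encoding).text()


raw = "Doña".encode("latin-1")  # invented sample: b'Do\xf1a'

reported_codec = None
error_position = None
try:
    read_ignoring_encoding(raw, default_encoding="latin-1")
except UnicodeDecodeError as exc:
    reported_codec = exc.encoding  # the codec that was actually used
    error_position = exc.start     # index of the offending byte

print(reported_codec, error_position)
print(read_forwarding_encoding(raw, default_encoding="latin-1"))
```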

JingwangLi commented 5 years ago

Thanks a lot! I can't wait to use Basemap.

WeatherGod commented 5 years ago

Writing up the fix now. Can someone send me a shapefile that breaks so I can add a unit test?

n0skill commented 5 years ago

Hello, I seem to have the same problem with pyshp.

Python 3.7.2 (default, Jan 10 2019, 23:51:51) 
[GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import shapefile as shp
>>> shapefile = shp.Reader("Restrictions_for_Drones.shp")
>>> for shape in shapefile.shapeRecords():
...     print(shape)
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/noskill/projects/airaid/software/dev/lib/python3.7/site-packages/shapefile.py", line 1039, in shapeRecords
    for rec in zip(self.shapes(), self.records())])
  File "/home/noskill/projects/airaid/software/dev/lib/python3.7/site-packages/shapefile.py", line 1012, in records
    r = self.__record(oid=i)
  File "/home/noskill/projects/airaid/software/dev/lib/python3.7/site-packages/shapefile.py", line 987, in __record
    value = u(value, self.encoding, self.encodingErrors)
  File "/home/noskill/projects/airaid/software/dev/lib/python3.7/site-packages/shapefile.py", line 104, in u
    return v.decode(encoding, encodingErrors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 28: invalid start byte

Here's the zip from which the shapefile comes: data.zip
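A possible workaround when calling pyshp directly, sketched under assumptions: pyshp 2.x's Reader takes encoding and encodingErrors keyword arguments (the traceback above shows the reader consulting self.encoding and self.encodingErrors). The helper name read_records and the 'latin-1' default are illustrative guesses only; the thread does not say which encoding Restrictions_for_Drones.shp actually uses.

```python
def read_records(path, encoding="latin-1", errors="strict"):
    """Return the attribute records of a shapefile, decoding text with `encoding`.

    Assumes pyshp 2.x is installed. 'latin-1' is only a guess at the
    file's encoding, so check the decoded text before relying on it.
    """
    import shapefile  # pyshp, imported lazily so this module loads without it

    reader = shapefile.Reader(path, encoding=encoding, encodingErrors=errors)
    return [list(record) for record in reader.records()]


if __name__ == "__main__":
    import sys

    # Usage: python read_records.py some_file.shp [encoding]
    if len(sys.argv) > 1:
        codec = sys.argv[2] if len(sys.argv) > 2 else "latin-1"
        for row in read_records(sys.argv[1], encoding=codec):
            print(row)
```

If the encoding really is unknown, passing errors="replace" keeps the read from failing, at the cost of substituting a placeholder for every byte that cannot be decoded.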

xigrug commented 5 years ago

I also have this problem. How can I fix it?

johnrobertlawson commented 5 years ago

Hey @WeatherGod - did you implement the fix? Much appreciated.

WeatherGod commented 5 years ago

Give https://github.com/matplotlib/basemap/pull/459 a try. I haven't had time to make a unit test to verify.

johnrobertlawson commented 5 years ago

Thanks, I'll take a look.