GeospatialPython / pyshp

This library reads and writes ESRI Shapefiles in pure Python.
MIT License

struct.error: unpack requires a bytes object of length 16 #174

Closed cmbasnett closed 5 years ago

cmbasnett commented 5 years ago

Using the following file: SHAPE.zip

Running this code:

import shapefile
sf = shapefile.Reader('/SHAPE')
print(sf)
sf.shapes()

print(sf) prints the following:

shapefile Reader
    1 shapes (type 'POLYGONZ')
    1 records (2 fields)

However, calling sf.shapes() yields the following error:

Traceback (most recent call last):
  File "[..]/scratch_16.py", line 76, in <module>
    sf.shapes()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/shapefile.py", line 827, in shapes
    shapes.append(self.__shape())
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/shapefile.py", line 751, in __shape
    (mmin, mmax) = unpack("<2d", f.read(16))
struct.error: unpack requires a bytes object of length 16

This shape file was exported from Agisoft PhotoScan.
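
The failure mode in the traceback can be reproduced without any shapefile. The sketch below assumes only the standard library: a read at end of file returns fewer bytes than requested instead of raising, and `struct.unpack` then rejects the short buffer. The in-memory stream stands in for the `.shp` file handle and is purely illustrative.

```python
import io
import struct

# Stand-in for a .shp file handle that is already at end of file.
f = io.BytesIO(b"")

# read() does not raise at EOF; it just returns fewer bytes than requested.
data = f.read(16)

try:
    # "<2d" needs exactly 16 bytes (two little-endian doubles).
    struct.unpack("<2d", data)
    failed = False
except struct.error:
    failed = True

print(len(data), failed)
```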

micahcochran commented 5 years ago

I tried opening this with QGIS and it seems like there is not any data in the shapefile.

I tried to zoom in, and nothing.

From the QGIS info about the layer:

Name | SHAPE
-- | --
Path | C:\projects\datarequests\2018\2018-12-19\SHAPE.shp
Storage | ESRI Shapefile
Comment |  
Encoding | System
Geometry | Polygon (Polygon25D)
CRS | EPSG:4326 - WGS 84 - Geographic
Extent | Empty
Unit | degrees
Feature count | 1

doingalright commented 5 years ago

I have the very same issue with various shapefiles. Some of them are certainly not empty. However, the error seems to occur after reading the last item, so an empty file might produce the same result. It happened to me with PolyLineZ and PolygonZ files.

polylineZ.zip

doingalright commented 5 years ago

I did some investigating. In the case of my file (1 line with 2 points), it becomes clear that shapefile.py is reading beyond the length of the record content. The record size (including header) is (4 + 56) * 2 = 120 bytes, since both lengths are counted in 16-bit words. The M part starts reading at position 120. I assume this only fails on the last record of the shapefile, but produces unexpected results on all of them. According to the shapefile whitepaper, p. 18 (https://www.esri.com/library/whitepapers/pdfs/shapefile.pdf), M values are optional for PolyLineZ, so the data producer (QGIS in my case) is not at fault. If the content length given by the header were considered when reading each record, I think it could be solved.
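
The arithmetic above can be checked with a small sketch. It assumes the PolyLineZ field layout from the ESRI whitepaper and the record header format (two big-endian 32-bit integers: record number and content length in 16-bit words). The helper name `polyline_z_content_bytes` is made up for illustration and is not part of pyshp.

```python
import struct

def polyline_z_content_bytes(num_parts, num_points, with_m):
    """Byte length of a PolyLineZ record's content (hypothetical helper)."""
    size = 4                        # shape type
    size += 32                      # bounding box: 4 doubles
    size += 4 + 4                   # NumParts, NumPoints
    size += 4 * num_parts           # part start indices
    size += 16 * num_points         # X, Y pairs
    size += 16 + 8 * num_points     # Z range + Z array
    if with_m:
        size += 16 + 8 * num_points # optional M range + M array
    return size

# One line with two points and no M block, as in the comment above.
content = polyline_z_content_bytes(1, 2, with_m=False)

# The record header stores the content length in 16-bit words, big-endian.
header = struct.pack(">2i", 1, content // 2)
rec_num, content_words = struct.unpack(">2i", header)

# Header (4 words) plus content, converted back to bytes.
record_bytes = (4 + content_words) * 2
print(content, content_words, record_bytes)
```

With the M block included, the same record's content would be 32 bytes longer, which is the region the reader was trying to parse past the end of the record.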

domlysz commented 5 years ago

Same issue here with polylinez and polygonz files created with QGIS

domlysz commented 5 years ago

This reminds me of a previous issue: https://github.com/GeospatialPython/pyshp/issues/55

doingalright commented 5 years ago

For what it's worth, I could comment that section out and get it to work with my files. That obviously only works if you are not working with m-values.

karimbahgat commented 5 years ago

Thanks for pointing this out. In my previous reading of the spec, it seemed that m-values were only optional in the sense that one could specify a nodata value (#55), but I agree now that the values might also be absent entirely. And either way it would be good to respect the record length.

There was one place where we did check the record length before reading, but this did not apply to the mmin and mmax values, which is where it failed. This is now fixed so that before reading any m-range or m-values we first check that there is enough data left in the record. I tested that it now works on the shapefiles supplied here. The fix should be included in the next release.
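
The kind of check described here might look roughly like the sketch below. This is not the actual pyshp code: the function name `read_optional_m_range` and the `record_end` parameter are assumptions made for illustration, and the in-memory streams stand in for a real `.shp` file positioned just after the Z values.

```python
import io
import struct

def read_optional_m_range(f, record_end):
    """Return (mmin, mmax) if 16 bytes remain in the record, else None.

    Hypothetical helper: record_end is the absolute offset at which the
    current record stops, derived from the content length in its header.
    """
    if record_end - f.tell() >= 16:
        return struct.unpack("<2d", f.read(16))
    return None

# Record without an M block: the read position is already at the record end.
no_m = io.BytesIO(struct.pack("<2d", 0.0, 5.0))
no_m.seek(16)
m_absent = read_optional_m_range(no_m, record_end=16)

# Record with an M block: 16 more bytes follow inside the record.
with_m = io.BytesIO(struct.pack("<4d", 0.0, 5.0, 1.0, 2.0))
with_m.seek(16)
m_present = read_optional_m_range(with_m, record_end=32)

print(m_absent, m_present)
```

Bounding the read by the record's own end, rather than by whatever bytes happen to follow in the file, also avoids misreading the next record's header as M data when the failing record is not the last one.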

karimbahgat commented 5 years ago

Included now in version 2.1.0