GeospatialPython / pyshp

This library reads and writes ESRI Shapefiles in pure Python.
MIT License
1.1k stars 259 forks source link

Possible bug during conversion from shp to json #175

Closed karanchawla closed 4 years ago

karanchawla commented 5 years ago

I'm trying to convert a shape file and using the following example code:

reader = shapefile.Reader("NaturalGas_Pipelines_US_201804.shp")
fields = reader.fields[1:]
field_names = [field[0] for field in fields]
buffer = []
for sr in reader.shapeRecords():
    atr = dict(zip(field_names, sr.record))
    geom = sr.shape.__geo_interface__
    buffer.append(dict(type="Feature", \
    geometry=geom, properties=atr)) 

# write the GeoJSON file
geojson = open("natural_gas.json", "w")
print(geojson)
geojson.write(dumps({"type": "FeatureCollection", "features": buffer}, indent=2) + "\n")
geojson.close()

However, this throws an error as follows:

 217                         coordinates.append(tuple([tuple(p) for p in self.points[ps:part]]))
    218                         ps = part
--> 219                 else:
    220                     coordinates.append(tuple([tuple(p) for p in self.points[part:]]))
    221                 return {
karanchawla commented 5 years ago

I resolved this using the following change to shapefile.py:

for part in self.parts:
    if part is not None:   
        if ps == None:
            ps = part
            continue
          else:
              coordinates.append(tuple([tuple(p) for p in self.points[ps:part]]))
              ps = part
karimbahgat commented 5 years ago

Hi @karanchawla , could you provide the full error traceback message, plus the shapefile? Which PyShp version would also be useful. Otherwise no way to know what went wrong here. You only pointed to a part of the code and an arrow towards the "else" statement.

zhik commented 5 years ago

@karimbahgat this is the traceback message.

Traceback (most recent call last):
  File "...\scripts\main.py", line 132, in <module>
    main()
  File "...\scripts\main.py", line 41, in main
    geometry = feature.shape.__geo_interface__
  File "...\scripts\pyshp.py", line 236, in __geo_interface__
    coordinates.append(tuple([tuple(p) for p in self.points[part:]]))
UnboundLocalError: local variable 'part' referenced before assignment

The error happens when a MultiLineString has no parts and it runs the else statement (Line 225). When it tries to execute line 236, but doesn't have a reference to the part variable

As @karanchawla suggested removing line 235 and 236 works as a temporary fix.

else:
    coordinates.append(tuple([tuple(p) for p in self.points[part:]]))

Or adding a case to skip when self.parts == 0

else:
    if len(self.parts) != 0:    
        coordinates.append(tuple([tuple(p) for p in self.points[part:]]))
erwinfeser commented 4 years ago

+1 I am getting the same error

karimbahgat commented 4 years ago

Thanks for raising this. I think the assumption in the code was that shapes with no coordinates should really be specified as NULL-shapes. However, it's clear from your cases that "empty" geometries do often occur. While the geojson spec does not define a proper null-geometry type, it does state that empty geojson coordinates may be interpreted as a null-geometry.

Fixing this by specifically returning geojson with empty coordinates if parts is empty, and handling this for the other shape types as well (polygons, points, multipoints).

karimbahgat commented 4 years ago

I would also note that it's not necessary to manually constructing the geojson for each feature, and then the geojson for the featurecollection, the ShapeRecord and Reader classes supports the __geo_interface__ directly. So instead, the same code could be written much shorter like:

reader = shapefile.Reader("NaturalGas_Pipelines_US_201804.shp")
geojson = open("natural_gas.json", "w")
geojson.write(dumps(reader.__geo_interface__, indent=2) + "\n")
geojson.close()

This will fail in the current version due to the wrong handling of empty coordinates, but should work in the next version once the fix is incorporated.