SEL-Columbia / sequencer

Python library for sequencing Network Planner CSV and shapefile outputs

nx.write_shp outputs numerics as strings #31

Closed blogle closed 10 years ago

blogle commented 10 years ago

The method doesn't take kwargs so I will likely have to dig into gdal/osgeo for a solution.

Issue spotted by @carbz

blogle commented 10 years ago

Looked at the networkx.write_shp source and found the culprit:

    # Conversion dict between python and ogr types
    OGRTypes = {int: ogr.OFTInteger, str: ogr.OFTString, float: ogr.OFTReal}

    # Edge loop
    for e in G.edges(data=True):
        data = G.get_edge_data(*e)
        g = netgeometry(e, data)
        # Loop through attribute data in edges
        for key, data in e[2].iteritems():
            # Reject spatial data not required for attribute table
            if (key != 'Json' and key != 'Wkt' and key != 'Wkb'
                    and key != 'ShpName'):
                # For all edges check/add field and data type to fields dict
                if key not in fields:
                    # Field not in previous edges so add to dict
                    if type(data) in OGRTypes:
                        fields[key] = OGRTypes[type(data)]
                    else:
                        # Data type not supported, default to string (char 80)
                        fields[key] = ogr.OFTString

Ultimately, the method expects each attribute value to be exactly an int, str, or float. All of the attributes that I am storing in the shapefile are looked up from a pandas DataFrame, which stores them as numpy types. The lookup is keyed on type(data), and numpy.int32 is not int, so nx.write_shp falls through to the default branch and declares those fields as strings. The fix should simply be to cast these values to native Python types before calling the method. Patching now; I will make a PR in the next few minutes.
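A rough sketch of what that cast could look like, not necessarily what the eventual PR does. The helper names `to_native` and `cast_edge_attrs_to_native` are made up for illustration. It relies on numpy scalars exposing `.item()`, which returns the equivalent built-in Python value, and it assumes `G` is a graph whose `edges(data=True)` yields `(u, v, attr_dict)` tuples, as networkx graphs do.

```python
def to_native(value):
    """Return a built-in Python scalar for numpy-style scalars, else the value unchanged."""
    # numpy scalars (numpy.int32, numpy.float64, ...) provide .item(), which
    # returns the matching built-in Python object (int, float, ...).
    # Built-in int, float and str have no .item attribute and pass through.
    item = getattr(value, "item", None)
    return item() if callable(item) else value


def cast_edge_attrs_to_native(G):
    """Hypothetical helper: convert every edge attribute value in place.

    Assumes each attribute is a scalar; array-valued attributes are not handled.
    """
    for _, _, attrs in G.edges(data=True):
        # Iterate over a copy of the keys because values are reassigned in place.
        for key in list(attrs):
            attrs[key] = to_native(attrs[key])
    return G
```

Calling `cast_edge_attrs_to_native(G)` just before passing `G` to `nx.write_shp` should mean that `type(data)` is found in `OGRTypes`, so numeric fields get `ogr.OFTInteger` or `ogr.OFTReal` instead of the string default.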