mapbox / csv2geojson

magically convert csv files to geojson files
http://mapbox.github.io/csv2geojson/
MIT License

Numeric fields being enclosed in double quotes #31

Closed nkinkade closed 8 years ago

nkinkade commented 9 years ago

I have some CSV input that looks like this:

latitude,longitude,downloadThroughput
47.48910140991211,-122.29080200195312,16.585082892477303
47.48910140991211,-122.29080200195312,11.732173309099407
47.606201171875,-122.33209991455078,13.501345759285433

The resulting GeoJSON output adds downloadThroughput as a property, but enclosed in double quotes. This in itself isn't necessarily a problem, except that it seems that Turf.js refuses to do math operations on what it thinks is a string. It seems that csv2geojson should not make strings out of purely numeric data. When I convert my CSV file using, for example, the following tool, downloadThroughput is not wrapped in quotes and turf.average() works as expected.

http://www.convertcsv.com/csv-to-geojson.htm

Am I perhaps doing something wrong?

tmcw commented 9 years ago

CSV does not have string or number types - see, for instance, that the downloadThroughput column isn't quoted, but is a string. csv2geojson shouldn't try to guess types, because there'll always be people who want numbery strings as strings and stringy numbers as numbers. Probably the solution is, like the converter you linked, to let people specify types explicitly.

nkinkade commented 9 years ago

I would imagine that csv2geojson could make some reasonable default assumptions about intent based on the input data. A default rule could be as simple as: everything is a string unless it is an unquoted value containing nothing but numeric characters. That doesn't seem any more presumptuous, at least to me, than treating all input as a string no matter what. Having defaults that make some decisions about type based on standard convention seems good to me. But also, as you say, being able to explicitly define types for when the defaults aren't good enough would be very useful. On the other hand, at least for now, I can just iterate through the object, manually casting downloadThroughput to a number.
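The manual cast mentioned above could look roughly like the following sketch. The helper name castNumericProperties is hypothetical and not part of csv2geojson, and the sample object only assumes that the output is a GeoJSON FeatureCollection whose property values are strings.

```javascript
// Hypothetical helper (not part of csv2geojson): cast the named properties
// of every feature from string to number, in place.
function castNumericProperties(featureCollection, fields) {
  for (const feature of featureCollection.features) {
    for (const field of fields) {
      const raw = feature.properties[field];
      // Skip missing or blank values so they do not silently become 0.
      if (typeof raw !== 'string' || raw.trim() === '') continue;
      const value = Number(raw);
      // Leave the original string untouched if it is not a valid number.
      if (!Number.isNaN(value)) feature.properties[field] = value;
    }
  }
  return featureCollection;
}

// Assumed sample, shaped like a FeatureCollection built from the first CSV row.
const sample = {
  type: 'FeatureCollection',
  features: [{
    type: 'Feature',
    geometry: { type: 'Point', coordinates: [-122.29080200195312, 47.48910140991211] },
    properties: { downloadThroughput: '16.585082892477303' }
  }]
};

castNumericProperties(sample, ['downloadThroughput']);
```

Because the caller names the fields to convert, nothing is guessed: columns that are not listed stay strings.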

tmcw commented 9 years ago

I've been through this before, and am absolutely certain that inferring types is a trap. I'll add explicit conversions.
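As a small illustration of why inference can go wrong, here is a sketch of a naive rule that converts anything that parses as a number. The function inferValue and the sample values (a ZIP-code-like string and an identifier-like string) are made up for this example and do not come from csv2geojson.

```javascript
// Naive inference rule, for illustration only: any non-blank string that
// parses as a number is converted; everything else stays a string.
function inferValue(raw) {
  const value = Number(raw);
  return raw.trim() !== '' && !Number.isNaN(value) ? value : raw;
}

// Hypothetical column values:
const zip = inferValue('02134');   // becomes the number 2134, leading zero lost
const id = inferValue('1e5');      // read as scientific notation: 100000
const throughput = inferValue('16.585082892477303'); // the case where a number is wanted
```

The same rule that helps the throughput column quietly changes the other two, which is the kind of surprise that explicit, per-column conversions avoid.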

tmcw commented 8 years ago

Closing; parsing values as ints or floats should be done downstream in applications that use csv2geojson, not in the library itself.

andrewharvey commented 7 years ago

One implication of this is that the Mapbox Studio Datasets CSV import defaults every field to a string, which creates issues in Studio when trying to filter between numbers, etc.