planetlabs / gpq

Utility for working with GeoParquet
https://planetlabs.github.io/gpq/
Apache License 2.0

Not able to convert a geojson file #142

Closed aborruso closed 5 months ago

aborruso commented 5 months ago

Hi, when I run gpq convert --from="geojson" tmp.geojson tmp.parquet I get this error:

gpq: error: failed to generate converter from first 100 features

It's a 100 MB GeoJSON file that I created with ogr2ogr from an input shp file.

What can I do to solve the problem?

Thank you

tschaub commented 5 months ago

Hi @aborruso - thanks for the report. This error occurs when gpq cannot determine a Parquet schema for the features after reading the first 100. For example, if a property is null in all of the first 100 features, no suitable type can be determined for that property. One thing you can try is reading more features before giving up (e.g. gpq convert --max 1000 tmp.geojson tmp.parquet).

When ogr is doing a similar conversion, I think that it assumes the type for the property should be a string. Then when writing any values, it stringifies anything it encounters as a JSON string.
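To make that fallback concrete, here is a rough Python sketch of the "treat it as a string and stringify whatever shows up" behavior described above. It is only an illustration of the idea, not ogr's or gpq's actual code; the function names stringify_value and stringify_properties are made up for this example.

```python
import json


def stringify_value(value):
    """Coerce a property value to a string; nulls and existing strings pass through."""
    if value is None or isinstance(value, str):
        return value
    # Numbers, booleans, lists and objects are serialized as JSON text.
    return json.dumps(value)


def stringify_properties(features, names):
    """Stringify the named properties in place on each feature (hypothetical helper)."""
    for feature in features:
        props = feature.get("properties") or {}
        for name in names:
            if name in props:
                props[name] = stringify_value(props[name])
    return features


features = [
    {"type": "Feature", "geometry": None, "properties": {"code": None}},
    {"type": "Feature", "geometry": None, "properties": {"code": 12}},
    {"type": "Feature", "geometry": None, "properties": {"code": {"a": 1}}},
]
stringify_properties(features, ["code"])
print([f["properties"]["code"] for f in features])
```

With this approach every non-null value in the column ends up as a string, so the column type is known even if the early features carry only nulls.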

It would be useful if gpq provided more detailed output in situations like this (listing all of the properties/fields for which it cannot determine a type, for example). Until then, can you examine your data to see if there might be a lot of features at the start of the collection with null values for one or more of the properties?
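One way to do that check is a small script like the one below. It is a minimal sketch that assumes the file is a GeoJSON FeatureCollection small enough to load into memory; it does not reproduce gpq's schema logic, and the names null_only_properties and check_file are hypothetical helpers for this example.

```python
import json


def null_only_properties(features, limit=100):
    """Return the property names that are null or missing in every one of the first `limit` features."""
    sample = features[:limit]
    names = set()
    for feature in sample:
        names.update((feature.get("properties") or {}).keys())
    return sorted(
        name
        for name in names
        if all((f.get("properties") or {}).get(name) is None for f in sample)
    )


def check_file(path, limit=100):
    """Load a GeoJSON FeatureCollection from `path` and report its null-only properties."""
    with open(path, encoding="utf-8") as fh:
        collection = json.load(fh)
    return null_only_properties(collection.get("features", []), limit)


# Small inline demonstration with two features.
demo = [
    {"type": "Feature", "geometry": None, "properties": {"name": "a", "height": None}},
    {"type": "Feature", "geometry": None, "properties": {"name": None, "height": None}},
]
print(null_only_properties(demo))  # ['height']
```

Calling check_file("tmp.geojson") would list the properties with no non-null value in the first 100 features. Raising limit shows how many features need to be read before each of those properties gets a value.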

aborruso commented 5 months ago

Thank you very much