Closed rayafratkina closed 4 years ago
Pinging @elastic/kibana-gis (Team:Geo)
This could be more generic. CSVs with geometries encoded as WKT are not that common compared with CSVs that have coordinates in separate columns (as this CSV also has).
As in many tools that support geospatial fields and import CSVs, we could have a way to make explicit how geometries are encoded.
To show an example, this is the QGIS CSV import tool. It automatically found the WKT field and provided all the settings.
Here is the form to assign the x/long and y/lat columns.
The import tool supports way more settings and possibilities but I wanted to show these settings in particular.
For what it's worth, both the CARTO BUILDER and ArcGIS Online cloud tools automatically detect those fields, and the import is done without any intervention from the user.
This has been resolved in https://github.com/elastic/elasticsearch/issues/56967
Fileuploader service will now detect all WKT geometries. If every field is a POINT geometry, it will be mapped as a geo_point. All other WKT will be mapped as a geo_shape.
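To make the rule above concrete, here is a minimal Python sketch of that decision. It is only an illustration, not the actual Elasticsearch implementation: the function name `guess_geo_mapping` and both regular expressions are assumptions, and the patterns are deliberately loose rather than a full WKT parser.

```python
import re

# Loose patterns for illustration only; real WKT parsing is more involved
# (no support here for Z/M coordinates, EMPTY geometries or exponents).
_WKT_POINT = re.compile(
    r"^\s*POINT\s*\(\s*-?\d+(\.\d+)?\s+-?\d+(\.\d+)?\s*\)\s*$",
    re.IGNORECASE,
)
_WKT_ANY = re.compile(
    r"^\s*(POINT|LINESTRING|POLYGON|MULTIPOINT|MULTILINESTRING|"
    r"MULTIPOLYGON|GEOMETRYCOLLECTION)\s*\(.*\)\s*$",
    re.IGNORECASE | re.DOTALL,
)


def guess_geo_mapping(values):
    """Return 'geo_point', 'geo_shape', or None for a column of strings."""
    # Ignore empty cells so a sparse column can still be classified.
    values = [v for v in values if v and v.strip()]
    if not values:
        return None
    # Every value is a WKT POINT: map the column as geo_point.
    if all(_WKT_POINT.match(v) for v in values):
        return "geo_point"
    # Every value is some WKT geometry: map the column as geo_shape.
    if all(_WKT_ANY.match(v) for v in values):
        return "geo_shape"
    return None
```

For example, `guess_geo_mapping(["POINT (1 2)", "LINESTRING (0 0, 1 1)"])` falls through the first check and is classified as `geo_shape`.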
@benwtrent this is great!!
While WKT is not a rare format (despite the name), having data in separate columns for latitude and longitude is a far more frequent use case. I have no idea whether it would be possible to support, maybe by adding some convention. For example, this web importer assumes a few column names to try to find them:
https://github.com/CartoDB/cartodb/blob/master/services/importer/lib/importer/ogr2ogr.rb#L18-L21
cc @nickpeihl @choobinejad
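A column-name convention like the one suggested above could be sketched roughly as follows. The candidate names in `LAT_NAMES` and `LON_NAMES` are hypothetical and are not copied from the linked importer; `find_lat_lon_columns` is likewise just an illustrative name.

```python
# Hypothetical candidate names, checked in priority order.
# They are NOT taken from any particular importer.
LAT_NAMES = ("lat", "latitude", "y")
LON_NAMES = ("lon", "lng", "long", "longitude", "x")


def find_lat_lon_columns(header):
    """Return (lat_col, lon_col) from a CSV header, or None if either is missing."""
    # Match case-insensitively but return the original column names.
    normalized = {name.strip().lower(): name for name in header}
    lat = next((normalized[n] for n in LAT_NAMES if n in normalized), None)
    lon = next((normalized[n] for n in LON_NAMES if n in normalized), None)
    if lat is None or lon is None:
        return None
    return lat, lon
```

A name match alone is a weak signal (a column called `x` may be anything), so in practice it would probably be combined with a check on the values and a way for the user to override the guess.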
Agree - I see numeric latitude/longitude columns quite frequently in CSV files. A best-effort attempt to identify them in files and map them to a geo_point would remove a roadblock for users, and that's always a good thing!
The difficulty there is that each column is currently treated separately when determining the appropriate mapping. It might be enough to supply an easy override in the UI (here are my lat/long columns, make them a point) and then we can generate the pipeline + mapping from there.
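As a rough idea of what "generate the pipeline + mapping" could produce once the user has picked the two columns, here is a Python sketch that builds both as plain dictionaries. The function name `build_geo_point_config` and the default target field `location` are assumptions; the sketch relies on the ingest `set` processor with a mustache template and on geo_point accepting a `"lat,lon"` string.

```python
import json


def build_geo_point_config(lat_col, lon_col, target="location"):
    """Build an ingest pipeline body and an index mapping for a combined geo_point.

    `target` is a hypothetical field name chosen for this sketch.
    """
    pipeline = {
        "description": f"Combine {lat_col} and {lon_col} into {target}",
        "processors": [
            # geo_point accepts a "lat,lon" string, so the two source
            # columns are joined with a template in that order.
            {
                "set": {
                    "field": target,
                    "value": "{{" + lat_col + "}},{{" + lon_col + "}}",
                }
            }
        ],
    }
    mapping = {"properties": {target: {"type": "geo_point"}}}
    return pipeline, mapping


if __name__ == "__main__":
    pipeline, mapping = build_geo_point_config("lat", "lon")
    print(json.dumps(pipeline, indent=2))
    print(json.dumps(mapping, indent=2))
```

Column names containing spaces or dots would need extra care in the template, which this sketch does not attempt.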
@benwtrent I think your approach achieves the goal (improve geo_point file upload experience) while working with the tools we have now. ++.
I also think that the easy manual override you suggest could have expanded utility in the future (e.g. if there are 2 columns that have consistently valid decimal degree or degrees-minutes-seconds data, then guess that they together represent a geo_point... but give users the opportunity to correct that assumption if it's wrong using the manual override).
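The "consistently valid decimal degree" guess mentioned above could look something like this minimal Python check. The name `looks_like_lat_lon` is hypothetical, and the sketch only covers decimal degrees; degrees-minutes-seconds values would need their own parser.

```python
def looks_like_lat_lon(lat_values, lon_values):
    """Return True if every pair parses as decimal degrees within valid ranges."""
    try:
        lats = [float(v) for v in lat_values]
        lons = [float(v) for v in lon_values]
    except (TypeError, ValueError):
        # Any non-numeric cell means this is not a clean lat/lon pair.
        return False
    if not lats or len(lats) != len(lons):
        return False
    # Latitude is limited to [-90, 90], longitude to [-180, 180].
    return all(-90 <= a <= 90 for a in lats) and all(-180 <= o <= 180 for o in lons)
```

Because two unrelated numeric columns can also pass this test, the result is best treated as a suggestion that the user can correct with the manual override.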
Describe the feature: I was playing with this dataset that has a lat_long field containing geo point data. When I load it via csv upload tool, the field type is automatically detected as string. If I change it to geo_point everything works correctly, but it would be nice to automatically detect right away.
Describe a specific use case for the feature: Autodetect data of the format POINT (-73.9570437717691 40.794850940803904) as geo_point and possibly provide a way to set custom format (similar to the one for date).
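One detail worth keeping in mind for this format: WKT writes a point as x y, i.e. longitude first, so the value above has longitude -73.957... and latitude 40.794.... A small Python sketch of the conversion to a lat/lon object follows; the helper name `wkt_point_to_geo_point` and the regular expression are illustrative assumptions, not part of any existing tool.

```python
import re

# Two whitespace-separated coordinates inside POINT (...); illustration only.
_POINT = re.compile(
    r"^\s*POINT\s*\(\s*([^\s()]+)\s+([^\s()]+)\s*\)\s*$",
    re.IGNORECASE,
)


def wkt_point_to_geo_point(text):
    """Convert 'POINT (lon lat)' into a {'lat': ..., 'lon': ...} dict."""
    match = _POINT.match(text)
    if match is None:
        raise ValueError(f"not a 2D WKT POINT: {text!r}")
    # WKT order is x (longitude) then y (latitude).
    lon, lat = float(match.group(1)), float(match.group(2))
    return {"lat": lat, "lon": lon}
```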