Open timwis opened 7 years ago
Ideas: https://carto.com/blog/geoprocessing-in-postgis https://carto.com/blog/nearest-neighbor-joins https://carto.com/blog/lateral-joins
Upon a recent conversation with @pramsey , we would be best off doing some kind of ST_Dwithin using the_geom_webmercator
and that trigonometry outlined in the 2nd example of the "nearest neighbor" blog post. Even though that might "look complicated", it keeps the query indexed (ie, faster than casting to geography) and allows the user to specify their radius units in actual meters, rather than other projection units like degrees/radians/etc. Also more likely to be accurate seeing as we're considering a small area (Philadelphia county).
The PostGIS "order by distance" <->
operator is pretty dead simple though, too.
To be honest, I don't really understand why the_geom_webmercator
exists and when to use it over the_geom
. It's really hard to write this guide without that context :-/
the_geom_webmercator exists because all tiles that come out of carto are rendered in webmercator coordinates. This column is precaclulated, saving the step of converting coordinates from wgs84 to webmercator when a tile is requested
On Sun, Apr 9, 2017 at 11:36 Tim Wisniewski notifications@github.com wrote:
To be honest, I don't really understand why the_geom_webmercator exists and when to use it over the_geom. It's really hard to write this guide without that context :-/
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/CityOfPhiladelphia/carto-api-explorer/issues/4#issuecomment-292793115, or mute the thread https://github.com/notifications/unsubscribe-auth/ABv7XGbh8NaRGwgi253odUnve5afhlCwks5ruPrJgaJpZM4MjkDx .
That makes sense, and sounds quite simple, but why does it make sense to use that field for some geospatial queries over the_geom
? It seems like it has something to do with geometry vs geography?
The the_geom
column is a geometry type with an "SRID" of WGS84 latitude/longitude. The units of that field are degrees and as such any ST_Buffer or other PostGIS calculation would need to be done in degrees and/or radians. That usually doesn't make sense (do you know how many radians you want your buffer to be?).
the_geom_webmercator
column is a geometry but of SRID "Web Mercator". As @chriswhong says, it is there for a quick "cached" version of the geometry ready to be rendered into maps by CARTO. Web mercator's units are "web mercator meters", which are like regular meters at the equator but get a bit distorted at different latitudes. So, ST_Buffer and other PostGIS calculations can be done on this field using regular meters (hooray, mere mortals can know if we want a 50 meter buffer or a 100 meter one), but with a little extra trigonometry to account for the distortion at different latitudes.
Both the_geom
and the_geom_webmercator
columns have a geometric index on them, making geometric calculations in their units pretty fast.
ST_Buffer and most PostGIS functions support straight calculations in regular meters, but only with geography type columns, which are not created by default on CARTO tables (would be a pretty big architecture change, though geography can be added to tables via SQL API). Geographic calculations are slower than geometric calculations. They can be sped up with a geographic index (also possible via SQL API), but I think they still won't be as fast as a geometric-index. Something like ST_Buffer(the_geom::geography, meters)
works and gets you your meters, but involves casting the field to a different type than it is and doing away with the default geometric index, both of which will make the query run slower. Not a big deal for small data, but a bigger deal for bigger queries.
Doing calcs with the_geom_webmercator
and the requisite trigonometry is probably the easiest and surest-fire to be able to use meters with fast queries. Or you can cast with the_geom::geography
, but this will be slower. Other solutions would be adding extra geographic columns or indexes, but paying the disk space/index-generation costs associated with that...
Ooooh. GIS is hard. That's a great explanation @andrewbt, I think I understand well now. Thank you!
If you're concerned about ease of use/accessibility to the public @timwis , I could see us developing PL/PgSQL functions for your account (easy, just send CREATE FUNCTION
syntax through the SQL API with an API Key to save) that "wrap" all the_geom_webmercator trigonometry stuff. You could have something like select * from crimes where philly_buffer_func('POINT(-75.1 45.2)', 100)
that wraps an st_dwithin()
call with the_web_mercator and trig and all that. Maybe even convert feet to meters...
The downside here is that obviously philly_buffer_func()
is no longer standard PostGIS and users would have to recognize they couldn't easily port that query to other PostGIS systems...
actually, the best way to do that would be some kind of re-projection into a feet-units based projection like PA state plane: http://gis.stackexchange.com/questions/131363/choosing-srid-and-what-is-its-meaning
Each field type shows an example query. For strings, dates, etc., this makes sense: it fetches a sample value from the data and shows a basic comparison query. But it doesn't make much sense for geometry fields, because it fetches a WKB value. And how often would you be looking for records at an exact point? (as opposed to "near" a point). This is what the example query currently looks like:
I'd love to do something like "within a radius," but that's actually quite complicated to do in postgis (and is explained later anyway). Is there a simple, common postgis query that we can put up top?
/cc @awm33 @andrewbt
EDIT: We also have to consider that this example query should apply to points and to polygons. Unless there's a way to detect the geometry type? (EDIT:
GeometryType(the_geom)
)