Closed tbuckl closed 9 years ago
i think this:
SELECT p2.gid
FROM parcels p1, parcels p2
WHERE ST_Equals(p1.geom, p2.geom) AND
p1.gid <> p2.gid;
might be more efficient. will check the results against the other.
@janowicz @mkreilly Was the intention of stacking to group parcels that share an identical bounding box? That seems to be what the GROUP BY below does.
See: http://postgis.net/docs/manual-2.1/ST_Geometry_EQ.html
I will ask someone that knows more to double-check.
If not, then we could substitute with ST_EQUALS, which checks the identity of the entire geometry.
It seems to be a question of precision though, because ST_EQUALS can be subject to rounding errors.
The intention was, as you said, to group parcels with the same geometry. Ah, good to know that group by geom means only that their bounding boxes are the same!
On Tue, Jun 30, 2015 at 10:31 AM, Tom Buckley notifications@github.com wrote:
= != =
— Reply to this email directly or view it on GitHub https://github.com/synthicity/bayarea_urbansim/issues/53#issuecomment-117271909 .
@janowicz ok well for posterity, the GROUP BY clause actually performs better for what the intention is. i think this is because of the rounding errors in st_equals. perhaps we should think about using the hash of the bounding box as an identifier in the future?
here's an example of that, for posterity. the stacked_not_identical (shown as brownish in here -- blue+red) parcels are parcels that are = (group by) but not st_equals (to some other parcel in the parcels table).
i'll leave this issue open because i do still wonder if we can improve that query.
this query produces very similar results and makes use of indexes. i'll submit a PR with it:
SELECT p1.* into stacked
FROM parcels p1, parcels p2
WHERE p1.geom && p2.geom
AND p1.geom=p2.geom AND
p1.gid <> p2.gid;
@buckleytom Thanks!
This pull request closes this issue:
this query
https://github.com/MetropolitanTransportationCommission/bayarea_urbansim/blob/master/data_regeneration/geom_aggregation.py#L241-L243
has a very costly query plan:
we can probably use filters and postgis st_equals here:
for example, like this: http://stackoverflow.com/questions/18769250/finding-multiple-duplicates-in-postgres