CartoDB / bigmetadata

BSD 3-Clause "New" or "Revised" License

Gaps in block_group_clipped boundaries in NYC #38

Closed: talos closed this issue 8 years ago

talos commented 8 years ago

Query:

SELECT geom_refs,
       the_geom,
       ST_Transform(the_geom, 3857) AS the_geom_webmercator
FROM OBS_GetBoundariesByGeometry(
    ST_SetSRID((SELECT ST_MakeEnvelope(MIN(ST_X(the_geom)),
                                       MIN(ST_Y(the_geom)),
                                       MAX(ST_X(the_geom)),
                                       MAX(ST_Y(the_geom)))
                FROM jkrauss.nyc), 4326),
    'us.census.tiger.block_group_clipped')

Looks like:

[Image: screen shot 2016-05-19 at 10 49 31 am]

talos commented 8 years ago

Issue seems to be in the clipping process here:

https://github.com/CartoDB/bigmetadata/blob/master/tasks/us/census/tiger.py#L586

To eliminate waterlogged artifacts, the clipping step drops very small and very simple geometries. This works when clipping larger geometries, but starts to break on urban block groups, since some of them are smaller than 5000 square meters and have fewer than ten points.
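
To illustrate why a size-and-simplicity filter can't tell a real urban block from a clipping artifact, here is a minimal Python sketch. It is not the code in tiger.py: the helper names are hypothetical, the coordinates are assumed to be in meters, and the assumption that a geometry is dropped when it fails either threshold is mine. Only the thresholds (5000 square meters, ten points) come from this thread.

```python
def ring_area(ring):
    """Shoelace area of a closed ring given as [(x, y), ...] in meters."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def survives_cleanup(ring, min_area=5000.0, min_points=10):
    """Assumed rule: keep a geometry only if it is big enough AND complex enough."""
    return ring_area(ring) >= min_area and len(ring) >= min_points


def rectangle(width, height):
    """Closed rectangular ring: 4 corners plus the repeated start point."""
    return [(0, 0), (width, 0), (width, height), (0, height), (0, 0)]


sliver = rectangle(2, 3)        # hypothetical leftover from clipping against water
city_block = rectangle(80, 60)  # hypothetical rectangular urban block group

for name, ring in [("sliver", sliver), ("city_block", city_block)]:
    print(name, ring_area(ring), len(ring), survives_cleanup(ring))
```

Both shapes fail the check, so a legitimate rectangular block is removed along with the artifact, which would leave a gap in the boundary layer.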

talos commented 8 years ago

Turns out it's not the 5000 sq m rule: the smallest block group in the country is 5494.84281786486 sq meters, so nothing falls under that threshold. The problem is the simplicity rule: the simplest block group has only 5 points.
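
The check behind that conclusion can be sketched as follows: compare the smallest area and the lowest point count in the dataset against each threshold to see which rule can actually drop anything. This is a rough Python sketch, not the query that was run. The function name and the sample rows are hypothetical; only the two minimum values (5494.84281786486 sq meters and 5 points) and the thresholds come from this thread.

```python
def rules_that_drop(stats, min_area=5000.0, min_points=10):
    """Return the names of the rules that would drop at least one geometry.

    stats is a list of (area_sq_m, n_points) pairs, one per block group.
    """
    smallest_area = min(area for area, _ in stats)
    fewest_points = min(points for _, points in stats)

    triggered = []
    if smallest_area < min_area:
        triggered.append("area")
    if fewest_points < min_points:
        triggered.append("points")
    return triggered


# Hypothetical rows, except for the two minimum values quoted in the thread.
stats = [
    (5494.84281786486, 14),  # smallest area reported; the point count is made up
    (12000.0, 5),            # fewest points reported; the area is made up
    (250000.0, 42),
]

print(rules_that_drop(stats))
```

With these values only the point-count rule is triggered, which matches the finding above.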