NYCPlanning / db-facilities

🏭 🏢 🏬 🏣 🏤 🏥 🏦 🏨 🏪 🏫 🏩
https://nycplanning.github.io/db-facilities
0 stars 0 forks source link

Taking zipcode from spatial join #562

Closed SashaWeinstein closed 2 years ago

SashaWeinstein commented 2 years ago

Zipcode from Geosupport 1B function

facDB works as expected right now, taking zip code from a spatial operatiob in cases where there is no zip code in the source data. So why did I write up this issue? Originally I thought the zipcode came from the geo_1b results, which was confusing. But it actually does not. I'm opening this issue as it may be helpful resource in the short term and then we can close

How this issue was identified

The dsny_textiledrop dataset was switched from textile dropoff locations to DSNY DonateNYC Directory. This new dataset has no zip code, and I didn't know how to change the SQL query in dsny_textiledrop.sql.

Current way missing zip codes are filled in

We know that records from source data that has no zip code can have zip codes in the final facdb. For example the dot_bridgehouses dataset has this code in dot_bridgehouses.sql:

SELECT
    uid,
    source,
    site as facname,
    parsed_hnum as addressnum,
    parsed_sname as streetname,
    raw_address as address,
    NULL as city,
    NULL as zipcode,
    boroname as boro,

...

INTO _dot_bridgehouses

But his query

with dot_bridgehouse_zip as (SELECT zipcode IS NULL as has_zip from facdb WHERE datasource = 'dot_bridgehouses') SELECT has_zip, count(*) from dot_bridgehouse_zip GROUP BY has_zip

shows that around 1/3 of the records do have zip codes. So we know that the zip codes are added later downstream. Where do these zip codes come from?

The _create_facdb_spatial.sql file has this code on line 29:

(select zipcode from doitt_zipcodeboundaries b where st_intersects(b.wkb_geometry, a.geom) limit 1) as zipcode,

This is where I think believe those zipcodes is added