NYCPlanning / db-facilities

šŸ­ šŸ¢ šŸ¬ šŸ£ šŸ¤ šŸ„ šŸ¦ šŸØ šŸŖ šŸ« šŸ©
https://nycplanning.github.io/db-facilities
0 stars 0 forks source link

facdb_boro: assigning boro and borocode #397

Closed SPTKL closed 3 years ago

SPTKL commented 3 years ago

depends on: facdb_geom facdb_base

Possible sources for boro and borocode:

  1. source boro, borocode
  2. source zipcode -> lookup to get borocode
  3. source bin and bbl -> first digit
  4. geosupport borocode
  5. geosupport bin
  6. geosupport bbl
  7. geosupport zipcode
  8. spatial join

Priority?

pending ....


SELECT 
    uid,
    UPPER(coalesce(b.county, left(a.boro, 10))) as boro,
    input_zipcode as zipcode,
    UPPER(b.po_name) as city
FROM (
    SELECT
        uid, UPPER(boro) as boro, borocode, geo_1b->'inputs'->>'input_zipcode' as input_zipcode
    FROM facdb_base
    WHERE uid not in (SELECT uid FROM facdb_spatial where borocode IS NOT NULL)
) a LEFT JOIN doitt_zipcodeboundaries b 
ON a.input_zipcode = b.zipcode;
> we would also need to create lookup boro, that has all the spelling and format of boro ```sql SELECT uid, boro, borocode, bin, bbl, geo_1b->'inputs'->>'input_borough' as input_borough, geo_1b->'inputs'->>'input_zipcode' as input_zipcode, nullif(geo_1b->'result'->>'geo_borough_code','') as geo_borough_code, nullif(geo_1b->'result'->>'geo_bin','') as geo_bin, nullif(geo_1b->'result'->>'geo_bbl','') as geo_bbl FROM facdb_base ``` Potential sources of borough fields: - source boro, borocode, bbl, bin (first digit), zipcode (based on lookup) - geosupport borocode, bin, bbl - spatial join to get boro/borocode ### There are many cases where borough code is null but zipcode is not null ```sql select * from ( SELECT uid, boro, borocode, left(bin, 1) as borocode_bin, left(bbl, 1) as borocode_bbl, nullif(geo_1b->'result'->>'geo_borough_code','') as borocode_geo, nullif(geo_1b->'result'->>'geo_zip_code','') as zipcode_geo, left(nullif(geo_1b->'result'->>'geo_bin',''), 1) as borocode_geo_bin, left(nullif(geo_1b->'result'->>'geo_bbl',''), 1) as borocode_geo_bbl FROM facdb_base ) a where borocode_geo is null and zipcode_geo is not null ```
AmandaDoyle commented 3 years ago

AD ideas for priority of assigning spatial attributes 1) If record gets a hit in Geosupport and and BIN or BBL is returned take all spatial attributes from Geosupport. If a spatial attribute is not returned by Geosupport do not try to backfill. 2) If a record does not get a hit in Geosupport, but we were able to create spatial data and join the dataset to a tax lot take all spatial data from PLUTO. We need to decide how to get NTA and transform censtract to proper format. A record should not join to PLUTO and not get a hit in Geosupport. 2) Take all spatial attributes via spatial join, if a record does not get a hit in Geosupport, but we were able to create spatial data. 3) Take attributes from source data if it exists.

Goals: 1) Ensure spatial attributes are consistent across the record. We'll need to run QAQC checks. Currently, there are 41 records in FacDB where borocode <> LEFT(bbl,1), 67 records where borocode <> LEFT(commboard,1), and more; therefore, you want all spatial attributes coming from the same source. 2) Do not mix sources for spatial data. 3) Have valid spatial data attributes. Right now, there are 4 records where borocode = 0

Mapping enhancement If a point falls within the water, use a clipped boundary (i.e. clipped mappluto) to map the point to land. Points in water have caused issues with assigning the proper spatial boundary attributes.

Spatial boundary attributes are:

Address attributes are

Outlier

SPTKL commented 3 years ago

@AmandaDoyle we don't collect source data spatial attributes (except for boro, borocde, zipcode, which can be very wrong and inconsistent), so 4 is not possible

AmandaDoyle commented 3 years ago

@SPTKL See edited comment, which was updated based on my conversation with LS. Yesterday, when I was reviewing the dataset the record GRAND FERRY PARK probably caused me the most grief and I definitely want to avoid this from happening in the new and improved FacDB