GreenInfo-Network / nyc-crash-mapper-etl-script

Extract, Transform, and Load script for fetching new data from the NYC Open Data Portal's vehicle collision data and loading into the NYC Crash Mapper table on CARTO.
3 stars 0 forks source link

Vehicle Type Crosswalk examination tool #23

Closed gregallensworth closed 4 years ago

gregallensworth commented 4 years ago

A followup to issue #22 in which hasvehicle_XXX fields are created and populated.

The vehicletype_crosswalk_prod table contains NYC vehicle types codes, which are a wide variety of typographical variants, to be remapped onto a standardized set of 8 vehicle codes.

It is expected that new variants will pop up from time to time, e.g. a new misspelling of the word "BICYCLE" or a wholly-new alias such as "Delivery Van". As a result, newly incoming records would fail to be tagged because the alias was not known.

Thus, a tool to detect such discrepancies would have two parts:

gregallensworth commented 4 years ago

The tool is coming together. Output is as follows.

00
    Found in 1 records
    cartodb_id = 2248779
00000
    Found in 1 records
    cartodb_id = 2713581
013
    Found in 1 records
    cartodb_id = 2172791
19720
    Found in 1 records
    cartodb_id = 2713094
2015
    Found in 1 records
    cartodb_id = 2686628
72000
    Found in 1 records
    cartodb_id = 2492034
994
    Found in 1 records
    cartodb_id = 2276936
99999
    Found in 1 records
    cartodb_id = 2314383
amula
    Found in 1 records
    cartodb_id = 2712756
Chevy
    Found in 1 records
    cartodb_id = 2722282
DUAL
    Found in 1 records
    cartodb_id = 2726608
???MBU
    Found in 1 records
    cartodb_id = 2620418
I-hau
    Found in 1 records
    cartodb_id = 2726866
PALLE
    Found in 1 records
    cartodb_id = 2716236
PASSENGER VEHICLE,PASSENGER VEHICLE
    Found in 1 records
    cartodb_id = 1753935
pay l
    Found in 1 records
    cartodb_id = 2727483
pedic
    Found in 1 records
    cartodb_id = 2729745
PICK-UP TRUCK,SPORT UTILITY / STATION WAGON
    Found in 1 records
    cartodb_id = 1753115
RD BL
    Found in 1 records
    cartodb_id = 2715086
red,
    Found in 1 records
    cartodb_id = 2163504
RV/VA
    Found in 1 records
    cartodb_id = 2722904
SKIDS
    Found in 1 records
    cartodb_id = 2721988
SMALL COM VEH(4 TIRES) 
    Found in 17 records
    cartodb_id = 2214456
    cartodb_id = 2214893
    cartodb_id = 2214647
    cartodb_id = 2214452
    cartodb_id = 2214691
    cartodb_id = 2214478
    cartodb_id = 2214709
    cartodb_id = 2214495
    cartodb_id = 2214695
    cartodb_id = 2215102
    cartodb_id = 2214721
    cartodb_id = 2214628
    cartodb_id = 2215134
    cartodb_id = 2215011
    cartodb_id = 2214911
    cartodb_id = 2214912
    cartodb_id = 2214683
T880
    Found in 1 records
    cartodb_id = 2725190
towe
    Found in 1 records
    cartodb_id = 2716595
TTRAI
    Found in 1 records
    cartodb_id = 2723886
Unk,
    Found in 1 records
    cartodb_id = 2302897
VENDO
    Found in 1 records
    cartodb_id = 2731905
gregallensworth commented 4 years ago

I have cleaned up the three comma-join items described above.

danrademacher commented 4 years ago

For the record, the "tool" for Christine to manage these going forward is this fancied up Google sheet wired into CARTO API, writes an INSERT statement and has docs on where she can paste that to add new records toto the Crosswalk table inCARTO as they come up.

https://docs.google.com/spreadsheets/d/1uBmIyeleBGguLqDczxX525FNfuuDkfmJsS-i_ZFFF5g/edit#gid=0