e-mission / e-mission-docs

Repository for docs and issues. If you need help, please file an issue here. Public conversations are better for open source projects than private email.
https://e-mission.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License

UACE and eGRID lookups can fail on coasts #1086

Open JGreenlee opened 2 weeks ago

JGreenlee commented 2 weeks ago

While testing the new calculations on some real data today, I found a peculiar situation where the UACE code lookup failed: a train trip that began near the ferry station in San Francisco.

DEBUG:root:Getting mode footprint in year 2023, coords [-122.39373566036205, 37.79639539828247], UACE None, and modes ['LR', 'HR', 'YR', 'CR']
DEBUG:root:Missing UACE, trying to geocode coords [-122.39373566036205, 37.79639539828247]
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): geocoding.geo.census.gov:443
DEBUG:urllib3.connectionpool:https://geocoding.geo.census.gov:443 "GET /geocoder/geographies/coordinates?x=-122.39373566036205&y=37.79639539828247&benchmark=Public_AR_Current&vintage=Census2020_Current&layers=87&format=json HTTP/1.1" 200 4725
ERROR:root:Geocoding response did not contain UA for coords [-122.39373566036205, 37.79639539828247] in year 2023: {'result': {'geographies': {'2018 State Legislative Districts - Upper': [{'POP100': 1003944, 'GEOID': '06011', 'CENTLAT': '+37.7526553', 'AREAWATER': 510554858, 'STATE': '06', 'BASENAME': '11', 'OID': '212704690192924', 'LSADC': 'LU', 'SLDU': '011', 'FUNCSTAT': 'N', 'INTPTLAT': '+37.7268594', 'NAME': 'State Senate District 11', 'OBJECTID': 831, 'CENTLON': '-122.6747042', 'LSY': '2018', 'AREALAND': 155473439, 'INTPTLON': '-122.5058485', 'HU100': 448727, 'MTFCC': 'G5210', 'LDTYP': 'O'}], 'States': [{'STATENS': '01779778', 'POP100': 39538223, 'GEOID': '06', 'CENTLAT': '+37.1547616', 'AREAWATER': 20291796519, 'STATE': '06', 'BASENAME': 'California', 'STUSAB': 'CA', 'OID': '2747018475066', 'LSADC': '00', 'FUNCSTAT': 'A', 'INTPTLAT': '+37.1551773', 'DIVISION': '9', 'NAME': 'California', 'REGION': '4', 'OBJECTID': 36, 'CENTLON': '-119.5277735', 'AREALAND': 403673270110, 'INTPTLON': '-119.5434183', 'HU100': 14392140, 'MTFCC': 'G4000', 'UR': 'M'}], 'Combined Statistical Areas': [{'POP100': 9714023, 'GEOID': '488', 'CENTLAT': '+37.6648454', 'AREAWATER': 3995991273, 'BASENAME': 'San Jose-San Francisco-Oakland, CA', 'OID': '2617013782255227', 'LSADC': 'M0', 'FUNCSTAT': 'S', 'INTPTLAT': '+37.6583275', 'NAME': 'San Jose-San Francisco-Oakland, CA CSA', 'OBJECTID': 89, 'CSA': '488', 'CENTLON': '-121.7317408', 'INTPTLON': '-121.7156006', 'AREALAND': 35146149769, 'HU100': 3606733, 'MTFCC': 'G3100'}], 'County Subdivisions': [{'COUSUB': '90734', 'POP100': 131054, 'GEOID': '0607590734', 'CENTLAT': '+37.8046095', 'AREAWATER': 125090244, 'STATE': '06', 'BASENAME': 'Downtown-Northeast Neighborhoods-Treasure Island', 'OID': '2767015925627468', 'LSADC': '22', 'FUNCSTAT': 'S', 'INTPTLAT': '+37.8620462', 'NAME': 'Downtown-Northeast Neighborhoods-Treasure Island CCD', 'OBJECTID': 35291, 'CENTLON': '-122.3752490', 'COUSUBCC': 'Z5', 'AREALAND': 9981729, 'INTPTLON': '-122.4195821', 
'HU100': 79100, 'MTFCC': 'G4040', 'COUSUBNS': '02804908', 'UR': 'M', 'COUNTY': '075'}], 'Incorporated Places': [{'DISP_CLR': 1, 'NECTAPCI': 'N', 'POP100': 873965, 'GEOID': '0667000', 'CENTLAT': '+37.7600860', 'AREAWATER': 479701988, 'BASENAME': 'San Francisco', 'STATE': '06', 'OID': '27870355730719', 'LSADC': '25', 'INTPTLAT': '+37.7272391', 'PLACE': '67000', 'FUNCSTAT': 'A', 'NAME': 'San Francisco city', 'OBJECTID': 14512, 'PLACECC': 'C1', 'CENTLON': '-122.6941272', 'CBSAPCI': 'Y', 'AREALAND': 120940745, 'HU100': 406628, 'INTPTLON': '-123.0322294', 'PLACENS': '02411786', 'MTFCC': 'G4110', 'UR': 'M'}], 'Counties': [{'POP100': 873965, 'GEOID': '06075', 'CENTLAT': '+37.7600860', 'AREAWATER': 479701988, 'STATE': '06', 'BASENAME': 'San Francisco', 'OID': '27570355701186', 'LSADC': '06', 'FUNCSTAT': 'C', 'INTPTLAT': '+37.7272391', 'NAME': 'San Francisco County', 'OBJECTID': 798, 'CENTLON': '-122.6941272', 'COUNTYCC': 'H6', 'COUNTYNS': '00277302', 'AREALAND': 120940745, 'INTPTLON': '-123.0322294', 'HU100': 406628, 'MTFCC': 'G4020', 'UR': 'M', 'COUNTY': '075'}], '2018 State Legislative Districts - Lower': [{'POP100': 524761, 'GEOID': '06017', 'CENTLAT': '+37.7870010', 'SLDL': '017', 'AREAWATER': 130843518, 'STATE': '06', 'BASENAME': '17', 'OID': '213704690192968', 'LSADC': 'L3', 'FUNCSTAT': 'N', 'INTPTLAT': '+37.7860679', 'NAME': 'Assembly District 17', 'OBJECTID': 582, 'CENTLON': '-122.3853540', 'LSY': '2018', 'AREALAND': 62179315, 'INTPTLON': '-122.3863351', 'HU100': 257920, 'MTFCC': 'G5220', 'LDTYP': 'O'}], '116th Congressional Districts': [{'POP100': 771010, 'GEOID': '0612', 'CENTLAT': '+37.7853458', 'CDSESSN': '116', 'AREAWATER': 212146892, 'STATE': '06', 'BASENAME': '12', 'OID': '211704492478597', 'LSADC': 'C2', 'FUNCSTAT': 'N', 'INTPTLAT': '+37.7855141', 'NAME': 'Congressional District 12', 'OBJECTID': 327, 'CENTLON': '-122.4347573', 'AREALAND': 100426943, 'INTPTLON': '-122.4340005', 'HU100': 370641, 'CD116': '12', 'MTFCC': 'G5200'}], 'Census Tracts': [{'POP100': 
3234, 'GEOID': '06075010500', 'CENTLAT': '+37.7997508', 'AREAWATER': 501830, 'STATE': '06', 'BASENAME': '105', 'OID': '20770355746787', 'LSADC': 'CT', 'FUNCSTAT': 'S', 'INTPTLAT': '+37.8026835', 'NAME': 'Census Tract 105', 'OBJECTID': 30298, 'TRACT': '010500', 'CENTLON': '-122.3971391', 'AREALAND': 676089, 'INTPTLON': '-122.3990500', 'HU100': 2145, 'MTFCC': 'G5020', 'UR': 'U', 'COUNTY': '075'}], 'Census Blocks': [{'GEOID': '060750105002000', 'STATE': '06', 'BASENAME': '2000', 'LSADC': 'BK', 'INTPTLAT': '+37.7981692', 'OBJECTID': 7232723, 'BLKGRP': '2', 'AREALAND': 0, 'HU100': 0, 'VINTAGE': '70', 'LWBLKTYP': 'W', 'UR': 'R', 'COUNTY': '075', 'SUFFIX': '', 'TABBLKSUFX2': '', 'POP100': 0, 'CENTLAT': '+37.7981692', 'BLOCK': '2000', 'AREAWATER': 273459, 'OID': '210701000438470', 'FUNCSTAT': 'S', 'NAME': 'Block 2000', 'TRACT': '010500', 'CENTLON': '-122.3928961', 'ACT': '', 'INTPTLON': '-122.3928961', 'MTFCC': 'G5040'}]}, 'input': {'vintage': {'isDefault': False, 'id': '420', 'vintageName': 'Census2020_Current', 'vintageDescription': 'Census2020 Vintage - Current Benchmark'}, 'location': {'x': -122.39373566036205, 'y': 37.79639539828247}, 'benchmark': {'isDefault': True, 'benchmarkDescription': 'Public Address Ranges - Current Benchmark', 'id': '4', 'benchmarkName': 'Public_AR_Current'}}}}
DEBUG:root:Getting mode footprint for transit modes ['LR', 'HR', 'YR', 'CR'] in year 2023 and UACE None
WARNING:root:ntd data not available for 2023. Trying 2022.

Checking the URL manually confirms that the response contains no "Urban Areas" entry: https://geocoding.geo.census.gov/geocoder/geographies/coordinates?x=-122.39373566036205&y=37.79639539828247&benchmark=Public_AR_Current&vintage=Census2020_Current&layers=87&format=json

About 20 meters inland (x=-122.394), there would have been no problem: https://geocoding.geo.census.gov/geocoder/geographies/coordinates?x=-122.394&y=37.79639539828247&benchmark=Public_AR_Current&vintage=Census2020_Current&layers=87&format=json
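For illustration, here is a minimal sketch of how a caller might inspect such a response. The function names are hypothetical and not taken from the codebase. Two things are assumptions: that the urban area code sits under a `UA` key (suggested by the "did not contain UA" log message), and that `'LWBLKTYP': 'W'` on the census block marks a water-only block (consistent with `'AREALAND': 0` in the response above).

```python
def extract_uace(response):
    """Return the UACE from a Census geocoder response dict, or None if absent."""
    geographies = response.get("result", {}).get("geographies", {})
    urban_areas = geographies.get("Urban Areas") or []
    if not urban_areas:
        return None
    # Assumption: the urban area code is stored under the 'UA' key.
    return urban_areas[0].get("UA")


def is_water_block(response):
    """Return True if the first census block in the response is flagged as water."""
    geographies = response.get("result", {}).get("geographies", {})
    blocks = geographies.get("Census Blocks") or []
    # Assumption: LWBLKTYP == 'W' means the block is entirely water.
    return bool(blocks) and blocks[0].get("LWBLKTYP") == "W"


# Trimmed copy of the failing response from the log above (census block only).
failing = {
    "result": {
        "geographies": {
            "Census Blocks": [
                {"LWBLKTYP": "W", "AREALAND": 0, "AREAWATER": 273459}
            ]
        }
    }
}

print(extract_uace(failing), is_water_block(failing))  # None True
```

If the water-block reading holds, the flag could serve as a hint that the point is just offshore rather than genuinely outside any urban area.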

I believe this can happen for eGRID as well because the shapefiles only cover land. Lookups at beaches, on bridges over lakes/bays, or potentially anywhere on a coast may be unreliable. I think we can reduce the likelihood of this happening by trying the lookup at both the start and end locations of the trip.
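A rough sketch of the start/end fallback idea, under stated assumptions: `lookup_with_fallback` and `stub_lookup` are hypothetical names, and the lookup is passed in as a callable so the same loop could wrap either the UACE geocoding or the eGRID region lookup. The end coordinate below is only a stand-in (the inland point from the second URL), and the returned code is a placeholder, not a real UACE.

```python
def lookup_with_fallback(candidate_coords, lookup):
    """Try each [lon, lat] pair in order and return (value, coords) for the
    first lookup that succeeds, or (None, None) if all of them fail."""
    for coords in candidate_coords:
        value = lookup(coords)
        if value is not None:
            return value, coords
    return None, None


# Start coordinate from the log above; END is a stand-in for the trip's end.
START = [-122.39373566036205, 37.79639539828247]
END = [-122.394, 37.79639539828247]


def stub_lookup(coords):
    """Stand-in for the real geocoder call: fails at START, succeeds at END."""
    return "PLACEHOLDER_UACE" if coords == END else None


uace, used = lookup_with_fallback([START, END], stub_lookup)
print(uace, used)
```

This only reduces the likelihood of a miss. A trip that both starts and ends on the water, such as a ferry between two piers, could still come back empty, so the caller would still need to handle `None`.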

shankari commented 2 weeks ago

The first link does have a "Combined Statistical Areas" section whose name is "San Jose-San Francisco-Oakland, CA CSA". However, there does not appear to be any overlap between the numeric values in "Combined Statistical Areas" and "Urban Areas".