Closed yinleon closed 1 year ago
likely related to https://github.com/biglocalnews/usps/issues/14
Aaron used a random city generator to test cities and he recorded which cities seem to throwing this error usps test 3-15.xlsx
Spotted in Eugene and Portland Oregon today!
I was able to get this running locally and replicate the issue, here's what I'm seeing in the docker console scroll:
usps-db-1 | 2023-04-10 20:25:43.844 UTC [23858] ERROR: function usps_sample(geometry, unknown, unknown) does not exist at character 440
usps-db-1 | 2023-04-10 20:25:43.844 UTC [23858] HINT: No function matches the given name and argument types. You might need to add explicit type casts.
usps-db-1 | 2023-04-10 20:25:43.844 UTC [23858] STATEMENT:
usps-db-1 | WITH bounds AS (
usps-db-1 | SELECT the_geom g FROM tiger.place WHERE gid = $1
usps-db-1 | )
usps-db-1 | SELECT
usps-db-1 | s.unit,
usps-db-1 | s.number,
usps-db-1 | s.street,
usps-db-1 | s.city,
usps-db-1 | s.district,
usps-db-1 | s.region,
usps-db-1 | s.postcode,
usps-db-1 | s.lon,
usps-db-1 | s.lat,
usps-db-1 | s.tract_id,
usps-db-1 | s.blkgrpce
usps-db-1 | FROM bounds b, usps_sample(b.g, $2, $3) s
usps-db-1 |
usps-api-1 | INFO: 172.19.0.4:59330 - "POST /sample HTTP/1.0" 500 Internal Server Error
I'm not sure why the function created by sample-fn.sql
didn't stick around (after running init_addr_data.sh OR
to get my Oregon Census data setup) but after running cat sample-fn.sql | psql
from the docker console again I got a (maybe) more helpful error message:
usps-db-1 | 2023-04-10 20:38:58.927 UTC [27854] ERROR: syntax error at or near "a" at character 158
usps-db-1 | 2023-04-10 20:38:58.927 UTC [27854] QUERY: WITH sample AS ( SELECT random(), a.hash, a.point, a.unit, a.number, a.street, a.city, a.district, a.region, a.postcode, a.tract_id, a.blkgrpce FROM oa_"or" a WHERE a.tract_id = $1) INSERT INTO local_addrs(r, hash, point, unit, number, street, city, district, region, postcode, tract_id, blkgrpce) SELECT s.* FROM sample s WHERE ST_Contains($2, s.point)
usps-db-1 | 2023-04-10 20:38:58.927 UTC [27854] CONTEXT: PL/pgSQL function usps_sample(geometry,integer,double precision) line 36 at EXECUTE
usps-db-1 | 2023-04-10 20:38:58.927 UTC [27854] STATEMENT:
usps-db-1 | WITH bounds AS (
usps-db-1 | SELECT the_geom g FROM tiger.place WHERE gid = $1
usps-db-1 | )
usps-db-1 | SELECT
usps-db-1 | s.unit,
usps-db-1 | s.number,
usps-db-1 | s.street,
usps-db-1 | s.city,
usps-db-1 | s.district,
usps-db-1 | s.region,
usps-db-1 | s.postcode,
usps-db-1 | s.lon,
usps-db-1 | s.lat,
usps-db-1 | s.tract_id,
usps-db-1 | s.blkgrpce
usps-db-1 | FROM bounds b, usps_sample(b.g, $2, $3) s
usps-db-1 |
@dphiffer thanks for the stack, that was super helpful, I hadn't had time to reproduce locally.
Just made a fix / deployed it for addresses in Indiana & Oregon, but the Arizona issue appears to be something different. Keeping the issue open, I'll try to look into that some more.
I added Sentry (#26) to our production deployment and was able to capture the stack trace for Glendale, Arizona
. It's unrelated to the Oregon/Indiana issue:
(sqlalchemy.dialects.postgresql.asyncpg.InternalServerError) <class 'asyncpg.exceptions.InternalServerError'>: lwgeom_intersection_prec: GEOS Error: TopologyException: Input geom 0 is invalid: Holes are nested at -112.409468 33.529843
[SQL:
WITH bounds AS (
SELECT the_geom g FROM tiger.place WHERE gid = %s
)
SELECT
s.unit,
s.number,
s.street,
s.city,
s.district,
s.region,
s.postcode,
s.lon,
s.lat,
s.tract_id,
s.blkgrpce
FROM bounds b, usps_sample(b.g, %s, %s) s
]
[parameters: (2290, 200, -1.0)]
(Background on this error at: https://sqlalche.me/e/14/2j85)
That sounds more like the geometry is invalid for some reason. I will look more into that.
The root cause is that TIGER doesn't provide valid geometries in some cases. Looks like there are very few in the place
table, which is the most likely spot for them since places can have very complicated geometries.
addresses=# with validation as (select gid, name, ST_IsValidReason(the_geom) reason from place) select gid, name, reason from validation where reason <> 'Valid Geometry'
;
gid | name | reason
-------+---------------+----------------------------------------
2290 | Glendale | Holes are nested[-112.40957 33.529845]
8701 | Joliet | Holes are nested[-88.074056 41.510616]
26151 | Knoxville | Holes are nested[-84.082463 35.893129]
19737 | Wesley Chapel | Holes are nested[-80.682658 34.989184]
(4 rows)
I believe we can fix this fairly easily in the ingest scripts to repair the geometries if necessary. I can run queries on the server to find / repair the invalid geometries today.
Sentry issue: USPS-API-1
JSON parsing error for certain queries. Is this because there are no addresses there?
This also occurs in Chicago, which I am sure does have addresses.