EastCoastGreenwayAlliance / ecg-map

Interactive map and trip planner for the ECGA
https://map.greenway.org

0 records in routing table, manifests as CORS issue #96

Closed danrademacher closed 4 years ago

danrademacher commented 5 years ago

Niles reports that search is not working. I have confirmed that geocoding works, but the geocoded location fails to advance to the next step, “select this as starting point”.

Here’s the error (using TeamViewer on my phone, so apologies for the small screenshot): [screenshot]

Since we have made zero code changes, could this be either some change at client DNS or a new browser restriction of some sort?

danrademacher commented 5 years ago

To repro, load map.greenway.org and try any search

danrademacher commented 5 years ago

Asked Niles if they made any DNS changes. Response:

It was working fine on Sunday. I don’t know of any upstream changes. Nothing that we initiated pretty sure.

Weird!

danrademacher commented 5 years ago

New update from Niles:

A call just came in from our website host, I think something did happen upstream of map.greenway. Trying to get more intel.

gregallensworth commented 5 years ago

Hey hey. The CORS issue was secondary. The real message is immediately above: the internal server error. The error output does not include CORS headers (since it's not meant as a data payload) thus causing that red herring.

The real issue was that one of the sync runs from CARTO must have glitched. The table structure came over, but there were 0 line records in the table. As such, the search for the nearest point had nothing to search, which was an error condition not handled at the server level. Later sync runs also failed: the DB table already existed, and the script was attempting to create the spatial index as if the table were brand new.

The immediate fix was to drop the tables, then run the sync. This solved the problem immediately, in about one minute.
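A minimal sketch of how a sync step could guard against both failure modes described above (an empty import, and a rerun tripping over a table that already exists). This is not the project's actual sync script: it uses `sqlite3` as a stand-in for the real database, and the function name `sync_table`, the table name `route_lines`, and the column layout are all hypothetical.

```python
import sqlite3

def sync_table(conn, rows, table="route_lines"):
    """Replace `table` with `rows`, refusing to replace it with nothing.

    Hypothetical sketch: sqlite3 stands in for the real database, and the
    table and column names are illustrative only.
    """
    if not rows:
        # A 0-record import is treated as a failed sync; the old table stays.
        raise ValueError("sync returned 0 records; keeping existing table")
    staging = f"{table}_staging"
    with conn:  # commits when the block finishes without error
        # Clear any staging table left behind by an interrupted run.
        conn.execute(f"DROP TABLE IF EXISTS {staging}")
        conn.execute(f"CREATE TABLE {staging} (id INTEGER PRIMARY KEY, geom TEXT)")
        conn.executemany(f"INSERT INTO {staging} (id, geom) VALUES (?, ?)", rows)
        # Swap the freshly loaded table in, whether or not the old one exists.
        conn.execute(f"DROP TABLE IF EXISTS {table}")
        conn.execute(f"ALTER TABLE {staging} RENAME TO {table}")
        conn.execute(f"CREATE INDEX IF NOT EXISTS {table}_geom_idx ON {table} (geom)")
```

Calling `sync_table(conn, rows)` repeatedly is safe, and a call with an empty `rows` raises instead of leaving an empty table for the nearest-point search to fail on.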

The underlying issue was a rare glitch in the Carto-to-DB process, in which it just plain flaked out, perhaps an Internet timeout for a second during the process. No further info is available, but this does seem quite rare so far.

Potential room for improvement:

danrademacher commented 5 years ago

From Niles:

This issue seems to be back. Type in a city and the search result function doesn't respond.

Seems like we need to take the approach in your last comment of dropping and recreating the tables to avoid these outages.

I also wonder though if there’s some problem in the source table at Carto that we need to address

gregallensworth commented 4 years ago

I dropped the tables and re-ran the script, and it worked A-OK to restore service.

gregallensworth commented 4 years ago

Item 1: If no points are found at all, an error message should be displayed, so that users are not confused by a search that silently does nothing and investigators are not distracted from the real problem by spurious CORS errors.

gregallensworth commented 4 years ago

The server side now hands back a proper error condition when an IndexError happens, and the client side now generates and handles ROUTING_LOCATION_ERROR states, which result from that error condition.
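For illustration, here is a rough sketch of the server-side idea: catch the `IndexError` from an empty point set and return a JSON error that still carries CORS headers, so the browser surfaces the real message rather than a CORS failure. It is framework-free and not the actual router code. The function names, the wildcard CORS policy, the 500 status, and the error payload shape are all assumptions.

```python
import json

# Assumption: an open CORS policy; the real service may restrict origins.
CORS_HEADERS = {"Access-Control-Allow-Origin": "*"}

def nearest_point(points, lat, lng):
    """Return the point closest to (lat, lng); raises IndexError if points is empty."""
    ranked = sorted(points, key=lambda p: (p["lat"] - lat) ** 2 + (p["lng"] - lng) ** 2)
    return ranked[0]

def handle_nearest_point(points, lat, lng):
    """Return (status, headers, body), attaching CORS headers on success and on error."""
    headers = {"Content-Type": "application/json", **CORS_HEADERS}
    try:
        point = nearest_point(points, lat, lng)
    except IndexError:
        # Empty routing table: report it explicitly instead of crashing.
        body = json.dumps({"error": "No route points available"})
        return 500, headers, body
    return 200, headers, json.dumps(point)
```

With an empty `points` list, `handle_nearest_point` returns an error payload the client can read and map to a state such as ROUTING_LOCATION_ERROR.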


gregallensworth commented 4 years ago

Item 2: CARTO interruption.

Both times this has happened, it was in the wee hours of Sunday. We do not really need hourly updates to run on Saturday and Sunday, when nobody is making changes (and anyone who does make a change can manually trigger an update).

As such, it seems more expedient to rework the cronjob to bail on Saturday and Sunday, so as not to run afoul of this seemingly-recurring Sunday night phenomenon. I have done so.
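The weekend bail-out could look something like the sketch below if the sync entry point were a Python script. That is an assumption: the actual change may live in the crontab schedule itself, and `should_run_sync` is a hypothetical name.

```python
import datetime

def should_run_sync(today):
    """Return False on Saturday and Sunday, True on weekdays."""
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    return today.weekday() < 5
```

The script would call `should_run_sync(datetime.date.today())` first and exit quietly when it returns `False`. The date used is the server's local date, so the cutoff follows the server's time zone.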

If this proves unsatisfactory, we can see next time this happens what exactly the failure mode was, and how best to work around it.

danrademacher commented 4 years ago

I also just added a StatusCake test that pings https://router.greenway.org/nearestpoint/?lat=37.540726&lng=-77.436050 every 15 minutes. If the string wanted_lng is NOT in the result, it will report the service as down. I'll check it in a week to make sure the check is working. To verify the string match, I briefly inverted the test: set to report up when it found that string, it reported up; set to report down when it found that string, it sent a down alert. So the string match seems to be working.
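The same check can be approximated locally. The sketch below is not StatusCake's implementation, just a rough equivalent of the string match; the function names and the 10-second timeout are assumptions.

```python
import urllib.request

CHECK_URL = "https://router.greenway.org/nearestpoint/?lat=37.540726&lng=-77.436050"
EXPECTED = "wanted_lng"

def is_up(body):
    """The service counts as up only if the expected string appears in the response."""
    return EXPECTED in body

def check(url=CHECK_URL, timeout=10):
    """Fetch the URL and apply the string match; any network or HTTP error counts as down."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_up(resp.read().decode("utf-8", errors="replace"))
    except OSError:  # URLError, HTTPError, and socket timeouts all derive from OSError
        return False
```

Matching on a string from a successful payload, rather than on the HTTP status alone, also catches a response that arrives but lacks the expected content.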