episphere / connect

Connect API for DCEG's Cohort Study
NORC Eligible Participant API Error #649

Closed Davinkjohnson closed 1 month ago

Davinkjohnson commented 1 year ago

From the NORC team:

The API pull for payment-eligible participants failed when it ran on Saturday, 6/3 at 3pm Central Time. The Connect System responded to our ServiceNow system with a 500 error message.

The job was completed successfully yesterday (6/4) and today (6/5). Please let us know if we can help troubleshoot why this error occurred or if we can make any updates on our side to prevent future errors.

-- The last time this error occurred, it was related to the API running too close to the notifications job, but that should no longer be an issue because the notifications run at 2pm CT (3pm ET).

we-ai commented 1 year ago

The called API is participantsEligibleForIncentive, right?

Davinkjohnson commented 1 year ago

"participantsEligibleForIncentive" yes, that is the API they are calling.

we-ai commented 1 year ago

Here are the details from the logs of the function run. At 2023-06-03 16:00:02.177 (EDT), the request /participantsEligibleForIncentive?limit=500&page=1&round=baseline was sent from ServiceNow, and the server returned error code 500 with the error message "The request was aborted because there was no available instance. Additional troubleshooting documentation can be found at: https://cloud.google.com/functions/docs/troubleshooting#scalability".

GCP documentation has details about this error.

So the error resulted from high inbound traffic to the API, not from a bug in our codebase. To avoid it in the future, it's suggested that the caller "ramps traffic up gradually over the course of a minute", or, when code 500 is returned, re-sends the request to the API after 1 or 2 minutes to retrieve the data.
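For illustration, the re-send suggestion could look roughly like the sketch below on the calling side. This is only a sketch under assumptions: `fetch_with_retry`, `send`, `delays`, and `sleep` are hypothetical names, not part of the Connect codebase or of ServiceNow, and the 60 and 120 second waits simply mirror the "1 or 2 minutes" mentioned above.

```python
import time
from typing import Callable, Iterable, Tuple


def fetch_with_retry(
    send: Callable[[], Tuple[int, str]],
    delays: Iterable[float] = (60, 120),
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send(); while it returns HTTP 500, wait and re-send.

    send   -- hypothetical callable that performs the API request and
              returns (status_code, body)
    delays -- seconds to wait before each re-send (assumed values)
    sleep  -- injectable so the waits can be skipped in tests
    """
    status, body = send()
    for delay in delays:
        if status != 500:
            break  # success or a non-retryable status: stop retrying
        sleep(delay)
        status, body = send()
    return status, body
```

Here `send` would wrap the actual call to /participantsEligibleForIncentive. If every attempt returns 500, the last response is passed back unchanged so the caller can still log or alert on it.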

Davinkjohnson commented 1 year ago

Thanks for looking into this and finding these details. Were you able to see what the other inbound traffic at that time was?

we-ai commented 1 year ago

I didn't see other inbound traffic to this API besides the single request.

we-ai commented 1 year ago

At 4:00, there were active calls to other APIs. Participant data were being submitted to the app, and getParticipants was also running.

Davinkjohnson commented 1 month ago

closing, we have not seen this in over a year.