department-of-veterans-affairs / veterans-employment-center

Veterans Employment Center
https://www.vets.gov/veterans-employment-center
5 stars 4 forks source link

Retry requests to Jobs API on timeout #359

Closed jkassemi closed 8 years ago

jkassemi commented 8 years ago

We've experienced issues with hanging requests to the Heroku-based Jobs API. The ELB we've configured for VEC will return a timeout to the client after 60s. It doesn't appear that the Jobs API is undergoing any performance difficulties for these requests, though more in depth investigation may be required. This (somewhat hacky) solution will

in the hopes of providing a more consistent and performant experience to the user.

Further investigation into the cause is on hold until the Jobs API is migrated into GC (https://github.com/department-of-veterans-affairs/devops/issues/570)

jkassemi commented 8 years ago

This (and most every) external integration is a good candidate for a circuit breaker, but performance of the Jobs API appears stable under expected load, so this will be of more immediate benefit.

saneshark commented 8 years ago

LGTM, simple and well tested. I think elsewhere (prescriptions and messaging) we went with 10 second timeout which might be a bit high, but we based that off of the SLA that those respective APIs have stated in their specs. I wonder if 3 seconds might be too short? Though I imagine this can very easily be tweaked as required.

jkassemi commented 8 years ago

Thanks @saneshark! The purpose is primarily to give us more insight into the failure and better alerts, and a 10 second timeout does that more safely than the 3 second timeout.

We'll address additional performance improvements and tweak this value when the Jobs API lives on the same network.