Problem
When a user clicks Refresh on My Roles, a 403 response comes back after about 30 seconds and the session is dead. To continue, the user has to log back in. Reloading the My Roles page then shows the expected updates.
This does not happen in QA, where the response is considerably faster, but QA does not have the same volume of traffic. To construct the response, we make up to 3 requests to the GitHub API for each repo the user has access to. Those requests are made by the web servers, which are also handling all inbound user requests.
Solution
After a good bit of digging around, it turns out that nginx kills the session if the upstream does not respond within 30 seconds. Increased the timeout from 30 to 300 seconds. The response is still slow, but at least it no longer times out and kills the session.
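For reference, a minimal sketch of what the timeout change might look like, assuming nginx proxies to the app over HTTP and that the limit in question is proxy_read_timeout. The ticket does not name the directive, so the directive, the location, and the upstream name app_upstream are all assumptions here; a uwsgi or FastCGI upstream would use uwsgi_read_timeout or fastcgi_read_timeout instead.

```nginx
# Hypothetical location and upstream name; the real config is not shown in this ticket.
location / {
    proxy_pass http://app_upstream;

    # Maximum time nginx waits between two successive reads from the upstream.
    # Assumed to have been 30s before this change.
    proxy_read_timeout 300s;
}
```

Scoping the longer timeout to the location that serves the My Roles refresh, rather than setting it server-wide, would keep other requests on the shorter limit.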