Closed rdmulford closed 2 years ago
hmm, seems somewhat related to https://github.com/chdsbd/kodiak/issues/694
maybe an issue with the timeout set in the redis host?
Thanks @sbdchd. I did look at #694 in our initial investigation. The resolution there seemed to be to set the timeout to 0, which we confirmed is already the timeout setting on our Redis instance.
Given that this is specifically happening to us between 0.48 and 0.49, we suspect this is something introduced on the Kodiak side in one of these commits: https://github.com/chdsbd/kodiak/compare/v0.48.0...v0.49.0
@rdmulford Looking at the diff you linked to, I don't see any Redis-related changes.
EOF errors aren't something we've encountered on the hosted Kodiak GitHub App. My guess is that the connections between your Kodiak container and Redis are getting dropped, either by Redis timeouts or by something in front of Redis.
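To illustrate the kind of drop described above, here is a minimal sketch using only the Python standard library. It does not use Kodiak's code or a real Redis server: a local TCP server stands in for whatever sits in front of Redis and closes the connection after a short idle period. The function name `simulate_idle_drop` and the timeout values are hypothetical.

```python
import socket
import threading
import time


def simulate_idle_drop(idle_timeout: float = 0.2) -> bytes:
    """Return what a client reads after the peer closes an idle connection."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def drop_after_idle() -> None:
        conn, _ = server.accept()
        time.sleep(idle_timeout)  # stand-in for an idle timeout
        conn.close()  # close without ever sending a reply

    worker = threading.Thread(target=drop_after_idle, daemon=True)
    worker.start()

    client = socket.create_connection(("127.0.0.1", port))
    try:
        client.settimeout(5.0)
        # Reading from a connection the peer has closed yields b"",
        # which client libraries typically report as an EOF error.
        data = client.recv(1024)
    finally:
        client.close()
        worker.join()
        server.close()
    return data
```

An empty read here is the socket-level equivalent of the EOF seen in the logs: the client did nothing wrong, the other end simply went away while the connection was idle.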
Thanks @chdsbd. Digging in deeper, it looks like the issue is related to how the client connects to the load balancer in front of our Redis instance: connecting directly to the Redis instance IP makes the issue go away.
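For anyone wanting to reproduce the comparison between the load balancer and the direct instance IP, the following is a rough diagnostic sketch, not something taken from Kodiak. It assumes a plain TCP (non-TLS) endpoint, and the function name `survives_idle` is hypothetical. It opens a connection, leaves it idle, then sends a Redis inline `PING` and reports whether any reply comes back.

```python
import socket
import time


def survives_idle(host: str, port: int, idle_seconds: float) -> bool:
    """Return True if a connection still answers after sitting idle."""
    sock = socket.create_connection((host, port), timeout=5.0)
    try:
        time.sleep(idle_seconds)  # leave the connection unused
        # Redis accepts inline commands terminated by CRLF.
        sock.sendall(b"PING\r\n")
        reply = sock.recv(64)
    except OSError:
        # Reset, broken pipe, or timeout: the connection was dropped.
        return False
    finally:
        sock.close()
    # An empty read means the peer closed the connection (EOF).
    # Any reply, including an auth error, shows the connection survived.
    return reply != b""
```

Running it against both endpoints with an idle period a little longer than the observed 15-minute interval, for example `survives_idle("<load balancer host>", 6379, 960)` and then the same call with the instance IP, should show whether only the load-balanced path drops idle connections. The port and the idle period here are assumptions to adjust for your setup.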
After updating our self-hosted Kodiak instance to release 0.49, we've been seeing unusual log messages like:
which seem to spike every 15 minutes. The following graph shows log counts matching these errors:
We did not see these errors in version 0.48. We haven't changed any settings or configs in our Redis instance, and this is what our timeout settings look like:
The app still seems to work (it's able to merge pull requests, etc.), but I'm cautious about deploying this to our prod instance until we understand what this error is. Any help understanding what is happening here would be greatly appreciated!