Open randallhammond-whs opened 2 months ago
Hi, @randallhammond-whs. If you want to contribute a PR that would make it attempt to reconnect, I'm sure we'd be happy to accept it.
As you may know, the persistent storage support in gearmand has been quasi-deprecated in favor of a design pattern where job persistence is implemented by workers. It scales better and is generally more robust. (There are two frameworks for implementing such a system: one is called Gearstore and the other is called Garavini. You might want to look into them. It's also straightforward to implement your own persistent storage tasks once you understand the design pattern.) I say "quasi-deprecated" because the persistent storage layers are not officially deprecated and we still accept PRs for maintaining them, but we've stopped active development on them.
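For readers unfamiliar with the idea, here is a rough sketch of what worker-side persistence could look like. It is only one plausible reading of the pattern, not code from Gearstore or Garavini: the names `open_store`, `store_job`, and `run_stored_jobs` are hypothetical, SQLite stands in for whatever durable store you would actually use, and the Gearman client/worker calls are replaced by plain function calls so the example stays self-contained.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the durable job table. SQLite is an assumption here."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "function TEXT NOT NULL, payload BLOB NOT NULL)"
    )
    db.commit()
    return db


def store_job(db, function, payload):
    """'Store' worker: persist the job first, then acknowledge with its id."""
    cur = db.execute(
        "INSERT INTO jobs (function, payload) VALUES (?, ?)", (function, payload)
    )
    db.commit()
    return cur.lastrowid


def run_stored_jobs(db, handlers):
    """'Runner': execute each stored job and delete it only after success."""
    done = []
    rows = db.execute(
        "SELECT id, function, payload FROM jobs ORDER BY id"
    ).fetchall()
    for job_id, function, payload in rows:
        try:
            result = handlers[function](payload)
        except Exception:
            continue  # leave the row in place so a later pass can retry it
        db.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
        db.commit()
        done.append(result)
    return done
```

The point of the sketch is the ordering: a job is written to durable storage before it is acknowledged, and it is removed only once the real work has succeeded, so a restart of the job server does not lose it.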
I did not know this. I'll check out the links. If I can find the time, I'll try to post a PR.
We have gearman and a Postgres persistence DB running in different pods in Kubernetes. If the Postgres DB gets restarted for some reason, gearman's DB connection is broken. From that point on, gearman just logs "PQExec:no connection to server" for all background jobs. It doesn't seem to check the connection and reconnect when it is broken. The only way to fix this is to restart gearman, which creates a new connection, but that is not ideal if there are other jobs in the internal queue.
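To illustrate the behavior being asked for, here is a minimal sketch of "reconnect and retry when the connection is lost". It is not gearmand code: `ReconnectingStore`, `ConnectionLost`, and the `connect` callable are hypothetical names, and the sketch is in Python only for brevity. In gearmand itself the equivalent check would presumably go through libpq, which provides `PQstatus` to inspect a connection and `PQreset` to re-establish it.

```python
import time


class ConnectionLost(Exception):
    """Raised by the (hypothetical) driver when the server is unreachable."""


class ReconnectingStore:
    """Wraps a connection factory and reconnects when a query fails."""

    def __init__(self, connect, max_attempts=3, delay=0.0):
        self._connect = connect                    # callable returning a new connection
        self._max_attempts = max(1, max_attempts)  # always try at least once
        self._delay = delay                        # seconds to wait before reconnecting
        self._conn = connect()

    def execute(self, query):
        last_error = None
        for _ in range(self._max_attempts):
            try:
                return self._conn.execute(query)
            except ConnectionLost as exc:
                last_error = exc
                time.sleep(self._delay)
                try:
                    # Replace the broken connection before the next attempt.
                    self._conn = self._connect()
                except ConnectionLost as reconnect_exc:
                    last_error = reconnect_exc
        raise last_error
```

A retry like this is only safe when the failed statement is known not to have been applied, so a real implementation would need to decide which queue operations may be repeated.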