gearman / gearmand

http://gearman.org/

Gearman doesn't check for a broken postgres persistence DB connection #390

Open randallhammond-whs opened 2 months ago

randallhammond-whs commented 2 months ago

We have gearman and a postgres persistence DB running in different pods in Kubernetes. If the postgres DB gets restarted for some reason, gearman's DB connection breaks. From that point on, gearman just logs "PQExec:no connection to server" for all background jobs. It doesn't seem to check the connection and reconnect when it is broken. The only fix is to restart gearman, which creates a new connection, but this is not ideal if there are other jobs in the internal queue.
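The behavior being asked for (check the connection after a failed query and reconnect if it is broken) could look roughly like the sketch below. This is not gearmand's actual code. `DbConn` and `exec_with_reconnect` are hypothetical names invented for illustration, and the connection is abstracted behind callbacks so the retry logic stands on its own. With libpq, `is_ok` would presumably wrap `PQstatus(conn) == CONNECTION_OK`, `reset` would wrap `PQreset(conn)`, and `exec` would wrap `PQexec()` plus a `PQresultStatus()` check.

```cpp
#include <functional>
#include <string>

// Hypothetical minimal view of a database connection (not a gearmand type).
// Each callback is assumed to wrap the corresponding libpq call noted above.
struct DbConn {
    std::function<bool()> is_ok;                     // is the connection healthy?
    std::function<void()> reset;                     // try to re-establish it
    std::function<bool(const std::string&)> exec;    // run one statement
};

// Run one statement. If it fails and the connection is reported bad,
// reset the connection and retry, up to max_retries times.
// Returns true only if the statement eventually succeeds.
bool exec_with_reconnect(DbConn& db, const std::string& sql, int max_retries = 1) {
    for (int attempt = 0; ; ++attempt) {
        if (db.exec(sql)) {
            return true;
        }
        if (db.is_ok()) {
            return false;  // connection is fine, so the failure has another cause
        }
        if (attempt >= max_retries) {
            return false;  // still disconnected after all retries
        }
        db.reset();
    }
}
```

A caller would route every persistence query through `exec_with_reconnect` instead of calling the query function directly, so a restarted database costs one failed attempt rather than every subsequent background job.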

esabol commented 2 months ago

Hi, @randallhammond-whs. If you want to contribute a PR that would make it attempt to reconnect, I'm sure we'd be happy to accept it.

As you may know, the persistent storage support in gearmand has been quasi-deprecated in favor of a design pattern where job persistence is implemented by workers. It scales better and is generally more robust. (There are two frameworks for implementing such a system: one is called Gearstore and the other is called Garavini. You might want to look into them. It's also straightforward to implement your own persistent storage tasks once you understand the design pattern.) I say "quasi-deprecated" because the persistent storage layers are not officially deprecated and we still accept PRs for maintaining them, but we've stopped active development on them.

randallhammond-whs commented 2 months ago

I did not know this. I'll check out the links. If I can find the time, I'll try to post a PR.