nicktacular / php-mongo-session

A PHP session handler with a MongoDB backend.
MIT License

uncaught connection exception #20

Open rocksfrow opened 9 years ago

rocksfrow commented 9 years ago

So I have a simple replica set -- the typical minimal 3-node setup (2 data nodes + arbiter). I am getting fatal errors for 15-30 seconds after updates or manual failovers, I assume during the election/re-election process.

Here is the stack trace.

[13-Feb-2015 03:29:28 UTC] PHP Fatal error:  Uncaught exception 'MongoConnectionException' with message 'No candidate servers found' in /usr/share/php-share/php/php-mongo-session/MongoSession.php:204
#0 /usr/share/php-share/php/php-mongo-session/MongoSession.php(204): MongoClient->__construct('mongodb://10.10...', Array)
#1 /usr/share/php-share/php/php-mongo-session/MongoSession.php(149): MongoSession->__construct()
#2 /usr/share/php-share/php/php-mongo-session/MongoSession.php(165): MongoSession::instance()
#3 /usr/share/php-share/php/php-mongo-session/init.php(63): MongoSession::init()
  thrown in /usr/share/php-share/php/php-mongo-session/MongoSession.php on line 204

So basically this occurs for a 15-30 second period whenever my stack is resetting after an upgrade, or after a manual failover of the master before an upgrade. I think this is just the delay while the replica set elects/re-elects the master.

In either case -- this line is the culprit:

        $this->conn->connect();

We aren't doing any exception catching on this connection attempt at all -- so during this period all of my sites end up with a blank page/fatal error.

@nicktacular do you experience downtime of any sort like this when performing minor updates? (not talking about upgrades)

I am thinking that in this scenario we would push out the configured PHP timeout by X seconds, then wait X seconds and retry the connection attempt (in a catch block wrapped around the connection attempt).
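A rough sketch of that retry idea, not the library's actual code. `connectWithRetry()`, `$maxAttempts` and `$delaySeconds` are hypothetical names made up for illustration. The helper takes any callable so it can be tried without a live cluster; inside MongoSession the callable would presumably wrap the MongoClient construction, and the catch could be narrowed to MongoConnectionException.

```php
<?php
/**
 * Hypothetical helper: run a connection attempt, and on failure wait and
 * retry a limited number of times before giving up.
 *
 * @param callable $connect      Performs the connection and returns the client.
 * @param int      $maxAttempts  Total attempts, including the first one.
 * @param int      $delaySeconds Seconds to wait between attempts (the "X" above).
 * @return mixed Whatever $connect returns on success.
 * @throws Exception The last failure, if every attempt fails.
 */
function connectWithRetry(callable $connect, $maxAttempts = 5, $delaySeconds = 2)
{
    $maxAttempts = max(1, (int) $maxAttempts);
    $delaySeconds = max(0, (int) $delaySeconds);

    // Read the configured limit once; 0 means "no limit" (the CLI default).
    $baseLimit = (int) ini_get('max_execution_time');
    $lastError = null;

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            return $connect();
        } catch (Exception $e) {
            // Assumption: in MongoSession this could catch only
            // MongoConnectionException instead of every Exception.
            $lastError = $e;

            if ($attempt < $maxAttempts) {
                if ($baseLimit > 0) {
                    // set_time_limit() restarts the timer, so this gives the
                    // script a fresh budget of the base limit plus the delay.
                    set_time_limit($baseLimit + $delaySeconds);
                }
                sleep($delaySeconds);
            }
        }
    }

    // Every attempt failed: surface the last error to the caller.
    throw $lastError;
}

// Possible use inside the class ($connectionString and $options are
// placeholders for whatever the handler is configured with):
//
// $this->conn = connectWithRetry(function () use ($connectionString, $options) {
//     return new MongoClient($connectionString, $options);
// }, 10, 3);
```

With 10 attempts and a 3 second delay this would wait up to 27 seconds in total, which roughly covers the 15-30 second election window. If the cluster is still unavailable after that, the exception is rethrown, so the caller still has to decide what to show the user.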

@nicktacular curious about your thoughts/suggested solutions -- but I'll probably do something locally and test it, because I have a minor update to do on my cluster, so it will be a good test. Please let me know your practice/experience with installing updates on your cluster.

rocksfrow commented 9 years ago

I could catch this exception in my init file -- but we probably want to catch it within the class itself to try and prevent failed requests during updates.