stripe-archive / mosql

MongoDB → PostgreSQL streaming replication
MIT License

Updating to latest Mongo driver #107

Open bikashmishra opened 8 years ago

bikashmishra commented 8 years ago

First off, thank you for the great tool. This is not an issue per se, but I was wondering if there are any plans to support the latest mongo driver. The reason I ask is that 'readPreference=secondary' seems to have no effect on our DB with the current 1.12.3 driver (although it should following 1.8.3), and there is very little information or debugging available for the older version.

Also on that note, has anyone been able to use the option successfully or otherwise with mosql? Would greatly appreciate learning about any experience.

nelhage commented 8 years ago

We've definitely used that with success on 1.8.3. Have you verified that with 1.12.3 you're still hitting the primary (via tcpdump or db.currentOp() or whatever…)?

bikashmishra commented 8 years ago

Thanks Nelson. We verified that we are reading off of primary with the current setup. Since 1.8.3 is successfully tested, would mosql have any issues if we changed the dependency and reverted to the older version?

nelhage commented 8 years ago

Have you patched it in your environment and verified that it works on 1.8 for you?

bikashmishra commented 8 years ago

Been off for a while, but have had a chance to dig deeper since. The versions being used are: mongo-driver (1.12.3), mosql (0.4.3), MongoDB 3.0.

I started off by testing the mongo-driver from a standalone script. That worked fine, i.e. it read from the secondary. Next I went deeper into the MoSQL code. Things are as expected until the function initial_import is called in streamer.rb. Before the line

collections = db.collections.select { |c| spec.key?(c.name) }

the client instance is secondary (checked config via @mongo['admin'].command(:ismaster => 1)). However, after this line is executed, the instance switches to primary.
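For anyone repeating this check, here is a minimal sketch of how the ismaster reply could be interpreted. It assumes the reply is the Hash-like document returned by the command quoted above, with the standard 'ismaster' and 'secondary' boolean fields. node_role is a hypothetical helper name, not something that exists in MoSQL or the driver, and the replies below are stubs so the snippet runs without a database.

```ruby
# Classify an ismaster reply. `reply` is assumed to be the Hash-like document
# returned by something like @mongo['admin'].command(:ismaster => 1).
# node_role is a hypothetical helper, not part of MoSQL or the mongo driver.
def node_role(reply)
  return :primary   if reply['ismaster']
  return :secondary if reply['secondary']
  :other # e.g. an arbiter or a member that is still recovering
end

# Stubbed replies so the sketch runs without a live replica set.
primary_reply   = { 'ismaster' => true,  'secondary' => false }
secondary_reply = { 'ismaster' => false, 'secondary' => true }

puts node_role(primary_reply)
puts node_role(secondary_reply)
```

Calling this once before and once after the db.collections line would show whether the connection has moved from a secondary to the primary.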

bikashmishra commented 8 years ago

Update on the issue. This is what we found out:

The db.collections command results in a listCollections call. This command is only allowed to be run on the primary, so the mongo driver routes it to the primary. This happens in both the old and the new drivers. However, on 1.12.3 the driver stays stuck on the primary and keeps reading from there. For the 1.12.x drivers: if no listCollections or similar commands are executed on the connection before making a read query, the driver correctly reads from the secondary.
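To make the sequence concrete, here is a toy model of the behaviour described above. It is not driver code and does not talk to MongoDB. ToyClient, list_collections, find and the sticky_bug flag are all hypothetical names, and the model only encodes what this comment reports: a primary-only command followed by a read that should go to a secondary.

```ruby
# Toy model of the routing behaviour described in this thread.
# This is NOT driver code; ToyClient and its methods are hypothetical names
# used only for illustration.
class ToyClient
  attr_reader :log

  def initialize(sticky_bug:)
    @sticky_bug = sticky_bug     # true models the behaviour reported on 1.12.3
    @pinned_to_primary = false
    @log = []
  end

  # Models db.collections: always routed to the primary.
  def list_collections
    @pinned_to_primary = true if @sticky_bug
    @log << [:list_collections, :primary]
  end

  # Models a read issued with readPreference=secondary.
  def find
    target = @pinned_to_primary ? :primary : :secondary
    @log << [:find, target]
    target
  end
end

buggy = ToyClient.new(sticky_bug: true)
buggy.list_collections
puts buggy.find   # primary

fixed = ToyClient.new(sticky_bug: false)
fixed.list_collections
puts fixed.find   # secondary
```

The model also matches the last observation: if list_collections is never called, find goes to the secondary even with sticky_bug set.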

So it would seem to be a mongo driver issue. This issue has been fixed in the latest versions. So I see 2 ways to resolve this:

a) update mosql to use the latest drivers
b) use an older driver with mosql (unfortunately not an option if using Mongo 3.0, as 1.12 is the earliest driver supporting 3.0)

nelhage commented 8 years ago

Ugh, that's exciting. Thanks for digging into this. I probably don't have time to get things working on 2.x, so it sounds like rolling back is probably the best bet.

ptrikutam commented 8 years ago

@bikashmishra did you figure out a workaround? Or did you just end up rolling back? I've got a related issue with my project, which uses mosql.

bikashmishra commented 8 years ago

@ptrikutam No workaround yet. We are living on the edge by forcing reads from a secondary (i.e. if the secondary switches or goes down, we have to manually change things). Rolling back is not an option if you are using Mongo 3.0, as the older drivers are not compatible.
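One way the manual step might be reduced is to work out which members are currently not the primary from an ismaster reply. This is only a sketch: it assumes the reply comes from a replica-set member and carries the standard 'hosts' and 'primary' fields. secondary_candidates is a hypothetical helper, and the host names are made up. A host it returns is only a candidate, since a non-primary member may still be recovering or unreachable.

```ruby
# secondary_candidates is a hypothetical helper, not part of MoSQL or the
# driver. `reply` is assumed to be an ismaster reply from a replica-set
# member, which lists members under 'hosts' and the current primary under
# 'primary'.
def secondary_candidates(reply)
  hosts = reply['hosts'] || []
  hosts - [reply['primary']]
end

# Stubbed reply with made-up host names so the sketch runs without a database.
reply = {
  'setName' => 'rs0',
  'hosts'   => ['db1.example:27017', 'db2.example:27017', 'db3.example:27017'],
  'primary' => 'db2.example:27017'
}

p secondary_candidates(reply)
```

Each candidate would still need its own ismaster check before a direct connection is pointed at it.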

ptrikutam commented 8 years ago

Got it, thanks.

johnnason commented 6 years ago

This got a little tangled, but I do have a PR out for mongoriver (https://github.com/stripe/mongoriver/pull/19) for the base driver work, and https://github.com/stripe/mosql/pull/132 for the adoption. Travis will continue to fail because the new .5 version of mongoriver would need to be merged and pushed to rubygems. All specs and functional testing of replication has been working locally.