kvigen closed this 10 years ago
Assigned to @jefff
Is there some benefit to batching this way vs accumulating a set # of operations? This seems more complex.
I did it because I thought it would simulate our test workloads a bit better. An admittedly contrived example: the oplog replay sees 20 requests / second and the batch size is 20. Accumulating all of them at once means you replay all 20 operations once a second rather than 1 operation every 50ms. In other words, you only batch when you're bottlenecked on the round trip time.
Not sure if that's an important goal.
That makes a lot of sense. Can we clarify that it works that way in the comments? Specifically, this was the part I didn't realize: "you only batch when you're bottlenecked on the round trip time."
(now I'll step out and leave this to @jefff to review)
Yes, good call. I'll add some more detail to the comments.
lgtm.
Before this change, replaying the oplog maxed out at ~150 requests per second. We think the bottleneck was the round trip time between the database and the oplog-replay script. To address that, this change adds support for batching oplog requests.
Conceptually we still have one goroutine responsible for putting the ops in the "channel" to be processed, but now the goroutine that actually calls mongo applies everything currently in the queue at once.
It should be noted that oplog requests will only be batched if the replay is bottlenecked on round trip time to the db. Otherwise each request is sent as soon as it's due.
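For illustration, here's a minimal sketch of that pattern (not the actual replay code; `Op`, `applyBatch`, and the buffer size are hypothetical stand-ins): the apply goroutine blocks for the next op, then non-blockingly drains whatever else has already queued up, so batches only form when ops arrive faster than they can be applied.

```go
package main

import (
	"fmt"
	"time"
)

// Op is a hypothetical stand-in for a single oplog operation.
type Op struct{ ID int }

// applyBatch is a hypothetical stand-in for the call that sends a batch of
// ops to mongo in a single round trip.
func applyBatch(batch []Op) {
	fmt.Printf("applying %d op(s)\n", len(batch))
	time.Sleep(10 * time.Millisecond) // simulate the round trip
}

// applyLoop blocks for the next op, then non-blockingly drains anything else
// already queued. A batch larger than one only forms when ops arrive faster
// than they can be applied, i.e. when we're bottlenecked on round trip time.
func applyLoop(ops <-chan Op, done chan<- struct{}) {
	defer close(done)
	for op := range ops {
		batch := []Op{op}
	drain:
		for {
			select {
			case next, ok := <-ops:
				if !ok {
					break drain // channel closed and empty
				}
				batch = append(batch, next)
			default:
				break drain // nothing else queued right now
			}
		}
		applyBatch(batch)
	}
}

func main() {
	ops := make(chan Op, 100)
	done := make(chan struct{})
	go applyLoop(ops, done)

	// Queue ops faster than they can be applied so batching kicks in.
	for i := 0; i < 50; i++ {
		ops <- Op{ID: i}
	}
	close(ops)
	<-done
}
```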