criteo / kafka-sharp

A C# Kafka driver
Apache License 2.0
110 stars 44 forks source link

Race condition in Accumulator<>? #9

Closed FrancoisBeaune closed 7 years ago

FrancoisBeaune commented 7 years ago

We're seeing occasional crashes in Accumulator.SignalNewBatch() which is called by Accumulator.Tick() when a timer fires.

The crash takes the form of a NullReferenceException on _timer.Dispose() (but _timer itself isn't null).

In case that's helpful:

Edit: timeWindow to 0 ms is probably a very bad idea as that will consumes all resources of one core, but still, it shouldn't crash, obviously.

sdanzan commented 7 years ago

Yes, setting a zero time window won't work. I agree it shouldn't crash at Dispose time though, I will correct that (and yes, it's a bad idea anyway).

What semantic are you trying to achieve by setting the time window to zero ?

FrancoisBeaune commented 7 years ago

Thanks for the quick reply. I'm trying to speed up the consumption of messages at random offsets (as described in https://github.com/criteo/kafka-sharp/issues/7) so I'm playing with settings.

Obviously setting timeWindow to 0 is a bad idea, lesson learned :)

sdanzan commented 7 years ago

You can try setting the ConsumeBatchSize to 1 and ConsumeBufferingTime to a high value (or -1/Infinite) to do that. However it's probably not a good idea if you're reading multiple partitions and/or multiple topics from one ClusterClient (it will only really speed up the sending of the first Consume message in that case).

FrancoisBeaune commented 7 years ago

Thanks for the tips, will try that! We're actually dedicating one ClusterClient per topic (and we only have one partition... our use case is a bit... particular. Will be happy to elaborate if you're interested).

FrancoisBeaune commented 7 years ago

Setting ConsumeBatchSize to 1 (disable batching) and ConsumeBufferingTime to TimeSpan.FromMilliseconds(-1) works very well, thanks for the tip.

Performances are still very much dependent on FetchMessageMaxBytes unfortunately. Investigating...

sdanzan commented 7 years ago

Should be fixed in ff8b3062589a722eabca1b9a10c2d77962347abc