akka / alpakka-kafka

Alpakka Kafka connector - Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.
https://doc.akka.io/libraries/alpakka-kafka/current/
Other
1.42k stars 386 forks source link

akka.kafka.consumer.stop-timeout is used as delay in shutdown scheduler call, causing timeouts on shutdown #977

Closed cSNiHab closed 5 years ago

cSNiHab commented 5 years ago

Versions used

com.typesafe.akka akka-stream-kafka 1.1.0

Issue

When calling akka.kafka.scaladsl.Consumer.Control#shutdown the system eventually tries to stop the internal consumerActor in akka.kafka.internal.SingleSourceLogic#stopConsumerActor by sending a KafkaConsumerActor.Internal.Stop message.

This is done by invoking akka.stream.Materializer#scheduleOnce as follows:

protected def stopConsumerActor(): Unit =
    materializer.scheduleOnce(settings.stopTimeout, new Runnable {
      override def run(): Unit =
        consumerActor.tell(KafkaConsumerActor.Internal.Stop, sourceActor.ref)
    })

However, scheduleOnce actually takes a delay as first parameter:

def scheduleOnce(delay: FiniteDuration, task: Runnable): Cancellable

Meaning any akka.kafka.consumer.stop-timeout greater than zero (the default is 30) will actually result in delaying the consumer shutdown by the configured timeout (which ironically causes a timeout)

cSNiHab commented 5 years ago

After thinking about it some more, I realized that this might actually be intentional to delay the shutdown in a simple delay-based shutdown scenario.

However, I feel like this conflicts with the controlled shutdown documentation at https://doc.akka.io/docs/alpakka-kafka/current/consumer.html#controlled-shutdown

The controlled-shutdown documentation should mention that the stop-timeout property should be set to 0 to avoid an unnecessary 30 second delay at step 3 even when not using the Draining Control 3. Consumer.Control.shutdown() to wait for all outstanding commit requests to finish and stop the Kafka Consumer.

I'll close this issue as this is not a bug

seglo commented 5 years ago

Your realization is correct, although I think it does make sense to re-evaluate the usefulness of the stop-timeout configuration setting (or at least its default of 30s), especially now that the draining control exists.

We would welcome an update to the docs to make them more clear.