Closed koshdim closed 4 years ago
Hi @koshdim ,
You configured NumStreamThreads to 2, but your logs show more threads than that (at least these eight):
data-ingestion-appkstreamtestin-stream-thread-1
data-ingestion-appkstreamtestin-stream-thread-4
data-ingestion-appkstreamtestin-stream-thread-3
data-ingestion-appkstreamtestin-stream-thread-5
data-ingestion-appkstreamtestin-stream-thread-7
data-ingestion-appkstreamtestin-stream-thread-6
data-ingestion-appkstreamtestin-stream-thread-0
data-ingestion-appkstreamtestin-stream-thread-2
I also saw a rebalance just before the exception. Is that rebalance expected? Did you change the number of partitions of a topic or launch another Kafka Streams application?
Thanks.
Oh, sorry, I posted the config after I changed the threads to 2, but I was able to reproduce it with 2 as well. The number of partitions didn't change. I was experimenting with running two streams at once in the same application (different ApplicationId), and stopping/running this application multiple times. Do Kafka Streams instances conflict somehow if they run on the same physical machine?
So you have a .NET application that contains two instances of KafkaStreams, each with a unique ApplicationId. Is that correct? And you alternate stopping/running this application to test resilience.
Do you call stream.Close() on each instance when your application is stopping?
Not always; sometimes it crashed, sometimes I stopped debugging. Could there be some leftovers after the application is closed? If yes, how do I clean up?
It's probably this. I have to change the public Kafka Streams API, especially the Close method.
In production, stopping the stream properly is mandatory.
How do I close it properly if the application crashed? I assumed everything is wiped out when the relevant process dies.
Normally yes: if the process dies, all resources are freed. Some good rules:
1 - Surround your Kafka stream instance with a try/catch and call Close in the finally section.
2 - When you reproduce it, could you dump the process memory and attach it to this issue? Maybe I could analyze it and fix it.
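To illustrate rule 1, here is a minimal sketch of the try/catch/finally pattern. It deliberately does not use the real Streamiz classes: `IClosableStream`, `StandInStream` and `StreamRunner` are hypothetical stand-ins invented for this example, so that it compiles without the package. With the real library you would put your stream instance and its Close call in the same places.

```csharp
using System;

// Hypothetical stand-in for a stream instance. The real class discussed in
// this thread belongs to Streamiz; its exact API is not reproduced here.
public interface IClosableStream
{
    void Start();
    void Close();
}

public sealed class StandInStream : IClosableStream
{
    public bool Closed { get; private set; }

    public void Start() => Console.WriteLine("stream started");

    public void Close()
    {
        Closed = true;
        Console.WriteLine("stream closed");
    }
}

public static class StreamRunner
{
    // Starts the stream, runs the work, and calls Close() in the finally
    // block so it executes whether or not an exception was thrown.
    public static void RunGuarded(IClosableStream stream, Action work)
    {
        try
        {
            stream.Start();
            work();
        }
        catch (Exception e)
        {
            Console.Error.WriteLine($"stream failed: {e.Message}");
            throw; // rethrow so the caller can decide what to do
        }
        finally
        {
            stream.Close();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var stream = new StandInStream();
        try
        {
            // Simulate a failure while the stream is running.
            StreamRunner.RunGuarded(stream,
                () => throw new InvalidOperationException("simulated failure"));
        }
        catch (InvalidOperationException)
        {
            // Already logged; the finally block has closed the stream.
        }
        Console.WriteLine($"closed = {stream.Closed}");
    }
}
```

Keep in mind that a finally block only helps for failures the runtime can unwind. If the process is killed or the debugger is stopped, no cleanup code runs at all.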
hi @LGouellec,
I tried to reproduce it again and got several other exceptions. I'm sorry, but I will stop investigating Streams and try to achieve what I need with Kafka transactions. In case it is interesting for you, here is some info about the exceptions:
a)
Streamiz.Kafka.Net.Errors.StreamsException
HResult=0x80131500
Message=Collection was modified; enumeration operation may not execute.
Source=Streamiz.Kafka.Net
StackTrace:
at Streamiz.Kafka.Net.Processors.StreamThread.Run()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
This exception was originally thrown at this call stack:
[External Code]
Inner Exception 1: InvalidOperationException: Collection was modified; enumeration operation may not execute.
b)
Streamiz.Kafka.Net.Errors.StreamsException
HResult=0x80131500
Message=Operation not valid in state FatalError
Source=Streamiz.Kafka.Net
StackTrace:
at Streamiz.Kafka.Net.Processors.StreamThread.Run()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
This exception was originally thrown at this call stack: [External Code]
Inner Exception 1: KafkaException: Operation not valid in state FatalError
and the debug log of case b):
16337 [27] DEBUG Streamiz.Kafka.Net.Kafka.Internal.KafkaLoggerAdapter - Log producer data-ingestion-appkstreamtestin-stream-thread-0-0-0-producer#producer-5 - [thrd:main]: TxnCoordinator/3: Failed to add partition "KStreamTestOut" [2] to transaction: Broker: Producer attempted an operation with an old epoch
16339 [27] DEBUG Streamiz.Kafka.Net.Kafka.Internal.KafkaLoggerAdapter - Log producer data-ingestion-appkstreamtestin-stream-thread-0-0-0-producer#producer-5 - [thrd:main]: Fatal transaction error: Failed to add partitions to transaction: Broker: Producer attempted an operation with an old epoch (INVALID_PRODUCER_EPOCH)
16348 [27] DEBUG Streamiz.Kafka.Net.Kafka.Internal.KafkaLoggerAdapter - Log producer data-ingestion-appkstreamtestin-stream-thread-0-0-0-producer#producer-5 - [thrd:main]: Fatal error: Broker: Producer attempted an operation with an old epoch: Failed to add partitions to transaction: Broker: Producer attempted an operation with an old epoch
16352 [25] ERROR Streamiz.Kafka.Net.Kafka.Internal.KafkaLoggerAdapter - Error producer data-ingestion-appkstreamtestin-stream-thread-0-0-0-producer#producer-5 - Failed to add partitions to transaction: Broker: Producer attempted an operation with an old epoch
16355 [data-ingestion-appkstreamtestin-stream-thread-0] ERROR Streamiz.Kafka.Net.Processors.StreamThread - stream-thread[data-ingestion-appkstreamtestin-stream-thread-0] Encountered the following unexpected Kafka exception during processing, tis usually indicate Streams internal errors:
Confluent.Kafka.KafkaException: Operation not valid in state FatalError
at Confluent.Kafka.Impl.SafeKafkaHandle.SendOffsetsToTransaction(IEnumerable`1 offsets, IConsumerGroupMetadata groupMetadata, Int32 millisecondsTimeout)
at Confluent.Kafka.Producer`2.SendOffsetsToTransaction(IEnumerable`1 offsets, IConsumerGroupMetadata groupMetadata, TimeSpan timeout)
at Streamiz.Kafka.Net.Processors.StreamTask.Commit(Boolean startNewTransaction)
at Streamiz.Kafka.Net.Processors.StreamTask.Commit()
at Streamiz.Kafka.Net.Processors.StreamThread.Run()
16408 [data-ingestion-appkstreamtestin-stream-thread-0] INFO Streamiz.Kafka.Net.Processors.StreamThread - stream-thread[data-ingestion-appkstreamtestin-stream-thread-0] Shutting down
16409 [data-ingestion-appkstreamtestin-stream-thread-0] INFO Streamiz.Kafka.Net.Processors.StreamThread - stream-thread[data-ingestion-appkstreamtestin-stream-thread-0] State transition from RUNNING to PENDING_SHUTDOWN
16411 [data-ingestion-appkstreamtestin-stream-thread-0] INFO Streamiz.Kafka.Net.Processors.StreamTask - stream-task[0|0] Closing
16411 [data-ingestion-appkstreamtestin-stream-thread-0] DEBUG Streamiz.Kafka.Net.Processors.StreamTask - stream-task[0|0] Suspending
16411 [data-ingestion-appkstreamtestin-stream-thread-0] DEBUG Streamiz.Kafka.Net.Processors.StreamTask - stream-task[0|0] Comitting
16412 [data-ingestion-appkstreamtestin-stream-thread-0] DEBUG Streamiz.Kafka.Net.Processors.Internal.ProcessorStateManager - stream-task[0|0] Flushing all stores registered in the state manager
16412 [data-ingestion-appkstreamtestin-stream-thread-0] DEBUG Streamiz.Kafka.Net.Kafka.Internal.RecordCollector - stream-task[0|0] Flusing producer
16415 [data-ingestion-appkstreamtestin-stream-thread-0] DEBUG Streamiz.Kafka.Net.Processors.Internal.RecordQueue - stream-task[0|0] - recordQueue [record-queue-KStreamTestIn-0-0] cleared !
16415 [data-ingestion-appkstreamtestin-stream-thread-0] ERROR Streamiz.Kafka.Net.Processors.StreamThread - stream-thread[data-ingestion-appkstreamtestin-stream-thread-0] Failed to close stream thread due to the following error:
Confluent.Kafka.KafkaException: Operation not valid in state FatalError
at Confluent.Kafka.Impl.SafeKafkaHandle.AbortTransaction(Int32 millisecondsTimeout)
at Confluent.Kafka.Producer`2.AbortTransaction(TimeSpan timeout)
at Streamiz.Kafka.Net.Processors.StreamTask.Suspend()
at Streamiz.Kafka.Net.Processors.StreamTask.Close()
at Streamiz.Kafka.Net.Processors.Internal.TaskManager.Close()
at Streamiz.Kafka.Net.Processors.StreamThread.Close(Boolean cleanUp)
Description
I created an application to try your package, and from time to time I get this exception: Streamiz.Kafka.Net.Errors.StreamsException: 'Operation not valid in state Ready'. Here is the log that might be helpful:
How to reproduce
I don't know how to reproduce it, but it occurs fairly regularly. My config:
Checklist
Please provide the following information: