apache / pulsar-dotpulsar

The official .NET client library for Apache Pulsar
https://pulsar.apache.org/
Apache License 2.0

SendChannel.Completion causes the producer to become Faulted #209

Closed libzlibz closed 6 months ago

libzlibz commented 7 months ago

Description

SendChannel.Completion causes the producer to become Faulted

Reproduction Steps

// ---Thread1---
// while(true) {
//   a:producer.Send a batch of data
//   a:await producer.SendChannel.Completion
//     ->a:await _sendQueue.WaitForEmpty
//     ->a:tcs = new TaskCompletionSource();
//     ->a:_queueEmptyTcs.Add(tcs);
//     ->a:await tcs.Task
//
// ---Thread2---
// SubProducer::ProcessReceipt
//   ->_sendQueue.Dequeue();
//   ->if (_queue.Count == 0)
//   ->NotifyQueueEmptyAwaiters();
//   ->1:foreach (var tcs in _queueEmptyTcs)
//   ->1:  tcs.TrySetResult(0);
//
// Return to running the statement following: await tcs.Task
//   b:producer.Send
//   b:await producer.SendChannel.Completion
//     ->b:_queueEmptyTcs.Add(tcs); (Modified _queueEmptyTcs)
//     ->b:await tcs.Task
//
// ---Thread2---
//   ->2:foreach (var tcs in _queueEmptyTcs) // While still inside the foreach, _queueEmptyTcs has been modified, which causes the problem
//   Throwing exception: Collection was modified; enumeration operation may not execute
//   producer set to Faulted
// }

// TrySetResult completes tcs.Task, so the code awaiting it in Thread1 continues to run; step b then calls _queueEmptyTcs.Add(tcs), which modifies _queueEmptyTcs and breaks the foreach in Thread2.
// Additionally, a:await tcs.Task was found to return right after 1:tcs.TrySetResult(0), and the code following (await tcs.Task) runs on Thread2, so the problem occurs every time NotifyQueueEmptyAwaiters is entered.
// Adding lock (_queue) to NotifyQueueEmptyAwaiters() does not help: in testing, everything ran on the same thread, so the lock did not block and the error was still reported.
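The sequence above can be sketched as a small standalone C# program. This is a minimal illustration under stated assumptions, not DotPulsar code: `QueueEmptyAwaitersSketch`, `Awaiters`, `WaitTwice`, and `NotifyThrows` are hypothetical names, and `Awaiters` only stands in for the `_queueEmptyTcs` list mentioned in the report. It relies on the fact that a `TaskCompletionSource` created with default options usually runs `await` continuations inline on the thread that calls `TrySetResult` when no custom `SynchronizationContext` or `TaskScheduler` is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class QueueEmptyAwaitersSketch
{
    // Hypothetical stand-in for the _queueEmptyTcs list named in the report.
    private static readonly List<TaskCompletionSource<int>> Awaiters =
        new List<TaskCompletionSource<int>>();

    // Mimics the caller: wait for "queue empty", then register a second waiter.
    private static async Task WaitTwice()
    {
        var first = new TaskCompletionSource<int>();
        Awaiters.Add(first);
        await first.Task.ConfigureAwait(false);
        // Assumption: this continuation runs inline inside TrySetResult,
        // i.e. while the notifier below is still enumerating Awaiters.
        Awaiters.Add(new TaskCompletionSource<int>());
    }

    // Returns true if notifying the awaiters throws InvalidOperationException.
    public static bool NotifyThrows(bool iterateOverSnapshot)
    {
        Awaiters.Clear();
        _ = WaitTwice(); // runs synchronously up to the first await

        try
        {
            // Either enumerate the live list (as in the report) or a copy of it.
            IEnumerable<TaskCompletionSource<int>> targets =
                iterateOverSnapshot ? Awaiters.ToArray() : Awaiters;

            foreach (var tcs in targets)
            {
                tcs.TrySetResult(0);
            }

            return false;
        }
        catch (InvalidOperationException)
        {
            // "Collection was modified; enumeration operation may not execute."
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(NotifyThrows(iterateOverSnapshot: false));
        Console.WriteLine(NotifyThrows(iterateOverSnapshot: true));
    }
}
```

In this sketch, enumerating the live list is expected to hit the exception, while enumerating a copy is not. A `lock` around the loop would not change the first case, because C# monitors are reentrant on the owning thread. Iterating over a snapshot, or creating the `TaskCompletionSource` with `TaskCreationOptions.RunContinuationsAsynchronously` together with proper locking, are possible mitigations; whether either matches the library's actual fix is not established in this thread.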

Expected behavior

The producer keeps sending data normally and stably.

Actual behavior

The producer becomes Faulted.

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response

blankensteiner commented 7 months ago

@kandersen82 could you have a look at this?

kandersen82 commented 7 months ago

Hi @libzlibz. Are you able to provide a valid C# example illustrating the problem?

entvex commented 6 months ago

@libzlibz Could you link to a repository showcasing the issue?

blankensteiner commented 6 months ago

@libzlibz We appreciate the report, but we need a working C# sample showing the bug. Can you provide us with that?

blankensteiner commented 6 months ago

Closed due to inactivity