Azure / azure-amqp

AMQP C# library

Would it be worthwhile to reduce the scope of SyncRoot in ReceivingAmqpLink? #179

Closed danielmarbach closed 3 years ago

danielmarbach commented 3 years ago

Would it be possible to inherit SizeBasedFlowQueue from ConcurrentQueue and then reduce the scope of SyncRoot to the waiter list and the SetLinkCreditUsingTotalCacheSize access only, to reduce lock contention when the queue has data?
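For illustration, a minimal sketch of the shape I'm thinking of, covering only the enqueue side (names are hypothetical; the real SizeBasedFlowQueue, waiter list, and SetLinkCreditUsingTotalCacheSize live inside ReceivingAmqpLink):

    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    // Hypothetical sketch, not the real ReceivingAmqpLink internals:
    // buffered messages live in a lock-free ConcurrentQueue, and the
    // SyncRoot-style lock is only taken around the waiter list (and, in
    // the real link, around the credit bookkeeping).
    class SketchFlowQueue<T> : ConcurrentQueue<T>
    {
        readonly object syncRoot = new object();
        readonly LinkedList<TaskCompletionSource<T>> waiters =
            new LinkedList<TaskCompletionSource<T>>();

        public void OnMessageReceived(T message)
        {
            TaskCompletionSource<T> waiter = null;
            lock (this.syncRoot)
            {
                // The lock only protects the waiter list.
                if (this.waiters.Count > 0)
                {
                    waiter = this.waiters.First.Value;
                    this.waiters.RemoveFirst();
                }
            }

            if (waiter != null)
            {
                waiter.TrySetResult(message);
            }
            else
            {
                // Buffering a message no longer needs the lock.
                this.Enqueue(message);
            }
        }
    }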

xinchen10 commented 3 years ago

By default the size-based link credit mode is off, and right now there is no known usage of it, so the queue behaves just like a normal Queue. The common usage patterns of a receiving link are 1) a message pump calling receive in a loop, and 2) a message listener via the callback. In both cases the concurrency of access to the queue is very low. Lock contention could be an issue if there are many concurrent receive calls. Do you see lock contention in some tests you did?
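For reference, pattern 1 has the same shape as the test code further down in this thread; pattern 2 looks roughly like this (a sketch assuming an already-open ReceivingAmqpLink and the RegisterMessageListener callback):

    using Microsoft.Azure.Amqp;

    class ListenerPattern
    {
        // Pattern 2 sketch: handle messages via the link's callback
        // instead of a receive loop; assumes the link is already open.
        public static void Start(ReceivingAmqpLink link)
        {
            link.RegisterMessageListener(message =>
            {
                // Process the message here, then settle it.
                link.AcceptMessage(message, true);
            });
        }
    }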

danielmarbach commented 3 years ago

I did a timeline analysis of a regular ServiceBus SDK receiver with some concurrency

[timeline profiling screenshots]

danielmarbach commented 3 years ago

// FYI @JoshLove-msft @jsquire

xinchen10 commented 3 years ago

I did CPU usage profiling and this is the data I got (the queue was preloaded with enough messages). The test is a net452 console app.

Test code

        protected async Task ReceiveOneAsync()
        {
            while (true)
            {
                // Pull the next message from the link (or wait until the timeout).
                AmqpMessage message = await this.link.ReceiveMessageAsync(this.link.DefaultOpenTimeout);
                if (message != null)
                {
                    // Settle the message and check whether the test should keep going.
                    this.link.AcceptMessage(message, true);
                    if (!this.Attempt())
                    {
                        break;
                    }
                }
            }
        }

I then started different numbers of tasks to control the concurrency. The CPU% numbers are inclusive, with 100% being the whole process.
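The driver for the concurrency levels below was roughly the following (a sketch around the ReceiveOneAsync method above, not the exact harness):

    // Start `concurrency` receive loops and wait for them all to finish.
    Task RunReceiversAsync(int concurrency)
    {
        var receivers = new Task[concurrency];
        for (int i = 0; i < concurrency; i++)
        {
            receivers[i] = Task.Run(() => this.ReceiveOneAsync());
        }

        return Task.WhenAll(receivers);
    }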

Concurrency | CPU% of BeginReceiveMessages | CPU% of lock(this.SyncRoot)
----------- | ---------------------------- | ---------------------------
1           | 8                            | 0.06
2           | 8.5                          | 1.33
3           | 9.89                         | 3.97
5           | 14                           | 7
10          | 15.29                        | 8.38

It is as expected: lock contention increases as the concurrency level (# of receive tasks) increases, but it is not as high as shown in your timeline profiling. Using a ConcurrentQueue may help with lock contention, but I don't have data for that yet.

danielmarbach commented 3 years ago

I used a concurrency of 100 in the example that produced the screenshots.

xinchen10 commented 3 years ago

What exactly is that 100? Concurrent tasks, or threads? What is the thread pool size if tasks are used?

I think the current issue is that in non-prefetch mode, credits are sent very frequently and that is done inside the lock (the IssueCredit calls). I moved it out, and with 10 tasks lock contention dropped to 3%. I am not sure using a ConcurrentQueue for the messages would help further here, because we still need a lock for accesses to the waiter list. Even if we could make the waiter list lock-free, we would still need something else to synchronize the message queue and the waiter list.
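Conceptually the change is to decide under the lock whether credit is needed, but send it after the lock is released, along these lines (a hypothetical sketch, not the actual change; messageQueue, creditBatchSize, and NeedMoreCredit are stand-in names for the real fields and bookkeeping):

    uint creditToIssue = 0;

    lock (this.SyncRoot)
    {
        this.messageQueue.Enqueue(message);

        // NeedMoreCredit is a hypothetical helper standing in for the
        // bookkeeping that decides whether a flow frame is due.
        if (this.NeedMoreCredit())
        {
            creditToIssue = this.creditBatchSize;
        }
    }

    if (creditToIssue > 0)
    {
        // The flow frame (IssueCredit) goes out without holding SyncRoot,
        // so concurrent receivers are not blocked behind network I/O.
        this.IssueCredit(creditToIssue, false, AmqpConstants.NullBinary);
    }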

danielmarbach commented 3 years ago

Yeah, good call. I did some further spiking last night and saw IssueCredit being the culprit once I swapped in a concurrent queue, since that alone did not help with the lock contention.

I used the Azure Service Bus SDK Processor type. It has a setting called MaxConcurrentCalls which I set to 100. Example: https://github.com/danielmarbach/AzureServiceBus.DeepDive/blob/master/Receive/Program.cs#L39

I'm not sure if I have set the prefetch settings there since I used a modified version of the above code that I haven't pushed yet. I can double-check later.
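For context, the processor setup is roughly the following (connection string and queue name are placeholders; prefetch is left at its default of 0 in this sketch):

    using System.Threading.Tasks;
    using Azure.Messaging.ServiceBus;

    class ProcessorSetup
    {
        public static async Task<ServiceBusProcessor> StartAsync(string connectionString, string queueName)
        {
            var client = new ServiceBusClient(connectionString);

            // MaxConcurrentCalls = 100 matches the concurrency behind the
            // timeline screenshots; PrefetchCount keeps its default of 0.
            var processor = client.CreateProcessor(queueName, new ServiceBusProcessorOptions
            {
                MaxConcurrentCalls = 100
            });

            processor.ProcessMessageAsync += args => args.CompleteMessageAsync(args.Message);
            processor.ProcessErrorAsync += args => Task.CompletedTask;

            await processor.StartProcessingAsync();
            return processor;
        }
    }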

jsquire commented 3 years ago

Assuming a scenario where prefetch is almost always used, such as with the Event Hubs SDK, would a ConcurrentDictionary be helpful?

danielmarbach commented 3 years ago

Juggling a good key for that dictionary might be nontrivial. I need to take a deeper look at the code at some point; maybe it will trigger some good ideas.

danielmarbach commented 3 years ago

@xinchen10 I just saw https://github.com/Azure/azure-amqp/commit/31dc27ca8eda003d2a5bcdda77d51f43e67c934b. This is awesome!