Azure / azure-uamqp-c

AMQP library for C
Other
58 stars 62 forks source link

Receiver example stops receiving after 10000 events from EventHub #461

Closed jo-me closed 2 months ago

jo-me commented 5 months ago

I'm having trouble getting the message_receiver_sample to work in combination with an Azure EventHub.

It establishes a connection and starts to receive messages but then it just stops receiving after exactly 10000 messages. It does not quit or crash or show any log. It just does not seem to retrieve any data anymore in the connection_dowork function.

Is this a bug or is just the sample code outdated?

This is reproducible with the lastest master and also LTS_07_2022_Ref02.

I've used this lib previously and back then we used the branch/tag 2020-12-09, which continues to work to this day - so its not related to changes to the EventHubs. However, that branch is totally outdated and can't be built anymore with recent GCC versions due to https://github.com/Azure/azure-c-shared-utility/issues/535.

The only combination that worked both from compilation and runtime behavior perspective is LTS072020_Ref02 of this uamqp lib and LTS072022_Ref02 of azure-c-shared-utility

My test environment is a standard VSCode devcontainer based on mcr.microsoft.com/devcontainers/cpp:1-ubuntu-22.04 using CMake 3.22.2.

Jochen Mehlhorn jochen.mehlhorn@mercedes-benz.com, Mercedes-Benz Tech Innovation GmbH

Provider Information

jo-me commented 5 months ago

@ewertons this seems to be since #383 Is there maybe something missing in the receiver sample like ACKing received messages?

jo-me commented 5 months ago

when changing the line link_set_rcv_settle_mode(link, receiver_settle_mode_first); to link_set_rcv_settle_mode(link, receiver_settle_mode_second); it will continue to get messages..

According to https://github.com/Azure/azure-uamqp-python/blob/2d79b9c3544080cc524c7eec1b672a926d90bb53/uamqp/constants.py#L122 "first" is RECEIVEANDDELETE "second" is PEEKLOCK.

And according to https://learn.microsoft.com/en-us/azure/service-bus-messaging/message-transfers-locks-settlement#settling-receive-operations
Peek-Lock means that the receiver "wants to settle messages explicitly" while Receive-and-delete considers messages as "consumed as soon as the broker puts it onto the wire".

jo-me commented 5 months ago

As far as I can see the link credit does not seem to be handled correctly.

So it is sending "10000" to the Eventhub initially and not follow up afterwards when messages are getting processed.

The only place where it would reset it is here: https://github.com/Azure/azure-uamqp-c/blob/9c8c1805fac797e2d14a32ff6a7d9756a5ee82e1/src/link.c#L426C1-L427C1

But this function is not called when link credit is 0. If i change the if condition from 0 to 10 or so it will reset the link credit at the EventHub in time and therefore continue to receive events.

So I guess there is some issue with flow control somewhere..

jo-me commented 5 months ago

Seems to be related to https://github.com/Azure/azure-uamqp-c/pull/363

ewertons commented 2 months ago

Thanks for all the details, @jo-me

Here is the full root cause:

This is a problematic code, since if the current credit is already zero, the sender will not have sent another transfer.

else if (is_transfer_type_by_descriptor(descriptor))
{
    if (link_instance->on_transfer_received != NULL)
    {
        TRANSFER_HANDLE transfer_handle;
        if (amqpvalue_get_transfer(performative, &transfer_handle) != 0)
        {
            LogError("Cannot get transfer performative");
        }
        else
        {
            AMQP_VALUE delivery_state;
            bool more;
            bool is_error;

            if (link_instance->current_link_credit == 0)

That can be verified in the AMQP 1.0 spec in "2.6.7 Flow Control": "If the link-credit is less than or equal to zero, i.e. the delivery-count is the same as or greater than the delivery-limit, it is illegal to send more messages."

Here the code is already automatically replenishing the link credit if it gets "too low". In such case, it should check if the link credit gets down to 1 or less (for the last message that the sender can legally send) and then replenish, also sending a new flow to the sender.

Btw, this is a new bug caused by this change: https://github.com/Azure/azure-uamqp-c/pull/383/files

ewertons commented 2 months ago

@jo-me , a fix has been merged. Let us know if you need to follow up on this issue. Thanks. Azure IoT SDK Team.