Azure / azure-sdk-for-js

This repository is for active development of the Azure SDK for JavaScript (NodeJS & Browser). For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/javascript/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-js.

Service Bus Sessions will automatically disconnect without warning #30148

Open shuaipeng123 opened 1 week ago

shuaipeng123 commented 1 week ago

Describe the bug

Service Bus Sessions will automatically disconnect without warning.

To Reproduce

Steps to reproduce the behavior:

  1. Set up multiple topic subscriptions with Azure Service Bus and Sessions enabled.
  2. Use the ServiceBusSessionReceiver from the '@azure/service-bus' SDK to read messages.
  3. Allow the services to run for an extended period (typically 12 hours or more).
  4. Observe the message count in the queue and the logs.

Expected behavior

The ServiceBusSessionReceiver should maintain a consistent connection to the Service Bus, ensuring that messages are read and processed continuously without requiring a restart of the service.


Additional context

Tech Stack

  - Azure Service Bus with Sessions enabled
  - NodeJS
  - ServiceBusSessionReceiver from '@azure/service-bus' SDK

Problem Description

The issue was first noticed when multiple topic subscriptions were not processing messages. The active message count in the queue remained static for days.

Restarting the services with built-up messages would consume the messages, as evidenced by the logs. This indicated that the receiver had been disconnected. We added a log to notify us when the receiver is disconnected, using the isClosed property on the ServiceBusSessionReceiver object:

if (this.serviceBusReceiver.isClosed) {
    this.logger.error("Service Bus Receiver connection is closed");
}

This check runs every 5 minutes on a setInterval timer (NodeJS.Timeout). Within 12 hours, the log was triggered in one of the services that had stopped reading messages because of the disconnection. The others did not log this error. Restarting the service stopped the error log.
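A minimal sketch of the periodic check described above. The `ClosableReceiver` and `Logger` shapes and the function names `checkReceiver` and `startClosedWatch` are hypothetical and used only for illustration; they are not part of '@azure/service-bus'.

```typescript
// Hypothetical minimal shapes: only the members this sketch needs.
interface ClosableReceiver {
    readonly isClosed: boolean;
}

interface Logger {
    error(message: string): void;
}

// Logs and returns true if the receiver reports itself as closed.
function checkReceiver(receiver: ClosableReceiver, logger: Logger): boolean {
    if (receiver.isClosed) {
        logger.error("Service Bus Receiver connection is closed");
        return true;
    }
    return false;
}

// Runs the check on a fixed interval (5 minutes by default) and returns
// the timer so the caller can stop it with clearInterval.
function startClosedWatch(
    receiver: ClosableReceiver,
    logger: Logger,
    intervalMs: number = 5 * 60 * 1000,
): ReturnType<typeof setInterval> {
    return setInterval(() => {
        checkReceiver(receiver, logger);
    }, intervalMs);
}
```

Returning the timer handle lets the service clear the interval on shutdown, so the check does not keep the process alive.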

What We've Looked Into

Unhandled error in message handler causes disconnection.

After adding the error logger, we observed no unhandled errors before disconnection. Restarting the application reconnected the service bus receiver and cleared the logs.

Our message handler:

async receiveMessages(): Promise<void> {
    const myMessageHandler = async (messageReceived: ServiceBusReceivedMessage): Promise<void> => {
        // logic inserts it into storage
    };

    const myErrorHandler = async (errorArgs: ProcessErrorArgs): Promise<void> => {
        this.logger.error(
            errorArgs.error,
            `Service Bus receiver failed - topic name: ${TOPIC_NAME} - error arguments: ${JSON.stringify(errorArgs)}`,
        );
    };

    this.serviceBusReceiver.subscribe({
        processMessage: myMessageHandler,
        processError: myErrorHandler,
    });
}
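One detail that may be worth ruling out in the error handler: `JSON.stringify` only serializes enumerable properties, and on a plain `Error` the `message` and `stack` are not enumerable, so the stringified arguments can lose the error text. The error itself is passed to the logger separately above, so this only affects the serialized arguments. The sketch below formats the fields explicitly; `ErrorArgsLike` and `describeErrorArgs` are hypothetical names, and the interface is an assumed subset of what `ProcessErrorArgs` carries.

```typescript
// Hypothetical subset of the fields carried by ProcessErrorArgs.
interface ErrorArgsLike {
    error: Error;
    errorSource: string;
    entityPath: string;
    fullyQualifiedNamespace: string;
}

// Builds a single log line from explicitly named fields, so the error
// name and message survive even though they are not enumerable.
function describeErrorArgs(args: ErrorArgsLike): string {
    const { error, errorSource, entityPath, fullyQualifiedNamespace } = args;
    return [
        `source=${errorSource}`,
        `entity=${entityPath}`,
        `namespace=${fullyQualifiedNamespace}`,
        `name=${error.name}`,
        `message=${error.message}`,
    ].join(" ");
}
```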

Conditions that flip ._isClosed property to true

We checked the source code to see how the isClosed property of the ServiceBusSessionReceiver is computed. Below is the isClosed getter from the @azure/service-bus source code:

public get isClosed(): boolean {
    return (
        this._isClosed ||
        !this._context.messageSessions[this._messageSession.name] ||
        !this._messageSession.isOpen()
    );
}

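The only method that directly sets this._isClosed is close(), which seems to be called explicitly by the user only. The getter also returns true when the message session is no longer registered in the context or when the message session is not open, so it can report closed without close() ever being called.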
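To make the three conditions in the getter easier to tell apart, here is a small model of its logic. It is only an illustration: the fields the getter reads are private to the SDK, so `ClosedState` and `closedReason` are hypothetical names operating on a plain snapshot object, not on a real receiver.

```typescript
// Hypothetical snapshot of the three inputs the getter reads; not an SDK type.
interface ClosedState {
    isClosedFlag: boolean;       // models this._isClosed
    sessionRegistered: boolean;  // models this._context.messageSessions[name] being present
    linkOpen: boolean;           // models this._messageSession.isOpen()
}

type ClosedReason =
    | "closed-flag-set"
    | "session-not-registered"
    | "link-not-open"
    | "open";

// Mirrors the order of the || chain in the getter and names the first
// condition that would make it return true.
function closedReason(state: ClosedState): ClosedReason {
    if (state.isClosedFlag) return "closed-flag-set";
    if (!state.sessionRegistered) return "session-not-registered";
    if (!state.linkOpen) return "link-not-open";
    return "open";
}
```

Logging which branch applies, if the SDK exposed enough state to do so, would separate an explicit close() from a dropped session or link.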

ServiceBusSessionReceiver expires

We could not find any renewal logic for the Receiver's connections. We use default settings for the Service Bus, Topics, Subscriptions, and the ServiceBusSessionReceiver. Our receiver is instantiated as follows:

this.serviceBusReceiver = await this.serviceBusClient.acceptSession(TOPIC_NAME, SUBSCRIPTION_NAME, SESSION_ID);

Only one of the subscriptions has logged the error since we added the error log for isClosed === true. All subscriptions receive the same messages and use the SDK identically.

Reproducing

We cannot directly reproduce the error. We have a log that notifies us when a receiver is closed. So far, there is no discernible pattern.

github-actions[bot] commented 1 week ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @EldertGrootenboer.

jeremymeng commented 1 week ago

Thanks for the report @shuaipeng123! May I ask whether you first noticed the issue after upgrading to 7.9.5, or whether it was already happening in previous versions? We made some fixes around sessions in PR #29954. I just want to see whether they are related.

github-actions[bot] commented 6 days ago

Hi @shuaipeng123. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.
