Azure / azure-sdk-for-go

This repository is for active development of the Azure SDK for Go. For consumers of the SDK we recommend visiting our public developer docs at:
https://docs.microsoft.com/azure/developer/go/
MIT License
1.58k stars 816 forks source link

ServiceBus Admin Client - receives 401 responses after migration from standard to premium tier #21172

Open dldl-cmd opened 1 year ago

dldl-cmd commented 1 year ago

Bug Report

When using the ServiceBus Admin client it does not automatically connect to the Premium namespace when a Standard Tier namespace is migrated to premium as documented here: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-migrate-standard-premium#what-happens-when-the-migration-is-committed Instead it always runs into an unauthenticated error. Restarting the application does fix the issue, but according to the documentation it should not be necessary.

GET https://XXX.servicebus.windows.net/TOPIC/Subscriptions/SUBSCRIPTION-------------------------------------------------------------------------------- RESPONSE 401: 401 Unauthorized ERROR CODE: 401 -------------------------------------------------------------------------------- Error Code 401 Detail Connection rejected after GeoDR failover. TrackingId:e6768e23-213f-476c-a654-f7bab349c5af_G4, SystemTracker:XXX.servicebus.windows.net:TOPIC/Subscriptions/SUBSCRIPTION, Timestamp:2023-05-16T13:24:18 --------------------------------------------------------------------------------

The problem was noticed with KEDA: kedacore/keda#4678.

The problem can be easily reproduced with running the following program:

package main

import (
    "context"
    "fmt"
    "time"

    "github.com/Azure/azure-sdk-for-go/sdk/messaging/azservicebus/admin"
)

func GetAdminClient() *admin.Client {
    connectionString := "<connection string>"
    client, err := admin.NewClientFromConnectionString(connectionString, nil)
    if err != nil {
        panic(err)
    }

    return client

}

func PrintMessageCountOrError(ctx context.Context, client *admin.Client) {
    properties, err := client.GetSubscriptionRuntimeProperties(ctx, "mytopic", "mysubscription", nil)
    if err != nil {
        fmt.Println(err)
    } else {
        fmt.Println(properties.TotalMessageCount)
    }
}

func main() {
    ctx := context.TODO()
    client := GetAdminClient()

    for {
        time.Sleep(1)
        PrintMessageCountOrError(ctx, client)
    }
}

Furthermore the Standard Tier ServiceBus namespace needs to have a topic: mytopic and a subscription: mysubscription. Then during the runtime of the program the namespace can be migrated to a Premium namespace through the Azure Portal. After the migration is completed (a few seconds/minutes later) the application will start to log, that it is not able to connect anymore.

As far as I was able to debug the problem already, it occurs because the connection is after the migration still to the standard namespace. Therefore restarting the application causes it to connect to the new premium namespace. Also doing a long pause in the debugger, so that the connection gets closed, causes it to work again.

richardpark-msft commented 1 year ago

Sorry about taking so long on this, a lot going on!

I'm talking with the service team. I think the ideal behavior is for the service to cut off old HTTP connections as well. I don't think we want to do string matching on 401s.

I'll ping back when I have something to report.

github-actions[bot] commented 8 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @EldertGrootenboer @jfggdl.

richardpark-msft commented 8 months ago

@EldertGrootenboer , @jfggdl - I brought this up previously in the Service Bus office hours, so there should be some previous record of that discussion.

The fix here would be that when the service is rotated out that the old HTTP endpoint should probably close any active connections. Due to connection pooling, etc... it could take awhile for any connected clients to realize that the service they have a persistent connection to longer valid.

EldertGrootenboer commented 8 months ago

Thank you, we will assign an engineer to investigate this, and will report back in this thread once we have more information to share.

EldertGrootenboer commented 4 months ago

We have brought this item in our current planning. We don't have a specific date when development will start for this, once we have more information around this, we will update this thread.