Azure / azure-sdk-for-go

This repository is for active development of the Azure SDK for Go. For consumers of the SDK we recommend visiting our public developer docs at:
https://docs.microsoft.com/azure/developer/go/
MIT License
1.64k stars 840 forks source link

Data race in azblob client #23677

Open muroj opened 4 days ago

muroj commented 4 days ago

Bug Report

Environment Info

My program is detecting a data race while enumerating containers and blobs in parallel. Here is a minimal snippet demonstrating my approach.

client, err := azblob.NewClient(url, credential, nil)
ctrPager := client.NewListContainersPager(nil)

for ctrPager.More() {
  resp, err := ctrPager.NextPage(context.TODO())

  for _, c := range resp.ContainerItems {

    // limit blob enumeration, some containers have millions of blobs
    ctx, cancel := context.WithTimeout(context.Background(), time.Second*60)
    defer cancel()

    go func(c *service.ContainerItem) {
      blobPager := client.NewListBlobsFlatPager(*c.Name, nil)
      for blobPgr.More() {

      var blobs container.ListBlobsFlatResponse
        result := make(chan struct{})

        go func() {
          blobs, err = blobPgr.NextPage(ctx)
          handleError(err)
          result <- struct{}{}
        }()

        select {
        case <-ctx.Done():
          fmt.Printf("context cancelled while processing blobs for container: %s, %v\n", *c.Name, ctx.Err())
        case <-result:
          doStuff()
        }
      }
    }(c)
  }
}

Running this program results in the following data race, which occurs in the Blob Pager.

WARNING: DATA RACE
Write at 0x00c0001f4358 by goroutine 106:
  github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/container.(*Client).NewListBlobsFlatPager.func2()
      /Users/jmuro/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/azblob@v1.4.1/container/client.go:283 +0x8c
  github.com/Azure/azure-sdk-for-go/sdk/azcore/runtime.(*Pager[go.shape.struct { github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/generated.ListBlobsFlatSegmentResponse; ClientRequestID *string; ContentType *string; Date *time.Time; RequestID *string; Version *string }]).NextPage()
      /Users/jmuro/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/azcore@v1.14.0/runtime/pager.go:79 +0x5b4
  main.getContainerStats.func1()
      /Users/jmuro/acme/ops/util/azure/storage/utils/azure_storage_stats.go:168 +0x84

Previous write at 0x00c0001f4358 by goroutine 104:
  github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/container.(*Client).NewListBlobsFlatPager.func2()
      /Users/jmuro/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/storage/azblob@v1.4.1/container/client.go:283 +0x8c
  github.com/Azure/azure-sdk-for-go/sdk/azcore/runtime.(*Pager[go.shape.struct { github.com/Azure/azure-sdk-for-go/sdk/storage/azblob/internal/generated.ListBlobsFlatSegmentResponse; ClientRequestID *string; ContentType *string; Date *time.Time; RequestID *string; Version *string }]).NextPage()
      /Users/jmuro/go/pkg/mod/github.com/!azure/azure-sdk-for-go/sdk/azcore@v1.14.0/runtime/pager.go:79 +0x5b4
  main.getContainerStats.func1()
      /Users/jmuro/acme/ops/util/azure/storage/utils/azure_storage_stats.go:168 +0x84

I am new to go routines, contexts, and channels, so I wouldn't be surprised if my code is buggy. Regardless, I'd like confirmation on whether the azure go SDK is thread safe. In particular, is it safe to share the clients between go routines? This go sdk design doc implies that clients should be threadsafe, but is that the case? The .NET SDK doc explicitly mentions thread safety, but the go SDK doc does not.

github-actions[bot] commented 4 days ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.

jhendrixMSFT commented 4 days ago

Hello!

You are correct that SDK clients are goroutine safe. However, Pager[T] and Poller[T] types returned from SDK methods are not. In your example, the following is not goroutine safe.

        go func() {
          blobs, err = blobPgr.NextPage(ctx)
          handleError(err)
          result <- struct{}{}
        }()

Here, it's possible for multiple goroutines to call NextPage() on the same blobPgr instance which is not goroutine safe.

Please ping back if you have further questions.

github-actions[bot] commented 4 days ago

Hi @muroj. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text "/unresolve" to remove the "issue-addressed" label and continue the conversation.

jhendrixMSFT commented 4 days ago

It appears this wasn't doc'ed. I've added doc comments about this in https://github.com/Azure/azure-sdk-for-go/pull/23679