Closed blawrence-datadx closed 9 months ago
Thanks for the feedback, we’ll investigate asap.
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.
Author: | blawrence-datadx |
---|---|
Assignees: | tasherif-msft |
Labels: | `bug`, `Storage`, `Service Attention`, `Client`, `customer-reported`, `needs-team-attention` |
Milestone: | - |
Hi @tasherif-msft, have you been able to look into this issue? It has started happening more and more frequently. I can give you example function invocation IDs if that would be helpful
Hi @blawrence-datadx, thanks for opening the issue. This is a known issue that we are investigating, and we are discussing the best solution across the language SDK teams. I will update you once I have more information!
Hi @tasherif-msft, this issue has started happening more and more frequently on a daily basis and not just from start_copy_from_url(). This is making the use of Azure Functions unsustainable for us and we're having to rely on retries which is not a clean solution. Please let us know if any progress has been made
Hi @blawrence-datadx, Connection resets come from the Storage service. The service team recommends retries as the solution for these types of intermittent failures so that is the best that can be done from the client side. We are aware that some of these errors are not automatically retried from the Storage SDK itself, and we are working to address that.
If you already have retry logic in place and are still experiencing resets or are concerned about the number of resets, I recommend pushing on the Support ticket you have opened or opening another ticket. The service team is really the one that should be looking into the high number of resets.
Thank you @jalauzon-msft. I'm afraid we cancelled our support service because after 3 months the service team had made no progress. Previously, the connection reset errors were only happening on `start_copy_from_url()` calls, but now they are happening at random places throughout the code. So is the best solution really to wrap every single blob storage call with retry logic?
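For readers weighing that question, here is a minimal sketch of what a reusable client-side retry wrapper could look like, so that retry logic lives in one decorator instead of around every call. The decorator name, its parameters, and the backoff values are all hypothetical and not part of the Azure SDK. It catches the built-in `ConnectionResetError` by default; depending on the transport, the SDK may surface the reset wrapped in an `azure.core.exceptions` type such as `ServiceRequestError` or `ServiceResponseError`, in which case those would need to be passed in `exceptions`.

```python
import functools
import random
import time


def retry_on_connection_reset(max_attempts=4, base_delay=1.0,
                              exceptions=(ConnectionResetError,),
                              sleep=time.sleep):
    """Retry the wrapped call when it raises one of `exceptions`.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the original error
                    # Exponential backoff (base_delay, 2x, 4x, ...) plus jitter
                    delay = base_delay * 2 ** (attempt - 1)
                    sleep(delay + random.uniform(0, delay / 2))
        return wrapper
    return decorator


# Usage (archive_blob is a hypothetical function that calls the SDK):
#
# @retry_on_connection_reset(max_attempts=5)
# def archive_blob(dest_blob_client, source_url):
#     return dest_blob_client.start_copy_from_url(source_url)
```

Only idempotent operations are safe to wrap this way, since a reset can arrive after the service has already processed the request.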
@blawrence-datadx can you share your storage account name and the last time period when this happened?
Hi @amishra-dev, the storage account name is datadxdatalake and the last time period it happened was Monday 2/14 at 8:41pm PST
Hi @blawrence-datadx, the SDK does have built-in retry logic that should be automatically retrying connection reset errors (with the exception of a potential known issue with `download_blob()`). We apply the following policy to all clients by default:
https://github.com/Azure/azure-sdk-for-python/blob/835adb397badbf4ac08c3bb8fffcb24abb075ded/sdk/storage/azure-storage-blob/azure/storage/blob/_shared/policies.py#L536
Ideally these retries should be enough and you should not have to add your own retry logic for every operation. That being said, we have received a number of reports of Connection Reset errors despite this retry logic. We are currently investigating why these retries do not seem to be sufficient and are considering changing the default values.
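Until the defaults change, the retry counts can be adjusted when the client is created. The sketch below passes the retry keyword arguments documented for `azure-storage-blob` clients (`retry_total`, `retry_connect`, `retry_read`, `retry_status`); the specific values and the `make_blob_service_client` helper are assumptions for illustration, not recommendations from the SDK team.

```python
# Assumed values for illustration; tune them for your workload.
RETRY_KWARGS = {
    "retry_total": 6,    # total retries allowed; takes precedence over the others
    "retry_connect": 4,  # retries for connection-related errors
    "retry_read": 4,     # retries for read errors
    "retry_status": 4,   # retries for bad status codes
}


def make_blob_service_client(account_url, credential):
    """Build a BlobServiceClient that uses the retry settings above.

    Hypothetical helper; account_url and credential come from the caller.
    """
    # Imported lazily so the settings can be inspected without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    return BlobServiceClient(account_url, credential=credential, **RETRY_KWARGS)
```

Clients obtained from this service client (for example through `get_blob_client()`) should share its pipeline, so the settings would not need to be repeated per blob.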
Hello, I have a similar issue, it doesn't show any errors, but the copy doesn't occur. Can you confirm if this issue is resolved?
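One possible explanation for a copy that "doesn't occur" without an error, not confirmed anywhere in this thread, is that `start_copy_from_url()` only starts the copy and the service may finish it asynchronously. A minimal sketch of polling the destination blob's copy status follows; `wait_for_copy` and its parameters are hypothetical, and it assumes `blob_client` is the destination `BlobClient` on which the copy was started.

```python
import time


def wait_for_copy(blob_client, timeout=300.0, poll_interval=2.0, sleep=time.sleep):
    """Poll the destination blob until its copy leaves the 'pending' state.

    Returns the final copy status as reported by the service
    ('success', 'aborted' or 'failed'), or None if the blob
    carries no copy information at all.
    """
    waited = 0.0
    while True:
        status = blob_client.get_blob_properties().copy.status
        if status != "pending":
            return status
        if waited >= timeout:
            raise TimeoutError(f"copy still pending after {waited:.0f}s")
        sleep(poll_interval)
        waited += poll_interval


# Usage (dest_blob and source_url are placeholders):
#
# dest_blob.start_copy_from_url(source_url)
# if wait_for_copy(dest_blob) != "success":
#     raise RuntimeError("copy did not complete")
```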
Closing out this old issue.
We have made a couple of improvements in this area over the past couple of releases to ensure we automatically retry connection reset errors, and the service team has made some improvements on their end as well. Ultimately, connection reset errors are going to occur from time to time when working with a service as large as Azure Storage. The automatic retries built into the SDK should mitigate most of the issues this could cause, but if you see consistent connection reset errors for Storage calls you may need to look for other causes, such as exhausted client networking resources, client load, etc. If everything still looks good on the client side, then the best thing to do is to open a Support ticket to have the service team take a look.
Describe the bug
I have a Python Azure Function that imports `BlobServiceClient` from `azure.storage.blob` and uses `start_copy_from_url()` to move blob files from a Standard/Hot StorageV2 container to a Standard/Cool StorageV2 container multiple times a day. Most of the time this works fine, but we have been getting more and more `ConnectionResetError`s every week. This issue used to happen once a month, but is now happening 3-5 times a week. We created a support ticket with Azure support and have made no progress for over two months. We've changed things like making sure our TLS settings match between our storage account and Azure Function and have updated our extension bundles, but nothing has changed the behavior. We're not sure how it's possible to get a `ConnectionResetError` when the function app is hosted on Azure servers.
when the function app is hosted on Azure servers. To Reproduce Steps to reproduce the behavior:Expected behavior Files are archived from one storage account to the other with no errors.
Additional context Example stack trace of the exception: