Azure / azure-cli

Azure Command-Line Interface
MIT License

"az mysql server-logs download" fails with "[Errno 104] Connection reset by peer" on large (>32MB) current-day logs #18512

Open sbonds opened 3 years ago

sbonds commented 3 years ago

Describe the bug

Command Name

az mysql server-logs download --name mysql-slow-$PAAS_HOST-2021061605.log --resource-group $PAAS_HOST_RG --server $PAAS_HOST

Errors:

The command failed with an unexpected error. Here is the traceback:
[Errno 104] Connection reset by peer
Traceback (most recent call last):
  File "/usr/lib64/az/lib/python3.6/site-packages/knack/cli.py", line 231, in invoke
    cmd_result = self.invocation.execute(args)
  File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 657, in execute
    raise ex
  File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 720, in _run_jobs_serially
    results.append(self._run_job(expanded_arg, cmd_copy))
  File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 691, in _run_job
    result = cmd_copy(params)
  File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 328, in __call__
    return self.handler(*args, **kwargs)
  File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/core/commands/command_operation.py", line 121, in handler
    return op(**command_args)
  File "/usr/lib64/az/lib/python3.6/site-packages/azure/cli/command_modules/rdbms/custom.py", line 574, in _download_log_files
    urlretrieve(f.url, f.name)
  File "/usr/lib64/python3.6/urllib/request.py", line 277, in urlretrieve
    block = fp.read(bs)
  File "/usr/lib64/python3.6/http/client.py", line 459, in read
    n = self.readinto(b)
  File "/usr/lib64/python3.6/http/client.py", line 503, in readinto
    n = self.fp.readinto(b)
  File "/usr/lib64/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib64/python3.6/ssl.py", line 968, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib64/python3.6/ssl.py", line 830, in read
    return self._sslobj.read(len, buffer)
  File "/usr/lib64/python3.6/ssl.py", line 587, in read
    v = self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer

To Reproduce

Find a MySQL PaaS host which generates a large "slow queries" log. While that log is still being written, download it with the command above, with the $PAAS_HOST variable set to the PaaS host name in Azure.

Expected behavior

The requested log file will (eventually) be downloaded.

Environment summary

Linux-3.10.0-957.21.3.el7.x86_64-x86_64-with-centos-7.6.1810-Core
Python 3.6.8
Installer: RPM

azure-cli 2.25.0

Extensions:
image-copy-extension 0.2.8

Additional context

Microsoft case 2106020010003350 was opened for this issue. In the process of investigating that issue, we found that the connection reset behavior from the Azure side is intentional and needs to be handled by the Az CLI.

In particular, the blob will return a connection reset when the file is written to while it is being downloaded. We've noticed that this seems to happen in 32MB chunks, potentially due to upstream write buffering, hence the "large" requirement for replicating the issue.

Suggested fix

Microsoft proposed using either a lease, which is an exclusive write lock on the file, or a snapshot. Write-locking a file used by the PaaS service seems like it would lead to unintended consequences on the back end, so I don't think this is a good plan.

Creating a snapshot, downloading the snapshot, then removing the snapshot seems like a very good plan for ensuring write consistency. This would be my suggested way of fixing/working around this bug.
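
The snapshot sequence described above could look roughly like the sketch below. It uses only the Python standard library and the Blob service REST operations Snapshot Blob (PUT with comp=snapshot), Get Blob and Delete Blob (both with a snapshot query parameter). The names with_query, download_via_snapshot and opener are hypothetical and not part of azure-cli. It also assumes the SAS URL grants permission to create and delete snapshots, which the read-oriented URLs returned for server logs may well not do, and it has not been run against the live service.

```python
import shutil
import urllib.parse
import urllib.request


def with_query(url, **params):
    """Return url with extra query parameters, keeping the existing SAS ones."""
    parts = urllib.parse.urlsplit(url)
    query = urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
    query += list(params.items())
    return urllib.parse.urlunsplit(
        parts._replace(query=urllib.parse.urlencode(query)))


def download_via_snapshot(blob_sas_url, dest, opener=urllib.request.urlopen):
    """Snapshot the blob, download the snapshot, then delete the snapshot.

    Hypothetical helper; assumes the SAS allows snapshot creation/deletion.
    """
    # 1. Snapshot Blob: the snapshot id is returned in the x-ms-snapshot header.
    create = urllib.request.Request(
        with_query(blob_sas_url, comp="snapshot"), data=b"", method="PUT")
    with opener(create) as response:
        snapshot_id = response.headers["x-ms-snapshot"]

    snapshot_url = with_query(blob_sas_url, snapshot=snapshot_id)
    try:
        # 2. The snapshot is read-only, so it cannot change during the read.
        with opener(urllib.request.Request(snapshot_url)) as response, \
                open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    finally:
        # 3. Remove the snapshot even if the download failed.
        with opener(urllib.request.Request(snapshot_url, method="DELETE")):
            pass
    return snapshot_id
```

The opener parameter exists only so the HTTP calls can be replaced in a test; in normal use the default urllib.request.urlopen applies.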

Workaround with curl

A workaround is to use az mysql server-logs list --resource-group $PAAS_HOST_RG --server $PAAS_HOST to get a list of URLs with SAS for the logs. The appropriate SAS URL for the log in question can then be fed to a recent version of "curl" (7.52.0 or newer) that supports --retry-connrefused, which allows the download to complete. For details about curl retry, see https://stackoverflow.com/questions/42873285/curl-retry-mechanism
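
For comparison, here is a minimal Python sketch of the same retry idea, as a possible stand-in for the bare urlretrieve call in the traceback. It catches ConnectionResetError and resumes with an HTTP Range request from the last byte written. The name download_with_resume and its parameters are hypothetical, not azure-cli code, and the sketch assumes the blob endpoint honors Range requests on the SAS URL. Unlike a snapshot, it does not guarantee a consistent copy if bytes already downloaded are later rewritten.

```python
import time
import urllib.error
import urllib.request


def download_with_resume(url, dest, max_retries=5, delay=1.0,
                         chunk_size=1024 * 1024,
                         opener=urllib.request.urlopen):
    """Download url to dest, resuming after 'Connection reset by peer'.

    Hypothetical helper, not part of azure-cli. Returns the bytes written.
    """
    offset = 0      # bytes safely written so far
    failures = 0    # consecutive resets without any progress
    with open(dest, "wb") as out:
        while True:
            request = urllib.request.Request(url)
            if offset:
                # Ask only for the part we do not have yet.
                request.add_header("Range", "bytes={}-".format(offset))
            try:
                with opener(request) as response:
                    if offset and response.status != 206:
                        # Range was ignored: the body restarts at byte 0.
                        out.seek(0)
                        out.truncate()
                        offset = 0
                    while True:
                        block = response.read(chunk_size)
                        if not block:
                            return offset
                        out.write(block)
                        offset += len(block)
                        failures = 0
            except urllib.error.HTTPError as err:
                # 416: nothing exists past our offset, so we already have it all.
                if err.code == 416 and offset:
                    return offset
                raise
            except ConnectionResetError:
                failures += 1
                if failures > max_retries:
                    raise
                time.sleep(delay)
```

The failure counter is reset whenever a block arrives, so a large log that is reset once per 32MB chunk can still complete as long as each attempt makes progress.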

ghost commented 3 years ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @ambhatna, @savjani.

Issue Details
Author: sbonds
Assignees: -
Labels: `MySQL`, `Service Attention`
Milestone: -

yonzhan commented 3 years ago

route to service team

savjani commented 3 years ago

Adding @Bashar-MSFT to help