Azure / azure-cli

Azure Command-Line Interface
MIT License
4.03k stars 3k forks source link

az sql db copy - performance characteristics: what is the impact on source db? #14864

Open t-l-k opened 4 years ago

t-l-k commented 4 years ago

I just wanted to enquire, because there is nothing in the documentation, but what are the performance or uptime considerations when executing az sql db copy ?

Like, what is it actually doing behind the scenes?

I ask because for some reason, when managing via az sql db export and az sql db import of bacpacs via storage, it seems to take a very long time, say about 1 to 1.5hrs, even if all in same region, for a large (say 60gb) database. In contrast, az sql db copy seems to do the same without even breaking a sweat, sometimes completing in about 2 to 3 minutes, even between servers?

Why? How does it achieve this speed? And what happens to the source DB during the copy? Is it still available? Is there some kind of cost/pricing dimension involved?

Can bacpacs be processed with similar efficiency?

TL;DR: I want to know if az sql db copy is safe to use with production databases as the source!

Thanks!

ghost commented 4 years ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureSQLGitHub.

yonzhan commented 4 years ago

sql

t-l-k commented 4 years ago

I would like to bump this issue, some insight into the behaviour would be useful, can you advise?

t-l-k commented 4 years ago

I just so happened to read an article recently at https://www.sqlshack.com/four-different-methods-to-copy-your-azure-sql-database/ - and the author Prashanth Jayaram claimed:

In this section, we will see the internals of the Azure Database copy operation.

  1. To copy the database in Azure uses geo-replication technology. Once replica seeding is complete, the geo-replication link is automatically terminated
  2. The geo-replication technology asynchronously replicates the committed transactions from the source database. The pre-requisites of using geo-replication is also applied to the database copy process
  3. A database copy is a process that refers to a transactional consistent snapshot of the source database
  4. The transactional consistent state defines the database that includes only the committed transaction stream after you initiate the database copy program

Note: By design, the Azure SQL Database by default provides two secondary copies of the database in the same data center. These secondary nodes are in sync with the primary copy of the database. All the read/write operations are performed on the primary copy. Additionally, the writes are replicated to the secondary copies

Okay so that is interesting, and certainly provides an avenue for further investigation, and interestingly is the note about two secondary copies by default - would these be used as a read-only source for the copy operation, so as not to affect the primary database?

t-l-k commented 3 years ago

I dived a little deeper, into the Azure CLI source and discovered a clue:

Copies a DB (i.e. create with 'Copy' create mode.)

That, in turn, I'm guessing, is calling Azure's REST endpoint, which in turn is calling the SQL instruction CREATE DATABASE ... AS COPY OF ....

Please let me know if this is the case, because then this documentation applies: Database Copies, and the ticket can be closed, with an action to perhaps clarify the command's behaviour in the documentation?

tkent commented 2 years ago

@yonzhan - would be great to get some confirmation of @t-l-k's investigation. This is something Microsoft should explain, not something users should have to try to guess at by picking through articles on the internet.

yonzhan commented 2 years ago

route to SQL service team to take a look

t-l-k commented 2 years ago

@yonzhan bump, any updates? Would be much appreciated.

t-l-k commented 2 years ago

I have had instances recently where using AzureSQL DB Copy's mechanism has negatively impacted performance of the source database, seemingly at the SQLClient layer. This really needs to be clearly explained! This is the best possible snippet I can that closely relates to the behaviour observed, from the Active geo-replication documentation:

If the primary is experiencing a heavy write workload, a geo-secondary with a lower compute size may not be able to keep up. That will cause replication lag on the geo-secondary, and may eventually cause unavailability of the geo-secondary. To mitigate these risks, active geo-replication will reduce (throttle) the primary's transaction log rate if necessary to allow its secondaries to catch up.

When a copied database is in state SEEDING, is it affecting the primary's transaction log rate? Or is it being replicated from a local secondary? Would it be possible to examine this further using the Monitoring geo-replication lag guidance?