BrentOzarULTD / SQL-Server-First-Responder-Kit

sp_Blitz, sp_BlitzCache, sp_BlitzFirst, sp_BlitzIndex, and other SQL Server scripts for health checks and performance tuning.
http://FirstResponderKit.org

Enable sp_DatabaseRestore to use Blob Storage #3572

Open · gdoddsy opened this issue 2 months ago

gdoddsy commented 2 months ago

Is your feature request related to a problem? Please describe. I have a need to restore data from Azure Blob storage (without copying locally first). The backups were created with Ola's script directly to a file share, and then transferred to Blob Storage. The files maintain their current path (i.e. {databaseName}/{full|diff|log}/{original filename}) in the storage container.

Describe the solution you'd like Be able to pass sp_DatabaseRestore the location of the backups and an access key, and have it restore directly from the cloud.

Describe alternatives you've considered Downloading the backups to a file share first, then restoring from there.

Are you ready to build the code for the feature? I've got this in development and should have a pull request soon.

It will have limitations:

  1. The SQL Server will need to be able to run PowerShell, and the Az.Storage module will need to be available to install and import. I don't know how people will feel about this; I have an alternative below.
  2. The user will need to create and provide a shared access key so PowerShell can query the blob storage.
  3. The database server will need a credential already created with access to the container to do the restore.
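
For point 3, the credential setup would presumably follow the documented pattern for restoring from URL with a shared access signature. This is a sketch only; the account, container, and token values are placeholders:

```sql
-- Sketch: a SAS-based credential must be named after the container URL,
-- and the SECRET must omit the leading '?' of the SAS token.
CREATE CREDENTIAL [https://yourstorageaccount.blob.core.windows.net/yourcontainer]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET   = 'sv=2022-11-02&ss=b&srt=co&sp=rl&sig=...';
```

With that in place, `RESTORE DATABASE ... FROM URL` can read blobs in the container without any further credential parameters.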

The main issue I'm facing is getting the list of files to potentially restore. I essentially need two different ways to access Azure Storage: first for PowerShell to list the files, then for SQL Server to restore them. The second one I'm OK with; it's a SQL credential, and it works the way Microsoft seems to suggest. With this approach, we'd need to add the following parameters to the script:

EXEC sp_DatabaseRestore @Database = N'Database',                -- nvarchar(128)
                        @BackupPathFull = N'Database/Full',     -- nvarchar(260)
                        @BackupPathDiff = N'Database/Diff',     -- nvarchar(260)
                        @RunCheckDB = 0,                        -- bit
                        @RestoreDiff = 1,                       -- bit
                        @RunRecovery = 1,                       -- bit
                        @StopAt = N'20240921000000',            -- nvarchar(14)
                        @DatabaseOwner = 'sa',                  -- sysname
                        @Debug = 0,
                        @Execute = 'Y',
                        @MoveFiles = 1,
                        @MoveDataDrive = 'C:\Database\Data',
                        @MoveLogDrive = 'C:\Database\Logs',
--New params:
                        @BackupSource = 'URL',
                        @StorageAccountName = 'YourStorageAccount',
                        @StorageContainerName = 'YourContainer',
                        @StorageAccountKey = 'ProvideAKey';

This allows SQL to import and run PowerShell commands to query the Storage Container and retrieve a list of files that match the pattern in @BackupPathFull (and diffs and logs).
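
The listing step might look something like the following inside the proc. This is only a sketch: it assumes `xp_cmdshell` is enabled and the Az.Storage module is installed on the host, and all account, container, and path names are placeholders matching the example parameters above:

```sql
-- Sketch: shell out to PowerShell to enumerate blobs matching the
-- @BackupPathFull prefix, capturing the names into a temp table.
DECLARE @Cmd nvarchar(4000) = N'powershell.exe -NoProfile -Command "'
    + N'$ctx = New-AzStorageContext -StorageAccountName ''YourStorageAccount'''
    + N' -StorageAccountKey ''ProvideAKey'';'
    + N' Get-AzStorageBlob -Container ''YourContainer'' -Prefix ''Database/Full'''
    + N' -Context $ctx | Select-Object -ExpandProperty Name"';

IF OBJECT_ID('tempdb..#Files') IS NOT NULL DROP TABLE #Files;
CREATE TABLE #Files (BackupFile nvarchar(4000));

INSERT INTO #Files
EXEC master.sys.xp_cmdshell @Cmd;
```

`New-AzStorageContext` and `Get-AzStorageBlob` are the standard Az.Storage cmdlets for this; the access problems described later in this thread are exactly the kind of thing this approach runs into in locked-down environments.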

Option 2 is to use the comma-separated abilities of @BackupPathFull (and the diff and log equivalents). You do the work first to retrieve the list and pass the string in; we then leverage the split and do the restore, which only requires a SQL credential to be created.
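
Under option 2, the split inside the proc could be as simple as the sketch below (the URLs are hypothetical; `STRING_SPLIT` also requires compatibility level 130 or higher):

```sql
-- Sketch: caller passes a comma-separated list of backup URLs,
-- which the proc splits into one row per file to restore.
DECLARE @BackupPathFull nvarchar(max) =
      N'https://acct.blob.core.windows.net/ctr/Database/FULL/f1.bak,'
    + N'https://acct.blob.core.windows.net/ctr/Database/FULL/f2.bak';

SELECT LTRIM(RTRIM(value)) AS BackupFile
FROM STRING_SPLIT(@BackupPathFull, ',');
```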

I'm in favour of the first option as it makes my life simpler, and I will create the pull request that way, but I'm open to feedback if that's going to be a problem (users can always ignore this feature if it doesn't fit their environment).

BrentOzar commented 1 month ago

I'm totally fine with the first option because I think the use case is so limited today. So few folks are asking to do it, so ANY way that we support it is probably good. I won't be able to help with the code, so I wanna empower you to be able to do it the fastest, easiest way you can, and in a way that you can support if questions come up. Sounds good!

gdoddsy commented 1 month ago

I hit some more problems. The current plan is to leave it up to the user to pass in a full list of files along with the folder paths. I ran into too many problems trying to execute the commands from within SQL Server, mostly access problems in my production environment.

I'll be back in a week or 2 to fix it.

gdoddsy commented 2 days ago

So, it's been more than a week. I hit some problems and then had other priorities, but I'm back on this now. The issue I've got is that I want to keep the PowerShell simple and return all files that could be used for restoring, but once you've got months of 5-minute log backup files, this becomes a super long string to pass into the SQL parameter.

Which brings me to, how do you feel about OPENJSON being in the stored proc?
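
The idea being that the PowerShell step would emit a JSON array of file names, which OPENJSON can shred into rows inside the proc. A minimal sketch, with hypothetical file names:

```sql
-- Sketch: parse a JSON array of backup file names into a result set.
-- For an array of strings, OPENJSON returns one row per element,
-- with the element itself in the [value] column.
DECLARE @FileList nvarchar(max) = N'[
    "Database/LOG/Database_20240921_000000.trn",
    "Database/LOG/Database_20240921_000500.trn"
]';

SELECT [value] AS BackupFile
FROM OPENJSON(@FileList);
```

JSON sidesteps the delimiter-escaping issues a plain comma-separated string would have with months of log backup names.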

BrentOzar commented 2 days ago

I don't have a problem with OPENJSON. Looks like it's compat level 130 and higher, which is fine.