dolthub / dolt

Dolt – Git for Data
Apache License 2.0
17.58k stars 498 forks source link

After Dolt CLI restore procedure database is not visible through the SQL client #7993

Closed awarycha closed 2 months ago

awarycha commented 2 months ago

Steps to reproduce:

Exception has occurred: CLR/MySqlConnector.MySqlException An unhandled exception of type 'MySqlConnector.MySqlException' occurred in Microsoft.EntityFrameworkCore.Relational.dll: 'can't create database mydb; database exists' at MySqlConnector.Core.ServerSession.<ReceiveReplyAsync>d__107.MoveNext() at System.Threading.Tasks.ValueTask1.get_Result() at System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable1.ConfiguredValueTaskAwaiter.GetResult() at MySqlConnector.Core.ResultSet.<ReadResultSetHeaderAsync>d__2.MoveNext() at MySqlConnector.MySqlDataReader.ActivateResultSet(CancellationToken cancellationToken) at MySqlConnector.MySqlDataReader.<InitAsync>d__111.MoveNext() at MySqlConnector.Core.CommandExecutor.<ExecuteReaderAsync>d__0.MoveNext() at System.Threading.Tasks.ValueTask1.get_Result() at MySqlConnector.MySqlCommand.d__78.MoveNext() at System.Threading.Tasks.ValueTask1.get_Result() at MySqlConnector.MySqlCommand.ExecuteNonQuery() at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteNonQuery(RelationalCommandParameterObject parameterObject) at Microsoft.EntityFrameworkCore.Migrations.MigrationCommand.ExecuteNonQuery(IRelationalConnection connection, IReadOnlyDictionary2 parameterValues) at Microsoft.EntityFrameworkCore.Migrations.Internal.MigrationCommandExecutor.ExecuteNonQuery(IEnumerable1 migrationCommands, IRelationalConnection connection) at Pomelo.EntityFrameworkCore.MySql.Storage.Internal.MySqlDatabaseCreator.Create() at Microsoft.EntityFrameworkCore.Storage.RelationalDatabaseCreator.EnsureCreated() at Microsoft.EntityFrameworkCore.Infrastructure.DatabaseFacade.EnsureCreated()

But if I do the following steps everything is ok

fulghum commented 2 months ago

Hi @awarycha, thanks for reporting this issue. I'm starting to work on reproducing this behavior and will keep you updated with what we find.

fulghum commented 2 months ago

The issue here is that while the dolt sql-server is running, the database directory is added from the command line (as part of the restore/clone) and the sql-server doesn't automatically notice that new database directory on disk. At startup, dolt scans the directories to find all the available databases, and after that, it never rescans to see if new database directories have been added – it assumes that all create database, drop database, etc, calls will go through the SQL engine. That explains why when the .NET app restarts, it tries to recreate the database (since show databases; says it doesn't exist), but then when Dolt goes to create that database directory, it hits an existing directory on the filesystem, so it returns that error about the database already existing. That also explains why when the dolt sql-server is restarted, the app works correctly, since the sql-server now knows about about the new database that was added.

The easiest way to prevent this from happening is to use the SQL interface to restore a database, instead of using the CLI to restore the database while the sql-server is running. Unfortunately, the dolt_backup() stored procedure doesn't currently support the restore subcommand (https://github.com/dolthub/dolt/issues/6074). I'll dig into that today and see if we can get that available for you to use.

Until then... if you are blocked by this, you can switch to use the dolt_clone() stored procedure temporarily to clone the database. Clone and restore are very similar operations, with the biggest difference being how the remotes for the new database are created. With clone, you'll get an origin remote that points to the backup location (which likely isn't what you want), whereas with restore, you'll get the actual remotes from the backed up database restored. However, if you aren't using Dolt remotes, then this difference shouldn't matter for you.

Here's an example of calling the dolt_clone() stored procedure to clone a database that I had previously backed up. I verified that I was able to restart my .NET app (which calls EnsureCreated at startup) after cloning a backed up database, without having to restart the sql-server.

drop database dolt;
Query OK, 1 row affected (0.02 sec)

show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| june11             |
| mysql              |
+--------------------+

call dolt_clone('file:///Users/jason/dolt-backups', 'dolt');
+--------+
| status |
+--------+
|      0 |
+--------+

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| dolt               |
| information_schema |
| june11             |
| mysql              |
+--------------------+
fulghum commented 2 months ago

I've got a PR open that adds support for the restore subcommand in the dolt_backup() stored procedure. When used this way (from the SQL interface, instead of the CLI), the running sql-server is aware of the restored database, and you won't see the issue where you needed to restart the sql-server before it could see the new database. We're going to try to get this into tonight's release just in case you want to check it out.

Additionally... most of the Dolt CLI commands will automatically forward to the SQL stored procedure implementations so that while a sql-server is running, you can still use the CLI commands, but the SQL engine will process them. dolt backup doesn't do that yet, so that's why when you used it from the CLI, the sql-server didn't stay in sync. Now that we have support for restore in the dolt_backup() stored procedure, we can take the next step to have dolt backup forward to the sql-server (https://github.com/dolthub/dolt/issues/7628). That will completely smooth this out for you – you'll be able to use dolt backup from the command line without having to restart the sql-server to see the changes.

I'm going to work on moving dolt backup to use the dolt_backup() stored procedure next, and will keep you updated with progress.

awarycha commented 2 months ago

Thank you for very detailed answer. Now when I know that restart procedure is completely good and safe then it is enough for me. I didn't test SQL backup procedure yet but it is also very interesting concept, I will test it in the near future for sure. Thanks.

fulghum commented 2 months ago

Glad to hear you have what you need! I'll go ahead and resolve this issue, and we'll use https://github.com/dolthub/dolt/issues/7628 to track the work for making dolt backup on the CLI automatically forward to call dolt_backup() on the SQL engine. Once we have that in place, the workflow you used where you can dolt backup on the CLI will forward to the running sql-server, and then the sql-server won't need to be manually restarted to see new databases show up.

Thanks for taking the time to report this issue to us. 🙏 Let us know if you run into any other snags and we'll be happy to help.