simonw / datasette-app-support

Part of https://github.com/simonw/datasette-app
4 stars 2 forks source link

API endpoints in `datasette-app-support` for dump/restore of temporary in-memory database #9

Closed simonw closed 3 years ago

simonw commented 3 years ago

This is going to need some new API endpoints in datasette-app-support:

I'll leave it to the Electron app to define the file - I thought about using tempfile in Python but I don't think that will survive the server restart.

Originally posted by @simonw in https://github.com/simonw/datasette-app/issues/42#issuecomment-915770259

simonw commented 3 years ago

I go this far:

diff --git a/datasette_app_support/__init__.py b/datasette_app_support/__init__.py
index 3f99646..f6f9902 100644
--- a/datasette_app_support/__init__.py
+++ b/datasette_app_support/__init__.py
@@ -201,6 +201,20 @@ async def dump_temporary_to_file(request, datasette):
     return Response.json({"ok": True, "path": filepath})

+async def restore_temporary_from_file(request, datasette):
+    if not check_auth(request):
+        return unauthorized
+    try:
+        filepath = await _filepath_from_json_body(request)
+    except PathError as e:
+        return Response.json({"ok": False, "error": e.message}, status=400)
+    datasette.get_database("temporary")
+    backup_db = sqlite3.connect(filepath, uri=True, check_same_thread=False)
+    backup_db.isolation_level = None
+    backup_db.execute("vacuum into 'file:temporary?mode=memory&cache=shared'")
+    return Response.json({"ok": True, "path": filepath})
+
+
 @hookimpl
 def register_routes():
     return [
@@ -210,4 +224,5 @@ def register_routes():
         (r"^/-/import-csv-file$", import_csv_file),
         (r"^/-/auth-app-user$", auth_app_user),
         (r"^/-/dump-temporary-to-file$", dump_temporary_to_file),
+        (r"^/-/restore-temporary-from-file$", restore_temporary_from_file),
     ]
diff --git a/tests/test_dump_restore_temporary.py b/tests/test_dump_restore_temporary.py
index c1b896d..558672a 100644
--- a/tests/test_dump_restore_temporary.py
+++ b/tests/test_dump_restore_temporary.py
@@ -4,7 +4,7 @@ import sqlite3

 @pytest.mark.asyncio
-async def test_dump_restore_temporary(tmpdir):
+async def test_dump_temporary_to_file(tmpdir):
     datasette = Datasette([], memory=True)
     await datasette.invoke_startup()
     # Import CSV into temporary
@@ -29,3 +29,31 @@ async def test_dump_restore_temporary(tmpdir):
     # Check that the backup file has the right stuff
     conn = sqlite3.connect(backup_path)
     assert conn.execute("select * from backup_demo").fetchall() == [("123", "Hello")]
+
+
+@pytest.mark.asyncio
+async def test_restore_temporary_from_file(tmpdir):
+    datasette = Datasette([], memory=True)
+    await datasette.invoke_startup()
+    # Populate backup database
+    backup_path = str(tmpdir / "backup.db")
+    conn = sqlite3.connect(backup_path)
+    conn.execute("create table restored (id integer primary key)")
+    conn.execute("insert into restored (id) values (1)")
+    conn.execute("insert into restored (id) values (2)")
+    conn.execute("insert into restored (id) values (3)")
+    response = await datasette.client.get("/temporary/restored")
+    assert response.status_code == 404
+    # Restore it
+    response = await datasette.client.post(
+        "/-/restore-temporary-from-file",
+        json={"path": backup_path},
+        headers={"Authorization": "Bearer fake-token"},
+        allow_redirects=False,
+    )
+    assert response.status_code == 200
+    assert response.json() == {"ok": True, "path": backup_path}
+    # Check the restore
+    response2 = await datasette.client.get("/temporary/restored.json?_shape=array")
+    assert response2.status_code == 404
+    assert response2.json() == {}

But when I run the tests I get this failure:

  File "/Users/simon/Dropbox/Development/datasette-app-support/datasette_app_support/__init__.py", line 214, in restore_temporary_from_file
    backup_db.execute("vacuum into 'file:temporary?mode=memory&cache=shared'")
MemoryError
============================================== short test summary info ==============================================
FAILED tests/test_dump_restore_temporary.py::test_restore_temporary_from_file - assert 500 == 200
simonw commented 3 years ago

Hunch: maybe that vacuum into 'file:temporary?mode=memory&cache=shared' is failing because the named-memory database temporary has already been manipulated by other tests?

Solution might be to create a brand new empty in-memory database with a different name, restore into it, then remove temporary from the datasette.databases list and replace it with that new one instead.

simonw commented 3 years ago

Possibly relevant Stack Overflow: https://stackoverflow.com/questions/58315943/system-data-sqlite-vacuum-into-an-in-memory-database

simonw commented 3 years ago

Since this won't ever run on Python 3.6 I can try the new backup() method from Python 3.7: https://docs.python.org/3/library/sqlite3.html#sqlite3.Connection.backup

simonw commented 3 years ago

This is the weirdest thing... I tried using the backup() method instead and it copied the schema of the database but not the rows for some reason!

diff --git a/datasette_app_support/__init__.py b/datasette_app_support/__init__.py
index 3f99646..ed78962 100644
--- a/datasette_app_support/__init__.py
+++ b/datasette_app_support/__init__.py
@@ -201,6 +201,35 @@ async def dump_temporary_to_file(request, datasette):
     return Response.json({"ok": True, "path": filepath})

+async def restore_temporary_from_file(request, datasette):
+    if not check_auth(request):
+        return unauthorized
+    try:
+        filepath = await _filepath_from_json_body(request)
+    except PathError as e:
+        return Response.json({"ok": False, "error": e.message}, status=400)
+    temporary = datasette.get_database("temporary")
+    backup_db = sqlite3.connect(filepath, uri=True)
+    temporary_conn = temporary.connect(write=True)
+    backup_db.backup(temporary_conn, name="main")
+    backup_tables = [
+        r[0]
+        for r in backup_db.execute(
+            "select name from sqlite_master where type='table'"
+        ).fetchall()
+    ]
+    temporary_conn.close()
+    backup_db.close()
+    return Response.json(
+        {
+            "ok": True,
+            "path": filepath,
+            "restored_tables": await temporary.table_names(),
+            "backup_tables": backup_tables,
+        }
+    )
+
+
 @hookimpl
 def register_routes():
     return [
@@ -210,4 +239,5 @@ def register_routes():
         (r"^/-/import-csv-file$", import_csv_file),
         (r"^/-/auth-app-user$", auth_app_user),
         (r"^/-/dump-temporary-to-file$", dump_temporary_to_file),
+        (r"^/-/restore-temporary-from-file$", restore_temporary_from_file),
     ]
diff --git a/tests/test_dump_restore_temporary.py b/tests/test_dump_restore_temporary.py
index c1b896d..7f61060 100644
--- a/tests/test_dump_restore_temporary.py
+++ b/tests/test_dump_restore_temporary.py
@@ -1,10 +1,10 @@
 from datasette.app import Datasette
+from datasette.utils import sqlite3
 import pytest
-import sqlite3

 @pytest.mark.asyncio
-async def test_dump_restore_temporary(tmpdir):
+async def test_dump_temporary_to_file(tmpdir):
     datasette = Datasette([], memory=True)
     await datasette.invoke_startup()
     # Import CSV into temporary
@@ -29,3 +29,37 @@ async def test_dump_restore_temporary(tmpdir):
     # Check that the backup file has the right stuff
     conn = sqlite3.connect(backup_path)
     assert conn.execute("select * from backup_demo").fetchall() == [("123", "Hello")]
+
+
+@pytest.mark.asyncio
+async def test_restore_temporary_from_file(tmpdir):
+    datasette = Datasette([], memory=True)
+    await datasette.invoke_startup()
+    # Populate backup database
+    backup_path = str(tmpdir / "backup.db")
+    conn = sqlite3.connect(backup_path)
+    conn.execute("create table backup_restored (id integer primary key)")
+    conn.execute("insert into backup_restored (id) values (1)")
+    conn.execute("insert into backup_restored (id) values (2)")
+    conn.execute("insert into backup_restored (id) values (3)")
+    conn.close()
+    response = await datasette.client.get("/temporary/backup_restored")
+    assert response.status_code == 404
+    # Restore it
+    response = await datasette.client.post(
+        "/-/restore-temporary-from-file",
+        json={"path": backup_path},
+        headers={"Authorization": "Bearer fake-token"},
+        allow_redirects=False,
+    )
+    assert response.status_code == 200
+    assert response.json() == {
+        "ok": True,
+        "path": backup_path,
+        "restored_tables": ["backup_restored"],
+        "backup_tables": ["backup_restored"],
+    }
+    # Check the restore
+    response2 = await datasette.client.get("/temporary/backup_restored.json?_shape=array")
+    assert response2.status_code == 200
+    assert response2.json() == None

That last line deliberately fails but shows that /temporary/backup_restored.json?_shape=array returns [] when in should be returning an array of three items.

simonw commented 3 years ago

Testing manually works fine, it's just the test that is failing.

In one terminal:

DATASETTE_API_TOKEN=xyz datasette . -p 8006 

Then create /tmp/restoreme.db like this:

echo "create table foo (id integer primary key); insert into foo (id) values (1);" | sqlite3 /tmp/restoreme.db

Then:

curl -X POST http://localhost:8006/-/restore-temporary-from-file \
  --data '{"path": "/tmp/restoreme.db"}' -H 'authorization: Bearer xyz'

At which point http://127.0.0.1:8006/temporary/foo shows the restored data.

simonw commented 3 years ago

I've spent enough time on this already. I'm going to have the automated test check for the table but not for its contents - if this causes problems in the future I can address that in a new issue.