splitgraph / seafowl

Analytical database for data-driven Web applications 🪶
https://seafowl.io
Apache License 2.0
418 stars 11 forks source link

Fix SQLite race condition causing new DBs/schemas to not be immediately available after a HTTP 200 #372

Closed mildbyte closed 1 year ago

mildbyte commented 1 year ago

With an on-disk SQLite table, this rename sometimes doesn't take effect by the time we select from test2.test. This doesn't happen with an in-memory SQLite:

$ node ../seafowl/examples/clients/node/seafowl-client.js -f qall.sql
code: 200
[
    {
        "table_catalog": "default",
        "table_name": "test",
        "table_schema": "test1",                    <============== still in test1 (the original schema)
        "table_type": "BASE TABLE"
    },
    {
        "table_catalog": "default",
        "table_name": "table_versions",
        "table_schema": "system",
        "table_type": "VIEW"
    },
    {
        "table_catalog": "default",
        "table_name": "table_partitions",
        "table_schema": "system",
        "table_type": "VIEW"
    },
    {
        "table_catalog": "default",
        "table_name": "tables",
        "table_schema": "information_schema",
        "table_type": "VIEW"
    },
    {
        "table_catalog": "default",
        "table_name": "views",
        "table_schema": "information_schema",
        "table_type": "VIEW"
    },
    {
        "table_catalog": "default",
        "table_name": "columns",
        "table_schema": "information_schema",
        "table_type": "VIEW"
    },
    {
        "table_catalog": "default",
        "table_name": "df_settings",
        "table_schema": "information_schema",
        "table_type": "VIEW"
    }
]
 ~/seafowl-repro $ cat qall.sql 
CREATE SCHEMA test1;
CREATE SCHEMA test2;
CREATE TABLE test1.test (col_1 INTEGER, col_2 VARCHAR);
ALTER TABLE test1.test RENAME TO test2.test;
SELECT * FROM information_schema.tables;
--SELECT * FROM test2.test;

Causing this to fail:

$ cat qall.sql 
CREATE SCHEMA test1;
CREATE SCHEMA test2;
CREATE TABLE test1.test (col_1 INTEGER, col_2 VARCHAR);
ALTER TABLE test1.test RENAME TO test2.test;
SELECT * FROM test2.test;

With a write statement between the rename and the select, this starts working again:

 ~/seafowl-repro $ node ../seafowl/examples/clients/node/seafowl-client.js -f qall.sql
code: 200
 ~/seafowl-repro $ cat qall.sql 
CREATE SCHEMA test1;
CREATE SCHEMA test2;
CREATE TABLE test1.test (col_1 INTEGER, col_2 VARCHAR);
ALTER TABLE test1.test RENAME TO test2.test;
CREATE TABLE test1.dummy(col_1 INTEGER);  -- << if i run this write statement, the next line shows the new schema correctly
--SELECT * FROM information_schema.tables;
SELECT * FROM test2.test;

We also noticed the SQLite DB is open twice:

$ sudo lsof -p 1111165 | grep sqlite
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
lsof: WARNING: can't stat() fuse.portal file system /run/user/1000/doc
      Output information may be incomplete.
seafowl 1111165 mildbyte  mem-r     REG                8,1    32768  4691303 /home/mildbyte/seafowl-repro/seafowl-data/seafowl.sqlite-shm
seafowl 1111165 mildbyte    9ur     REG                8,1     4096  4691300 /home/mildbyte/seafowl-repro/seafowl-data/seafowl.sqlite
seafowl 1111165 mildbyte   10u      REG                8,1   164832  4691301 /home/mildbyte/seafowl-repro/seafowl-data/seafowl.sqlite-wal
seafowl 1111165 mildbyte   11ur     REG                8,1    32768  4691303 /home/mildbyte/seafowl-repro/seafowl-data/seafowl.sqlite-shm
seafowl 1111165 mildbyte   12ur     REG                8,1     4096  4691300 /home/mildbyte/seafowl-repro/seafowl-data/seafowl.sqlite
seafowl 1111165 mildbyte   13u      REG                8,1   164832  4691301 /home/mildbyte/seafowl-repro/seafowl-data/seafowl.sqlite-wal

We might be able to solve it by only using one SQLite connection (it doesn't matter for SQLite anyway):

diff --git a/src/repository/sqlite.rs b/src/repository/sqlite.rs
index bfbd2b9..ef124b1 100644
--- a/src/repository/sqlite.rs
+++ b/src/repository/sqlite.rs
@@ -72,7 +72,7 @@ impl SqliteRepository {
             .create_if_missing(true)
             .journal_mode(journal_mode);

-        let pool = SqlitePoolOptions::new().connect_with(options).await?;
+        let pool = SqlitePoolOptions::new().max_connections(1).connect_with(options).await?;
         let repo = Self { executor: pool };
         repo.setup().await;
         Ok(repo)
gruuya commented 1 year ago

The minimal fix above has been implemented in #367, so closing this for now.