pkulchenko / fullmoon

Fast and minimalistic Redbean-based Lua web framework in one file.
MIT License
684 stars 30 forks source link

Usage of makeStorage in requests #28

Closed mierenhoop closed 12 months ago

mierenhoop commented 1 year ago

Hi, I'm confused how makeStorage should be used. When creating a database instance with makeStorage in .init.lua a SQLite connection will be opened and kept open until redbean is stopped.

local fm = require"fullmoon"
local db = fm.makeStorage("test.db", "-- initialize...")

Now, using db in a route, the SQLite connection will be the same, even while the current process is forked off the main process.

fm.setRoute("/", function(r)
  db:execute(...) -- silently works but not valid?
  ...
end)

When adding <close> to db, a nice error message is shown when requesting /, because the database is closed before forking. Now we have to create a new storage for a new SQLite connection:

fm.setRoute("/", function(r)
  local newdb <close> = fm.makeStorage("test.db")
  newdb:execute(...) -- safe
  ...
end)

Is this how the database management layer should be used? While it is a working solution, I'd rather have it either

  1. Give an error when the database connection is used outside of the process it was created in (using unix.getpid?).
  2. Have the makeStorage instance decide for itself when it should create/close connections, again using the process id, freely allowing the user to access it anywhere.
  3. Have a method :clone() or similar:
    fm.setRoute("/", function(r)
    local newdb <close> = db:clone()
    end)

    All of these proposals will bring additional complexity, I can imagine that's not desired... Thanks.

pkulchenko commented 1 year ago

@mierenhoop, thank you for the good summary of the issue and the proposed solution. I actually have a working implementation for (2) on your list, but it has a couple of complications that stopped me from releasing it just yet:

  1. The implementation does exactly what you're describing, as it will re-open the connection if it's opened (originally) from a different pid, but it does it in all cases, even though it's not needed when the original connection is opened as read-only one. I don't have a good way to detect that through the available API, although I can probably add db_readonly method for that (to redbean).
  2. Closing connections create issues on windows (due to some interplay of redbean using cosmopolitan's implementation for sqlite3 and not using its Windows one). I have a simple test script that demonstrates the issue and have been trying to unsuccessfully identify the cause and fix it. See this SQLite forum discussion for a script and details.

Another option is to use a statement-level check and only reset if it's a non-readonly statement. This API also need to be added to redbean, but I can do both at the same time.

If you're interested, I may still push the implementation to a separate branch over the next couple of days to see if it helps with what you're trying to do.

mierenhoop commented 1 year ago

Ah, good to hear that you've considered solutions already. I'm not in a hurry for this feature, so I'd say add it whenever you're ready. Thanks.

pkulchenko commented 1 year ago

@mierenhoop, I pushed the current changes to this branch: https://github.com/pkulchenko/fullmoon/tree/forkable-storage

When adding to db, a nice error message is shown when requesting /, because the database is closed before forking.

I plan to add a check for a closed DB to reset it in this case; let me know if you think it's the expected behavior.

I also plan to only reset the connection if it's not readonly, but it requires a patch to redbean to add (I submitted a PR).

With all these changes in place the connection is going to be reset if (1) pid is different, but DB is not readonly or (2) DB is closed. I'm also thinking that it may be worth checking if the statement itself is read-only, but this will complicate the logic and may not be as effective, as it's likely that some of the queries do modify something and if not, you can always open a read-only connection, which doesn't need to be reset. I have to think about this one.

Scratch (2), as it's not needed to check for that: if pid is different, it's going to be reset anyway and if it's the same, then it doesn't need to be reset.

So, reset if pid is different and (DB is not readonly and (statement is a string or statement is not readonly)). This allows to use a read-only connection without resetting OR to use a regular connection with read-only (prepared) statements.

mierenhoop commented 1 year ago

The commits are looking great!

I plan to add a check for a closed DB to reset it in this case; let me know if you think it's the expected behavior.

I'm not sure about this, as it would be abstracting away the lifetime of database connections, but I suppose in the case of fullmoon the pros (not having to think about connections) outweighs the cons (having a connection open for too long?). I do think it's good practice to allow the user to manually handle connection lifetimes (my 3rd proposal), although by default handling them automatically.

So, reset if pid is different and (DB is not readonly and (statement is a string or statement is not readonly)). This allows to use a read-only connection without resetting OR to use a regular connection with read-only (prepared) statements.

Right, assuming a read-only connection can in fact be safely kept open (I don't know about SQLite internals).

Definitely looking forward to using fullmoon, thanks.

pkulchenko commented 1 year ago

Ok, slight update to the logic: (1) I'm adding a check for in-memory databases, as they shouldn't be reset (as they won't keep their content) and don't need to be reset (as there is no conflict) and (2) I'm removing a check for statements, as it requires a check for a statement to not be in a transaction, which complicates the checks (and requires yet another function not currently available in the binding). Here is the current logic: reset if pid is different and (not dbm.readonly or not dbm.readonly()) and dbm:db_filename() ~= "") (the last check is for in-memory or temp DB).

pkulchenko commented 1 year ago

@mierenhoop, I pushed several more changes to the branch along with some tests. I think this logic is complete. I'll test it a bit more tomorrow, but I think it should be good to go, so let me know if you get a chance to test it.

The redbean PR has been merged, so I plan to merge these changes too (although they should work with or without the PR, but will re-open even readonly DBs, since this can't be checked in older version of redbean).

mierenhoop commented 1 year ago

Having tested your changes, I'm pleased with how it works now. (As a side note, I've tried tracking when connections close: when a forked process is killed, sqlite's close will not be called but the file handle will be released by the OS. I'm not sure if that's something to worry about, especially not in WAL mode.)

pkulchenko commented 1 year ago

@mierenhoop, that's great to know, thank you for testing! I had a branch with one more change that actually assigned "close" to the __gc handler, but was hesitant to merge it. The forked connections should not be closed unless they came from the same process (which could otherwise mess up the locks), but I had a small bug in my earlier code that prevented this from working correctly. I suspect that adding this code back should now work properly.

I'll do a bit of testing locally and push one more change later today, which should allow those connections to be closed from forked processes. I suspect that the OS is going to handle it correctly, but would still prefer the connections to be closed as needed (whether by the close or by the gc handler).

pkulchenko commented 12 months ago

@mierenhoop, pushed two more changes, which should now close the DB connections using __gc handlers. It should be good to go, but I'm interested in your experience.

pkulchenko commented 12 months ago

when a forked process is killed, sqlite's close will not be called but the file handle will be released by the OS. I'm not sure if that's something to worry about, especially not in WAL mode.

When the process is killed, there is not much that can be done, but I'd expect the connections to be closed now when the process is (properly) terminated, as the Lua engine will call __gc handlers.

mierenhoop commented 12 months ago

I'd expect the connections to be closed now when the process is (properly) terminated, as the Lua engine will call __gc handlers.

After some testing I did not see any connections close from forked processes, this might not happen because Redbean does not call lua_close from a forked process. When adding lua_close to forked processes in Redbean, __gc will be called and the connections will be closed as expected. I'm curious to know whether this has to be added to Redbean or if it can be worked around in fullmoon...

pkulchenko commented 12 months ago

That's interesting; I wonder if ExitWorker function in redbean should be closing the Lua state (or call LuaDestroy). Are you compiling your own redbean code? I can provide a patch to test.

In terms of doing this in fullmoon, I can add os.exit(true, true) to OnWorkerStop hook, but it will exit the process without allowing redbean to finish a small chunk of what's left in ExitWorker, so I'd prefer it to be done in redbean itself.

Something like this may work:

diff --git a/tool/net/redbean.c b/tool/net/redbean.c
index d6e330a10..c08e0de07 100644
--- a/tool/net/redbean.c
+++ b/tool/net/redbean.c
@@ -6556,6 +6556,7 @@ static int ExitWorker(void) {
       monitorth = 0;
     }
   }
+  LuaDestroy();
   _Exit(0);
 }
mierenhoop commented 12 months ago

In order to test the execution paths I applied the following patches:

diff --git a/tool/net/lsqlite3.c b/tool/net/lsqlite3.c
index e3d9bfda3..cdc91be2e 100644
--- a/tool/net/lsqlite3.c
+++ b/tool/net/lsqlite3.c
@@ -665,6 +665,7 @@ static int cleanupdb(lua_State *L, sdb *db) {

     /* close database; _v2 is intended for use with garbage collected languages
        and where the order in which destructors are called is arbitrary. */
+    puts("Closing");
     result = sqlite3_close_v2(db->db);
     db->db = NULL;

diff --git a/tool/net/redbean.c b/tool/net/redbean.c
index d6e330a10..c08e0de07 100644
--- a/tool/net/redbean.c
+++ b/tool/net/redbean.c
@@ -6556,6 +6556,7 @@ static int ExitWorker(void) {
       monitorth = 0;
     }
   }
+  LuaDestroy();
   _Exit(0);
 }
diff --git a/fullmoon.lua b/fullmoon.lua
index 5baab2c..07b6e49 100644
--- a/fullmoon.lua
+++ b/fullmoon.lua
@@ -671,8 +671,8 @@ local dbmt = { -- share one metatable among all DBM objects
     local db = rawget(t, "db")
     return db and db[k] and function(self,...) return db[k](db,...) end or nil
   end,
-  __gc = function(t) return t:close() end,
-  __close = function(t) return t:close() end
+  __gc = function(t) print"gc" return t:close() end,
+  __close = function(t) print"scope" return t:close() end
 }
 local function makeStorage(dbname, sqlsetup, opts)
   sqlite3 = sqlite3 or require "lsqlite3"
@@ -719,6 +719,7 @@ local function makeStorage(dbname, sqlsetup, opts)

     local skipexec = db == false
     local code, msg
+    print("open", unix.getpid())
     db, code, msg = sqlite3.open(self.name, flags)
     if not db then error(("%s (code: %d)"):format(msg, code)) end
     -- __gc handler on the DB object will close it, which can happen multiple times
@@ -759,6 +760,7 @@ local function makeStorage(dbname, sqlsetup, opts)
     local db = self.db
     if db and self.pid == unix.getpid() then
       self.db = false -- mark it as re-openable
+      print("close", unix.getpid())
       return db:close()
     end
   end

With the Redbean patch applied, the connections closed on worker-process exit. As for exiting in OnWorkerStop, maybe instead we can remove references to the to-be-gc'd connections and afterwards collectgarbage inside OnWorkerStop. However it's probably best to change behavior in Redbean itself.

pkulchenko commented 12 months ago

As for exiting in OnWorkerStop, maybe instead we can remove references to the to-be-gc'd connections and afterwards collectgarbage inside OnWorkerStop.

Possible, but not straightforward and requires explicit tracking.

However it's probably best to change behavior in Redbean itself.

Sounds good; I submitted a redbean PR for this.

mierenhoop commented 12 months ago

Great! With the PR, this issue will be completely solved, thanks for your help.

pkulchenko commented 12 months ago

@mierenhoop, the redbean PR has been merged, so I'll be merging this one as well.

Thanks for the typo fixes; I merged your commit as well!