Wire protocol limitations as a backend for the C client

cole-miller commented 1 year ago

The current wire protocol has a four-way distinction for requests that invoke SQL statements:

QUERY
QUERY_SQL
EXEC
EXEC_SQL

EXEC and EXEC_SQL requests cannot return rows, and they're implemented differently by the server. I believe this split is inspired by the DB and Stmt interfaces in the Go stdlib's database/sql module, and it works well as a backend for implementing those interfaces (see the driver module in go-dqlite). But as a backend for implementing a C client library that resembles the sqlite3 API, they have some problems. In particular, consider the problem of implementing these functions:

int dqlite_prepare(dqlite *db, const char *sql, dqlite_stmt **stmt);
int dqlite_step(dqlite_stmt *stmt);

Presumably dqlite_prepare should be implemented by sending a PREPARE request and storing the statement ID in the dqlite_stmt object. Then the first call to dqlite_step needs to send a request to execute the prepared statement -- but the client doesn't have enough information to determine whether the statement will yield any rows. The conservative thing to do is to send QUERY and handle the possibility of zero rows being returned, but the server code is not currently set up to handle something like QUERY("CREATE TABLE foo (n INT)"):

https://github.com/canonical/dqlite/blob/e444efc69516e1adf0abc49dec58ab06d0db371c/src/query.c#L100

@MathieuBordere suggested that we should just add a new all-purpose request type that supports returning rows but without the baked-in assumptions of QUERY, which seems reasonable to me. The alternative would be to try to relax the implementation of QUERY; that would need careful testing to make sure we don't break anything that go-dqlite depends on.

cole-miller commented 1 year ago

Specifically, running (in pseudocode):

stmt = PREPARE("CREATE TABLE foo (n INT)");
QUERY(stmt);

works fine, but afterward attempting to prepare a statement like SELECT * FROM foo will fail with "no such table". I haven't figured out why this is -- sqlite3_step does get called on the CREATE statement and returns SQLITE_DONE, which should also be what happens when running an EXEC request.

cole-miller commented 1 year ago

Oh, it's because the QUERY codepath doesn't do a VfsPoll to propagate any changes to the database.

MathieuBordere commented 1 year ago

Oh, it's because the QUERY codepath doesn't do a VfsPoll to propagate any changes to the database.

Yeah, it's "read only" and doesn't go through raft.

cole-miller commented 1 year ago

I think it should be possible to implement the new request (pre-bikeshed name: GENEXEC) to reuse the existing code in leaderExecV2 and query__batch -- experimenting with that now.

freeekanayaka commented 1 year ago

One other option that might be a bit easier both for the client and for the server is to modify the response the PREPARE request adding a flag indicating if the statement is read-only or not. That flag can be stored in the dqlite_stmt object along with the statement ID, and subsequent calls to dqlite_step will invoke the correct API (either QUERY or EXEC) based on that flag.

The sqlite3_stmt_readonly SQLite function can be used by the server when processing the PREPARE request to determine the value of the flag to include in the response.

YMMV

canonical / dqlite

Wire protocol limitations as a backend for the C client #471