Open gwk opened 1 year ago
Your comment introduced me to this issue in sqlite and to the ctypes
module - thanks!
I also hope that the datasette developers will enable this mode in a test environment [...] perhaps we could figure out how to invoke it using
ctypes
I'm not a Datasette developer, but I am curious to learn more about getting unholy access to the sqlite C APIs inside of Datasette. (Such access could also help #1293, and if done without grovelling inside of pysqlite's Connection object for the db handle, could even be relatively safe.)
I experimented a bit. I came up with https://gist.github.com/cldellow/85bba507c314b127f85563869cd94820
If you run python3 enable-strict-quoting-sqlite3.py
, it seems to set those flags correctly -- SELECT "foo"
fails where it would normally succeed.
But if you put it in a plugins/
dir and run datasette --plugins-dir plugins/
, it segfaults when it tries to call sqlite3_db_config
on the connections created by Datasette.
I am... confused. I'm pretty sure I'm using the same python and the same libsqlite3 in both scenarios, so I would expect it to work.
@gwk do you know anything that might help me debug the segfault? I gather that my approach of going grovelling inside of a PyObject
is particularly dangerous, but I was thinking (a) it's necessary in order to test Datasette's use of the sqlite3 library and (b) even if it's not portable, it'd be good enough for running the tests on a single machine.
@cldellow glad to hear you tried it, as I got grossed out by my own suggestion ;) If you are on macOS I do have one trick for debugging segfaults using lldb.
I'm on Ubuntu, unfortunately. :( Would it still be relevant?
I think I've narrowed things down a bit more.
Even sqlite3_free(sqlite3_malloc(128))
segfaults -- this suggests to me that it's something about the sqlite3 library that was loaded, vs, say, getting the wrong db handle when I go spelunking in the Connection object.
Aha, it's user error on my part.
Adding
sqlite3_db_config.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_int, ctypes.c_int]
makes it work reliably both on the CLI and from datasette, and now I can reproduce the errors you mentioned in the issue description.
I have linked Python3.11 on macOS against recent SQLite that was compiled using
-DSQLITE_DQS=0
. This option disables interpretation of double-quoted identifiers as string literals, described in the SQLite docs as a "MySQL 3.x misfeature". See https://www.sqlite.org/quirks.html#dblquote for background.Datasette uses the double-quote syntax in a number of key places, and is thus completely broken in this environment.
My experience was to
pip install datasette
, then rundatasette serve -I my-data.db
. When I visithttp://127.0.0.1:8001
I get a 500 response.The error:
sqlite3.OperationalError: no such column: geometry_columns
The responsible SQL:
'select 1 from sqlite_master where tbl_name = "geometry_columns"'
I then installed datasette from GitHub master in development mode and changed the offending SQL to use correct quotes:
"select 1 from sqlite_master where tbl_name = 'geometry_columns'"
.With this change, I get a little further, but have the same problem with the first table name in my database (in my case, "Meta"):
I will try to continue playing with this, but I also hope that the datasette developers will enable this mode in a test environment as I am unlikely to be able to exercise all of the SQL in the codebase, or make a pull request very soon.
Note that the DQS setting compile-time option can be overridden at runtime with calls to the C API:
As far as I can tell,
sqlite3_db_config
is not exposed in Python, but perhaps we could figure out how to invoke it usingctypes
.