simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0
9.06k stars 648 forks source link

What minimal SQLite version should Datasette support? #2349

Open asg017 opened 2 weeks ago

asg017 commented 2 weeks ago

Datasette core's SQLite usage is pretty tame: CREATE TABLE, simple selects, update/insert, etc. I don't think we need to use a lot of fancy new SQLite features. Except maybe UPSERt.

SQLite release history: https://www.sqlite.org/changes.html

Scrolling through and thinking about all the SQLite-related version bugs, the only thing I can think of is not having UPSERTs in older SQLite versions.

Security?

Could we say "SQLite version's below 3.XX have security issues so you must upgrade"? https://www.sqlite.org/cves.html

The only "real" recent SQLite CVE is maybe from 3.39.2 with a buffer overflow, but that's a C API thing and not exploitable by Datasette afaik

simonw commented 2 weeks ago

Had a great suggestion here: https://chaos.social/@djh/112594380456382194

I believe you could quickly check this by running docker images e.g.

docker run -it --rm debian:...
docker run -it --rm ubuntu:...

with different distributions and their releases and see what they ship!

This would allow you to unblock yourself without having to wait for folks to report in 🙌

Also: https://hachyderm.io/@zack/112594435807681224

And if you don't want to download a ton of image layers: pkgs.org/search/?q=sqlite

simonw commented 2 weeks ago

The biggest question to answer here is if there are any widely used distros that we need to support - like Red Hat Enterprise Linux - which ship with an older version of SQLite but still include a supported version of Python.

simonw commented 2 weeks ago

Decision: we're going to go with the first release that added window functions, and tell anyone with older versions to use pysqlite3.

3.25 (2018-09-15)

With the JSON extension enabled (it's enabled by default from 2022-02-22 (3.38.0))

simonw commented 2 weeks ago

Automated tests: let's test Mac an Ubuntu against all supported Python's against SQLite minimum version AND SQLite latest version.

simonw commented 2 weeks ago

Our earliest supported version can be downloaded from https://www.sqlite.org/2018/sqlite-amalgamation-3250300.zip

simonw commented 2 weeks ago

Following https://til.simonwillison.net/sqlite/sqlite-version-macos-python

Built the macOS dylib like this:

cd /tmp
wget 'https://www.sqlite.org/2018/sqlite-amalgamation-3250300.zip'
unzip sqlite-amalgamation-3250300.zip
cd sqlite-amalgamation-3250300
gcc -dynamiclib sqlite3.c -o libsqlite3.0.dylib -lm -lpthread \
  -DSQLITE_ENABLE_FTS3 \
  -DSQLITE_ENABLE_FTS3_PARENTHESIS \
  -DSQLITE_ENABLE_FTS4 \
  -DSQLITE_ENABLE_FTS5 \
  -DSQLITE_ENABLE_JSON1 \
  -DSQLITE_ENABLE_RTREE

Then run Datasette against it like this:

DYLD_LIBRARY_PATH=/tmp/sqlite-amalgamation-3250300 datasette --get /-/versions.json | jq

And got:

{
  "python": {
    "version": "3.10.14",
    "full": "3.10.14 (main, Mar 19 2024, 21:46:16) [Clang 15.0.0 (clang-1500.3.9.4)]"
  },
  "datasette": {
    "version": "1.0a13"
  },
  "asgi": "3.0",
  "uvicorn": "0.20.0",
  "sqlite": {
    "version": "3.25.3",
    "fts_versions": [
      "FTS5",
      "FTS4",
      "FTS3"
    ],
    "extensions": {
      "json1": null
    },
    "compile_options": [
      "COMPILER=clang-15.0.0",
      "ENABLE_FTS3",
      "ENABLE_FTS3_PARENTHESIS",
      "ENABLE_FTS4",
      "ENABLE_FTS5",
      "ENABLE_JSON1",
      "ENABLE_RTREE",
      "THREADSAFE=1"
    ]
  }
}
simonw commented 2 weeks ago

Here's the recipe to get that version of SQLite with Python in an Ubuntu Docker container. First save this as script.sh:

#!/bin/bash

export DEBIAN_FRONTEND=noninteractive

apt-get update && apt-get install -y python-is-python3 wget build-essential unzip

wget https://www.sqlite.org/2018/sqlite-amalgamation-3250300.zip
unzip sqlite-amalgamation-3250300.zip
cd sqlite-amalgamation-3250300

gcc -fPIC -shared -o libsqlite3.so sqlite3.c -lm -lpthread \
  -DSQLITE_ENABLE_FTS3 \
  -DSQLITE_ENABLE_FTS3_PARENTHESIS \
  -DSQLITE_ENABLE_FTS4 \
  -DSQLITE_ENABLE_FTS5 \
  -DSQLITE_ENABLE_JSON1 \
  -DSQLITE_ENABLE_RTREE

python -c 'import sqlite3; print(
  sqlite3.connect(":memory:").execute("select sqlite_version()").fetchall()
)'

Then run this (for an Apple Silicon device, hence the --platform linux/arm64/v8):

docker run --rm -it --platform linux/arm64/v8 -v $(pwd)/script.sh:/script.sh ubuntu bash -c 'chmod +x /script.sh && /script.sh; exec bash'

This drops into an interactive shell after the software has been installed.

I see this:

Get:1 http://ports.ubuntu.com/ubuntu-ports noble InRelease [256 kB]
Get:2 http://ports.ubuntu.com/ubuntu-ports noble-updates InRelease [126 kB]
Get:3 http://ports.ubuntu.com/ubuntu-ports noble-backports InRelease [126 kB]
...
Fetched 22.5 MB in 1min 36s (235 kB/s)                                                                                        
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  adduser binutils binutils-aarch64-linux-gnu binutils-common bzip2 ca-certificates cpp cpp-13 cpp-13-aarch64-linux-gnu
...
0 upgraded, 133 newly installed, 0 to remove and 6 not upgraded.
Need to get 99.6 MB of archives.
After this operation, 372 MB of additional disk space will be used.
...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
done.
--2024-06-13 22:12:21--  https://www.sqlite.org/2018/sqlite-amalgamation-3250300.zip
Resolving www.sqlite.org (www.sqlite.org)... 45.33.6.223, 2600:3c00::f03c:91ff:fe96:b959
Connecting to www.sqlite.org (www.sqlite.org)|45.33.6.223|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2271277 (2.2M) [application/zip]
Saving to: 'sqlite-amalgamation-3250300.zip'

sqlite-amalgamation-3250300.zip 100%[======================================================>]   2.17M  3.23MB/s    in 0.7s    

2024-06-13 22:12:22 (3.23 MB/s) - 'sqlite-amalgamation-3250300.zip' saved [2271277/2271277]

Archive:  sqlite-amalgamation-3250300.zip
   creating: sqlite-amalgamation-3250300/
  inflating: sqlite-amalgamation-3250300/sqlite3ext.h  
  inflating: sqlite-amalgamation-3250300/sqlite3.c  
  inflating: sqlite-amalgamation-3250300/sqlite3.h  
  inflating: sqlite-amalgamation-3250300/shell.c  
...
[('3.25.3',)]

It can take a few minutes though, so would be good to optimize this.

simonw commented 2 weeks ago

Then in the shell I need to do this:

LD_PRELOAD=sqlite-amalgamation-3250300/libsqlite3.so python -c 'import sqlite3; print(
  sqlite3.connect(":memory:").execute("select sqlite_version()").fetchall()
)'
simonw commented 2 weeks ago

Then to run the Datasette tests against that SQLite version inside the container:

apt-get install -y python3-pip git python3.12-venv
git clone https://github.com/simonw/datasette
cd datasette/
python -m venv venv
source venv/bin/activate
pip install -e '.[test]'
LD_PRELOAD=/sqlite-amalgamation-3250300/libsqlite3.so pytest tests/test_api_write.py  

Annoyingly the tests pass for me - refs #2356

platform linux -- Python 3.12.3, pytest-8.2.2, pluggy-1.5.0
SQLite: 3.25.3
rootdir: /datasette
configfile: pytest.ini
plugins: timeout-2.3.1, asyncio-0.23.7, anyio-4.4.0, xdist-3.6.1
asyncio: mode=Mode.STRICT
collected 114 items                                                                                                           

tests/test_api_write.py ............................................................................................... [ 83%]
.........XX........                                                                                                     [100%]

=============================================== 112 passed, 2 xpassed in 2.17s ================================================