simonw / sqlite-utils

Python CLI utility and library for manipulating SQLite databases
https://sqlite-utils.datasette.io
Apache License 2.0
1.64k stars 111 forks source link

Automatic JSON de-serialization #612

Open rsyring opened 8 months ago

rsyring commented 8 months ago

My use case is that I'd like to more easily import and explore JSON documents returned from APIs. Obviously, the automatic type detection and table creation provided by sqlite-utils (SU) adds considerable productivity to that effort.

One of the difficulties I've encountered is that SU will serialize dicts/lists to JSON but doesn't deserialize it when retrieved. It turns out that with a bit of modification, it would not be hard to support this use case. This is what I came up with:

import json
import sqlite3

from sqlite_utils import db

db.COLUMN_TYPE_MAPPING['JSON'] = 'JSONB'

class Database(db.Database):
    def __init__(self, fpath, *args, **kwargs):
        conn = sqlite3.connect(fpath, detect_types=sqlite3.PARSE_DECLTYPES)
        sqlite3.register_converter('jsonb', self.json_loads)

        super().__init__(conn, *args, **kwargs)

    # Keep this on the class so it can be easily customized in a subclass
    def json_loads(self, val: bytes):
        return json.loads(val)

db = Database(':memory:')

db['users'].insert(
    {'id': 1, 'name': 'John Doe', 'preferences': {'theme': 'dark', 'language': 'en'}},
    columns={'preferences': 'JSON'},
    replace=True,
)

row = db['users'].get(1)
assert isinstance(row['preferences'], dict), row

This could all be made mostly automatic if SU:

  1. Column type detection used 'JSON' instead of 'TEXT' when detecting Python objects that will be jsonified
  2. sqlite3 is setup to handle JSON/JSONB conversion

Considerations:

  1. I'd like to see this made the default but until the next major version bump, could be hidden behind Database(..., jsonb_columns=True)
  2. Whether or not JSON or JSONB columns are used would depend on SQLite version.