coleifer / sqlite-web

Web-based SQLite database browser written in Python
MIT License
3.35k stars 331 forks source link

peewee.OperationalError: unknown tokenizer: mozporter when opening Thunderbirds global-messages.sqlite #138

Closed WolfgangFahl closed 9 months ago

WolfgangFahl commented 9 months ago

I am trying

sqlite_web global-messages-db.sqlite 

and get

  File "/Users/wf/Library/Python/3.10/lib/python/site-packages/peewee.py", line 3197, in execute_sql
    cursor.execute(sql, params or ())
peewee.OperationalError: unknown tokenizer: mozporter

i have searched the web for a solution but didn't find any.

https://sqlitebrowser.org/dl/ can open the database.

coleifer commented 9 months ago

Well without the full traceback this was really not helpful as to debugging where exactly sqlite-web is running into problems with the missing tokenizer. Even if sqlite-browser can open the database file, I'm dubious whether you can actually query the full-text index table without the custom tokenizer.

In order to query the full-text search tables, you'll need Mozilla's custom mozporter shared library. It doesn't come offered as a standalone extension, but it was fairly easy to compile one from the available sources: https://github.com/mozilla/releases-comm-central/tree/master/mailnews/extensions/fts3

To install the tokenizer automatically in a loadable extension, you'll want to modify the module registration function to register the tokenizer directly, e.g.

    /* no error checking, just an example */
    sqlite3_stmt *pStmt;
    const char zSql[] = "SELECT fts3_tokenizer(?, ?)";

    sqlite3_prepare_v2(db, zSql, -1, &pStmt, 0);

    const static sqlite3_tokenizer_module* module;
    sqlite3Fts3PorterTokenizerModule(&module);
    sqlite3_bind_text(pStmt, 1, "mozporter", -1, SQLITE_STATIC);
    sqlite3_bind_blob(pStmt, 2, &module, sizeof(module), SQLITE_STATIC);
    sqlite3_step(pStmt);
    sqlite3_finalize(pStmt);

I built such an extension locally and it works just fine:

sqlite> .load ./libmozporter
sqlite> CREATE VIRTUAL TABLE documents USING fts3(id, title, content, tokenize=mozporter);
sqlite> INSERT INTO documents ("title", "content") VALUES ('The title', 'This is the content of the message');
sqlite> SELECT title, rank(matchinfo(documents)) from documents where documents match 'contents';
The title|1.0

In order to get it working with the loadable extension support in sqlite-web I did need to make a modification to ensure extensions are loaded before introspection occurs (see: ed80c347ae1890607addf2af1c0c46f284e03a72).

Once that was done, though, I had no problem running sqlite-web with the loadable extension, and viewing a FTS table using the mozporter tokenizer:

im-1702481989060 im-1702482028044