simonw / sqlite-utils

Python CLI utility and library for manipulating SQLite databases
https://sqlite-utils.datasette.io
Apache License 2.0
1.58k stars 106 forks source link

Enable FTS for attached dbs #609

Open petergaultney opened 6 months ago

petergaultney commented 6 months ago

This is just a work-in-progress, although it does:

All tests pass locally, as well as black and mypy.

You could call this a proof-of-concept; I think there probably would need to be discussions about some of my refactors around triggers, counts, etc, to decide whether we want to take a similar approach to those (modify the 'main' database rather than the attached ones, under the reasoning that someone attaching a database could presumably make separate connections to that database if they wanted to set up triggers, count tables, etc), do something else, or possibly try to just not support anything having to do with attached databases/tables in those cases.

This would close #608


2024-01-04: Only now am I realizing, based on the content of your tests, that you have use cases where you want to support dots being part of an actual table name. And that this means my approach is not workable, since the API would not allow us to distinguish between cases where the user intended to match an existing, attached database name, and cases where the user wanted to put that matching string in a table name inside main.

Hopefully the idea of supporting this is compelling enough that we can figure out what a better API (and internal storage format) might be. Perhaps we could use tuples ("a1", "foo") and force users to be explicit when creating/accessing the tables from the Database object, and then store the database name in a separate field inside the Queryable object?

I really liked the simplicity of a simple string (it works well for the CLI use cases as well, although maybe a CLI would never attach a database in the first place, since that's a stateful operation) -- but I am sure you don't want to break backward compatibilty.


2024-01-09: I have reworked this so that the above section is no longer an issue. The schema_name is stored explicitly within the Table object, and the only real change to functionality is that table_names() and related methods on Database will no longer operate across schemas - they instead operate on a single schema (attached database) at a time.

codecov[bot] commented 5 months ago

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (17eb818) 95.69% compared to head (61de8d9) 95.56%.

Files Patch % Lines
sqlite_utils/db.py 93.15% 5 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #609 +/- ## ========================================== - Coverage 95.69% 95.56% -0.13% ========================================== Files 8 8 Lines 2854 2886 +32 ========================================== + Hits 2731 2758 +27 - Misses 123 128 +5 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.