fnc12 / sqlite_orm

❤️ SQLite ORM light header only library for modern C++
GNU Affero General Public License v3.0
2.19k stars 305 forks source link

FTS5 Options: Unindexed columns, prefix indexes, tokenizers & content #1277

Closed trevornagy closed 3 weeks ago

trevornagy commented 1 month ago

Opening this for visibility, feel free to close if you want this discussion in: #622

Does the implementation FTS5 support unindexed columns, prefix indexes and tokenizers? https://www.sqlite.org/fts5.html#the_unindexed_column_option

What about the content option? https://www.sqlite.org/fts5.html#external_content_and_contentless_tables These are all important features for me, can't see any examples or tests in the fts5 feature branch using them

Thanks!

fnc12 commented 1 month ago

hey Trevor. I'll start implementing this soon. Thanks

fnc12 commented 1 month ago

PR is on its way https://github.com/fnc12/sqlite_orm/pull/1279

fnc12 commented 1 month ago

@trevornagy please check unindexed() column constraint function in dev branch. Other features are on its way and will be merged soon

trevornagy commented 1 month ago

@fnc12 Just tested with the unindexed PR commit on dev, getting the following errors:

1>  error C2039: 'unindexed': is not a member of 'sqlite_orm'
1>  sqlite_orm.h(23342,11):
1>  see declaration of 'sqlite_orm'
1>   error C3861: 'unindexed': identifier not found
1> error C2672: 'sqlite_orm::make_column': no matching overloaded function found
1>  sqlite_orm.h(2994,37):
1>  could be 'sqlite_orm::internal::column_t<G,S,Op...> sqlite_orm::make_column(std::string,G,S,Op...)'
fnc12 commented 1 month ago

@trevornagy unindexed function exists in dev branch. Proof https://github.com/fnc12/sqlite_orm/blob/dev/include/sqlite_orm/sqlite_orm.h#L2183. I bet you messed up with the commits. Just pull dev branch, it should work

trevornagy commented 1 month ago

Sorry! You're correct, I copied the incorrect commit id. It works as expected.

trevornagy commented 1 month ago

Thank you very much for your quick work by the way!

fnc12 commented 1 month ago

PR with prefix is on its way https://github.com/fnc12/sqlite_orm/pull/1282

fnc12 commented 1 month ago

@trevornagy please check new prefix function in dev branch

trevornagy commented 1 month ago

@fnc12 Looked at the API and it looks like you create the prefix inside a make_column? This seems strange to me as each prefix doesn't apply to a column but rather the whole table?

https://www.sqlite.org/fts5.html#prefix_indexes

The test in using_fts5.cpp has the expected as: expected = R"(USING FTS5("title", "body" prefix=2))"; But I would expect it as: expected = R"(USING FTS5("title", "body", prefix=2))";

Note the , between "body" and "prefix=2"

Maybe I'm misinterpreting the example/test though?

fnc12 commented 1 month ago

@trevornagy thanks for pointing out. This is my misunderstanding. I'll refactor it soon as a table constraint not column constraint

fnc12 commented 1 month ago

PR is here https://github.com/fnc12/sqlite_orm/pull/1284 @trevornagy btw you can still use prefix func at table level anyway

fnc12 commented 1 month ago

@trevornagy prefix fix is merged

trevornagy commented 1 month ago

Awesome, works as expected now. Thank you!

fnc12 commented 1 month ago

@trevornagy this is good! Only tokenize table constraint left to close this issue. I'll add it soon

trevornagy commented 1 month ago

Content was originally requested as well as tokenizer, but yes, amazing work. Really appreciated

fnc12 commented 1 month ago

@trevornagy sorry I forgot to mention content. Just my negligence, of course content will be implemented within this issue ASAP

fnc12 commented 1 month ago

@trevornagy I got a question: how are you going to use tokenize feature? Do you have exact SQL you expect to be used? It can help me with unit tests

trevornagy commented 1 month ago

Sure, here's an example of tokenize: tokenize='unicode61 remove_diacritics 1'

fnc12 commented 1 month ago

PR https://github.com/fnc12/sqlite_orm/pull/1292

fnc12 commented 1 month ago

@trevornagy please check tokenize feature in dev

fnc12 commented 1 month ago

PR with content is on its way https://github.com/fnc12/sqlite_orm/pull/1296

fnc12 commented 1 month ago

@trevornagy please check content function in dev branch

fnc12 commented 1 month ago

@trevornagy do you also need contentless_delete, content=tbl and content_rowid?

trevornagy commented 1 month ago

Will test soon, sorry. As for the others, let me verify works with what we currently have and I'll get back to you! Thanks for all the work so far

trevornagy commented 1 month ago

Hey @fnc12, my apologies for the delayed response. I think I'll need content='tbl', to replicate the following example below.

What I'm trying to accomplish is documented here: https://www.sqlite.org/fts5.html#external_content_tables

The example in sqlite would look something like:

-- Create a table. And an external content fts5 table to index it.
CREATE TABLE tbl(a INTEGER PRIMARY KEY, b, c);
CREATE VIRTUAL TABLE fts_idx USING fts5(b, c, content='tbl');

What's missing is being able to specify the external content b & c in this table. I personally do not need the content_rowid option as I just use the default.

Once again, sorry for the delayed response!

fnc12 commented 4 weeks ago

@trevornagy table content is on its way https://github.com/fnc12/sqlite_orm/pull/1304

fnc12 commented 3 weeks ago

hey @trevornagy . Please check out dev for new content<User>() function which you asked about. Also please say whether all required sub-features of FTS5 are implemented. Thanks

trevornagy commented 3 weeks ago

Perfect, thank you very much! That is all of the features! I really appreciate this!