Open sanikolaev opened 3 months ago
As a first step, we can benchmark an emulation of what it would look like if attributes were in the inverted index: take a random table, copy the attribute values to a full-text field, and then check how that affects the table file sizes.
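A rough way to reason about the emulation before touching a real table is to count what lands in a toy inverted index with and without the copied values. The sketch below is only an illustration: the row layout, the field names (`title`, `price`, `qty`) and the helper functions are hypothetical, and it counts dictionary terms and postings in memory rather than measuring Manticore's actual table files.

```python
def copy_attrs_to_text(row, text_field, attr_fields):
    """Return a copy of `row` whose text field also carries the attribute values."""
    extra = " ".join(str(row[name]) for name in attr_fields)
    out = dict(row)
    out[text_field] = f"{row[text_field]} {extra}".strip()
    return out


def index_stats(rows, text_field):
    """Count distinct terms and (term, doc) postings of a toy inverted index."""
    postings = set()
    for doc_id, row in enumerate(rows):
        for token in row[text_field].lower().split():
            postings.add((token, doc_id))
    terms = {term for term, _ in postings}
    return len(terms), len(postings)


# Hypothetical sample rows: one full-text field and two attributes.
rows = [
    {"title": "abc", "price": 1.23, "qty": 123},
    {"title": "abc def", "price": 4.5, "qty": 123},
]

before = index_stats(rows, "title")
emulated = [copy_attrs_to_text(r, "title", ["price", "qty"]) for r in rows]
after = index_stats(emulated, "title")

print("terms, postings before:", before)
print("terms, postings after: ", after)
```

On a real table the comparison would be between the on-disk file sizes of the original table and the copy; the counts here only hint at how much the dictionary and doclists may grow.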
Proposal:
Currently, when you add a document like:
only abc goes into the inverted index, and you can find it using match('abc'). You can't find this document with match('123') or match('1.23'). It would be cool if Manticore could do it.
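The current behavior can be mimicked with a toy inverted index that is built from full-text fields only. This is a minimal sketch, not Manticore's implementation: the document layout and the field names (`f`, `a`, `b`) are assumptions made for illustration, and `match` here is a plain Python function, not the SQL `match()`.

```python
from collections import defaultdict


def build_index(docs, text_fields):
    """Build term -> set(doc_id) from the given full-text fields only."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field in text_fields:
            for token in str(doc[field]).lower().split():
                index[token].add(doc_id)
    return index


def match(index, term):
    """Return the sorted ids of documents whose indexed text contains `term`."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical document: one full-text field `f` and two attributes `a`, `b`.
docs = {1: {"f": "abc", "a": 123, "b": 1.23}}
index = build_index(docs, text_fields=["f"])

print(match(index, "abc"))   # the text token is found
print(match(index, "123"))   # attribute values are not in the index
print(match(index, "1.23"))
```

Only the first lookup returns the document, because the attribute values never reach the index.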
As discussed on the dev call of Jul 26, 2024, what we can do is:
- MAGIC_@attr_value, similar to how we index exact_term as MAGIC_=token_value.
- MAGIC and will not apply stemming, will not generate infixes, will not create hitlists, and will not split tokens, but will use the whole field as a value: only dictionary entries and doclists will be added to the index.
- [regular_tokens, …, exact_tokens, …], but it will be [regular_tokens, …, exact_tokens, …, attributes, …].
- "testing 2" => "test|=testing 2|=2", but with the new feature, it will be "testing 2" => "test|=testing|@testing 2|=2|@2".
- query_string without any filters, but for SphinxQL, it's not clear.

The other issues we'd have to think through are:
Related forum topic: https://forum.manticoresearch.com/t/querying-json-fields-for-a-value-regardless-of-key/2029?u=sergey
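The "testing 2" notation from the proposal can be reproduced with a small sketch that emits, per position, the stemmed token, the exact form prefixed with `=`, and (with the new feature) an attribute form prefixed with `@`. Everything here is an assumption for illustration: `toy_stem` is a hypothetical stand-in for a real stemmer, the `MAGIC` prefix is omitted, and the per-token `@` forms simply follow the example string rather than any settled design.

```python
def toy_stem(token):
    """Hypothetical stemmer stand-in: strip a trailing 'ing' from longer words."""
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    return token


def expand(text, with_attr_tokens=False):
    """Render the forms stored at each position, joined by '|', positions by ' '."""
    positions = []
    for token in text.split():
        forms = [toy_stem(token), "=" + token]
        if with_attr_tokens:
            # New in the proposal: an extra '@'-prefixed form per position.
            forms.append("@" + token)
        positions.append("|".join(forms))
    return " ".join(positions)


print(expand("testing 2"))
print(expand("testing 2", with_attr_tokens=True))
```

The first call mirrors the current layout and the second the proposed one; a real implementation would also have to skip hitlists and infixes for the `@` entries, which this sketch does not model.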
Checklist:
To be completed by the assignee. Check off tasks that have been completed or are not applicable.