manticoresoftware / manticoresearch

Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
https://manticoresearch.com
GNU General Public License v3.0

Index attributes in an inverted index #2449

Open sanikolaev opened 3 months ago

sanikolaev commented 3 months ago

Proposal:

Currently, when you add a document like:

{
"int_attribute": 123,
"float_attribute": 1.23,
...
"ft_field": "abc"
}

only abc goes into the inverted index, and you can find it using match('abc'). You can't find this document with match('123') or match('1.23').
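To make the current behavior concrete, here is a toy sketch of an inverted index that is built only from full-text fields. It is not Manticore's implementation; `build_inverted_index` and `match` are hypothetical helpers, and whitespace tokenization is an assumption made for brevity.

```python
from collections import defaultdict


def build_inverted_index(docs, ft_fields):
    """Map each token found in the given full-text fields to the ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field in ft_fields:
            # Only full-text fields are tokenized; attributes are never visited.
            for token in str(doc.get(field, "")).lower().split():
                index[token].add(doc_id)
    return index


def match(index, term):
    """Return the sorted ids of documents whose indexed tokens include the term."""
    return sorted(index.get(term.lower(), set()))


# The document from the proposal, stored under a made-up id of 1.
docs = {1: {"int_attribute": 123, "float_attribute": 1.23, "ft_field": "abc"}}
index = build_inverted_index(docs, ft_fields=["ft_field"])
```

With this index, `match(index, "abc")` finds document 1, while `match(index, "123")` and `match(index, "1.23")` return nothing, because the attribute values were never tokenized.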

It would be cool if Manticore could do it.

As discussed on the dev call of Jul 26, 2024, what we can do is:

The other issues we'd have to think through are:

Related forum topic: https://forum.manticoresearch.com/t/querying-json-fields-for-a-value-regardless-of-key/2029?u=sergey

Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

- [ ] Implementation completed
- [ ] Tests developed
- [ ] Documentation updated
- [ ] Documentation reviewed
- [ ] Changelog updated
- [x] OpenAPI YAML updated and issue created to rebuild clients

sanikolaev commented 3 months ago

What we can start with is a benchmark that emulates what it would look like if attributes were in the inverted index. For that, we can take a random table, copy its attribute values into a full-text field, and then check how that affects the table file sizes.
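A minimal sketch of the data-preparation step for such a benchmark, assuming documents are available as plain dicts. The helper `copy_attributes_to_field` and the target field name `attrs_text` are hypothetical, and loading the result into a table and measuring its file sizes is left out.

```python
def copy_attributes_to_field(doc, ft_fields=("ft_field",), target="attrs_text"):
    """Return a copy of doc with its scalar attribute values joined into one text field.

    ft_fields lists the fields that are already full-text and are left alone;
    target is a hypothetical name for the extra full-text field.
    """
    values = []
    for name, value in doc.items():
        if name in ft_fields:
            continue
        # Keep only simple scalars; bool is excluded because it is a subclass of int.
        if isinstance(value, (int, float, str)) and not isinstance(value, bool):
            values.append(str(value))
    out = dict(doc)
    out[target] = " ".join(values)
    return out


doc = {"int_attribute": 123, "float_attribute": 1.23, "ft_field": "abc"}
emulated = copy_attributes_to_field(doc)
# emulated["attrs_text"] is now "123 1.23"; the original doc is unchanged.
```

Loading the original and the emulated documents into two otherwise identical tables and comparing their on-disk sizes would give a first estimate of the overhead. How a value such as 1.23 gets tokenized depends on the table's tokenization settings, so the result may vary with them.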