lnx-search / lnx

⚡ Insanely fast, 🌟 Feature-rich searching. lnx is the adaptable, typo tolerant deployment of the tantivy search engine.
https://lnx.rs
MIT License

Fuzzy field-based search with multiple terms #85

Closed (fliepeltje closed this 2 years ago)

fliepeltje commented 2 years ago

Reading through the docs and the source code, it seems you can specify which fields a given term is searched in, so you can issue a query like:

{
    "query": [
        {"term": {"ctx": "Harry Potter", "fields": ["role"]}, "occur": "must"},
        {"term": {"ctx": "Daniel Radcliffe", "fields": ["actor"]}, "occur": "must"}
    ]
}

It would be really neat if it were possible to do the same for fuzzy queries so that something like this would be possible:

{
    "query": [
        {"fuzzy": {"ctx": "Barry Potter", "fields": ["role"]}, "occur": "must"},
        {"fuzzy": {"ctx": "Daniel Radclif", "fields": ["actor"]}, "occur": "must"}
    ]
}
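
For illustration, both request bodies above can be produced by one small helper, which makes it clear that the only difference between them is the query kind. This is a minimal sketch: `field_query` and `build_payload` are hypothetical names, and the `"fuzzy"` clause with a `"fields"` key is the shape proposed in this issue, not something the code below confirms lnx accepts.

```python
import json


def field_query(kind, ctx, fields, occur="must"):
    """Build one clause in the shape used in this issue (hypothetical helper)."""
    return {kind: {"ctx": ctx, "fields": list(fields)}, "occur": occur}


def build_payload(kind, terms):
    """Build a full request body from (ctx, fields) pairs."""
    return {"query": [field_query(kind, ctx, fields) for ctx, fields in terms]}


# The proposed fuzzy, field-restricted query from the example above.
payload = build_payload(
    "fuzzy",
    [("Barry Potter", ["role"]), ("Daniel Radclif", ["actor"])],
)
print(json.dumps(payload, indent=4))
```

Passing `"term"` instead of `"fuzzy"` yields the first, already supported query shape.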

To put this in perspective with a use case, suppose the documents describe movies, actors, and roles. I might have heard that the lead character of a movie is called Harry Potter but have no idea which movie this character belongs to, and I want to know who the actor is.

If I were to create an index with the fuzzy method, I could create an index across all 3 fields, but when I search for Harry Potter I will get results for many actors, because the movie title (Harry Potter and the ...) matches as well.

Alternatively, I could create a separate index for each of these fields and search the individual index, but then once I do have more information (say, the movie name or actor name), I would have to compute a likelihood score myself from the results of searching multiple indexes.
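
To make the cost of that workaround concrete, here is a rough sketch of the client-side merging it would require. Everything in it is an assumption for illustration: the hits are made-up `(doc_id, score)` pairs, `merge_hits` is a hypothetical helper, and summing raw scores is only one naive way to combine them (scores from different indexes are not necessarily comparable).

```python
def merge_hits(*result_sets):
    """Keep documents found in every result set and sum their scores.

    Each result set is a list of (doc_id, score) pairs, as if returned by a
    separate per-field index. Requiring presence in every set mimics
    "occur": "must" across fields.
    """
    if not result_sets:
        return []
    totals = dict(result_sets[0])
    for hits in result_sets[1:]:
        scores = dict(hits)
        totals = {
            doc: total + scores[doc]
            for doc, total in totals.items()
            if doc in scores
        }
    # Highest combined score first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up hits from a hypothetical "role" index and "actor" index.
role_hits = [("doc-1", 2.0), ("doc-2", 1.5)]
actor_hits = [("doc-1", 1.0), ("doc-3", 3.0)]
merged = merge_hits(role_hits, actor_hits)
print(merged)
```

A single index that supports field-restricted fuzzy clauses would let the engine do this scoring itself.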

ChillFish8 commented 2 years ago

This is a good idea. I'll probably mark this as something to target for 0.10, as I want to stabilize things for the 0.9 release.

I plan on making the query system more configurable and more intuitive, but the current query system makes this hard to do without creating a mess in the code base.

ChillFish8 commented 2 years ago

I forgot to close this when merging the PR, but this has been added to master now 👍