oramasearch / orama

🌌 A complete search engine and RAG pipeline in your browser, server or edge network with support for full-text, vector, and hybrid search in less than 2kb.
https://docs.orama.com

Typo tolerance prefix search not working as expected #797

Closed naira-petrosyan-m closed 1 month ago

naira-petrosyan-m commented 2 months ago

Describe the bug

I have these docs:

[{
    name: "S"
}, 
{
    name: "Scroll"
}]

When I search for "scrol" with a typo tolerance of 1, I expect to get only one result, "Scroll". Instead, both docs are returned.

To Reproduce

  1. Initialize Orama with the schema {name: "string"}
  2. Insert the docs
    [{
    name: "S"
    }, 
    {
    name: "Scroll"
    }]
  3. Search for the term "scrol" with the typo tolerance set to 2
    search({
    term: "scrol",
    tolerance: 2,
    })

Expected behavior

Get only one result, "Scroll", instead of two results.
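
To make the expectation concrete, here is a minimal sketch that filters the two docs by plain Levenshtein distance against the whole (lowercased) name. The helpers `levenshtein` and `expectedMatches` are hypothetical names used for illustration; this is not Orama's internal matching code, only the behavior the report expects.

```javascript
// Classic dynamic-programming Levenshtein distance, kept to a single row.
// Illustration only: this is NOT how Orama implements typo tolerance.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for (i - 1, j - 1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i - 1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

const docs = [{ name: "S" }, { name: "Scroll" }];

// Hypothetical helper: keep docs whose name is within `tolerance` edits of the term.
function expectedMatches(term, tolerance) {
  return docs
    .filter((doc) => levenshtein(term.toLowerCase(), doc.name.toLowerCase()) <= tolerance)
    .map((doc) => doc.name);
}

console.log(expectedMatches("scrol", 2)); // [ 'Scroll' ]
```

Under this reading, "scrol" is one edit away from "scroll" but four edits away from "s", so only "Scroll" falls within a tolerance of 2.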

Environment Info

OS: MacOS
Node: 18.19.0
Orama: 2.1.1

Affected areas

Search

Additional context

No response

naira-petrosyan-m commented 2 months ago

Additionally, when I have this doc

{name: "customer.ionic"}

and I search like this

search({
    term: "customerionic",
    tolerance: 2,
})

I expect to have my doc returned, but it does not get returned.

allevo commented 1 month ago

We plan to ship this fix in the next release (tomorrow).

Thanks for reporting this issue.

For the second question, Orama uses a tokenizer that splits the string into tokens on spaces and punctuation. So "customer.ionic" is split into ["customer", "ionic"].

You can customize the tokenizer; see the docs.
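
As a rough illustration of the splitting described above, the sketch below uses a hypothetical `tokenize` function that lowercases the input and splits on whitespace and Unicode punctuation. The function name and the regular expression are assumptions made for this example; Orama's real tokenizer may differ in its details.

```javascript
// Hypothetical stand-in for a tokenizer that splits on spaces and punctuation.
// Illustration only: this is NOT Orama's actual tokenizer.
function tokenize(text) {
  return text
    .toLowerCase()
    .split(/[\s\p{P}]+/u) // one or more whitespace or punctuation characters
    .filter(Boolean); // drop empty strings left by leading/trailing separators
}

console.log(tokenize("customer.ionic")); // [ 'customer', 'ionic' ]
console.log(tokenize("customerionic")); // [ 'customerionic' ]
```

With this kind of splitting, the indexed tokens are "customer" and "ionic", while the query is the single token "customerionic". That token is 5 edits away from "customer" and 8 edits away from "ionic", so neither is within a tolerance of 2.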