Score exact matches higher over location

boarnoah commented 8 years ago

From what I can see, bitapScore weights how close a match is to the "location" given when creating a search more heavily than whether the match is exact. Ex: a query for "egg" on a list that contains "----- leggings" and "--------- -------- eggs" would rank the "leggings" item higher than the "eggs" item because of "location: 0". Due to the nature of my items, I can't guarantee where the keywords will be in the object names, even as an approximation.
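To make the ranking above concrete, here is a minimal sketch of a bitap-style score. It is modeled on the diff-match-patch scoring that Fuse's bitap search derives from and is not copied from Fuse's source. The names `sketchScore` and `scoreExact` are hypothetical, and the `location` and `distance` values are assumed for illustration. Lower scores rank higher.

```javascript
// Sketch of a bitap-style score: accuracy (share of mismatched characters)
// plus a penalty that grows with the distance from the expected location.
function sketchScore(patternLength, errors, matchIndex, expectedLocation, distance) {
  const accuracy = errors / patternLength;
  const proximity = Math.abs(expectedLocation - matchIndex);
  if (distance === 0) {
    // With no tolerance, only a match exactly at the expected location counts.
    return proximity === 0 ? accuracy : 1.0;
  }
  return accuracy + proximity / distance;
}

// Hypothetical helper: score the first exact (zero-error) occurrence of the query.
function scoreExact(text, query, location, distance) {
  const index = text.toLowerCase().indexOf(query.toLowerCase());
  if (index === -1) return null; // no exact occurrence
  return sketchScore(query.length, 0, index, location, distance);
}

const items = ["----- leggings", "--------- -------- eggs"];
for (const item of items) {
  // Assumed settings: location 0, distance 100.
  console.log(item, scoreExact(item, "egg", 0, 100));
}
```

Both items contain "egg" with zero errors, so in this sketch the only difference between them is the proximity term, which favors "leggings" because its match starts earlier in the string. Passing a much larger `distance` shrinks that term, which is one way the location effect could be reduced.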

Any thoughts on how I could modify the current bitap scoring to ignore or minimize the effect of location on the score relative to match quality? (Is this even possible? I might be misunderstanding the whole algorithm.)

One other way I can see would be to randomize the order of the keywords in my items, or to split them at " ". However, this would cause bigram matches (ex: "chocolate milk") to rank lower.
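The split-at-" " idea and its bigram drawback can be sketched as follows. This is a simplified illustration that uses plain substring matching per token rather than bitap, and `bestTokenScore` is a hypothetical helper, not part of Fuse.

```javascript
// Hypothetical helper: score the query against each space-separated token
// and keep the best (lowest) result. 0 means a token starts with the query,
// 1 means no token contains it.
function bestTokenScore(name, query) {
  const q = query.toLowerCase();
  let best = 1;
  for (const token of name.toLowerCase().split(" ")) {
    const index = token.indexOf(q);
    if (index === -1) continue; // this token does not contain the query
    // Position inside the token, normalized by the token's length.
    const score = index / token.length;
    if (score < best) best = score;
  }
  return best;
}

console.log(bestTokenScore("--------- -------- eggs", "egg"));
console.log(bestTokenScore("----- leggings", "egg"));
// A two-word query never fits inside a single token, so it finds no match.
console.log(bestTokenScore("dark chocolate milk", "chocolate milk"));
```

Scoring per token removes the dependence on where the keyword sits in the full name, so "eggs" now beats "leggings". The cost is the one described above: a query that spans two tokens no longer matches as a unit.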

boarnoah commented 8 years ago

I've attached a test case that runs with Node.js and includes some example data. Running a query for the term "egg" should illustrate what I mean.

Granted, if you search for "eggs" it puts more relevant results at the top, but it still gives a lower score to items with "egg" because location is weighted over partial matches. Hope this helps, thanks.

The data might be a bit weird, but it's from actual production data I'm working with which I have extracted here for demonstration purposes.

FuseExample_78.zip

krisk commented 8 years ago

Thanks for test cases, I will take a look!

boarnoah commented 8 years ago

Not sure if this is more related to #77, but if, for example, you change the data to:

[
  {
    "name": "Eggs"
  }
]

For the query "egg", Fuse 2.0.X will not return the Eggs item; 1.3.1 will, however.

Sorry for not being much help other than raising issues, I'm still working on understanding the algorithm.

einarlove commented 8 years ago

I was experiencing the same issue and reverted to 1.3.1.
