nextapps-de / flexsearch

Next-Generation full text search library for Browser and Node.js
Apache License 2.0

Some documents may appear multiple times in the search result #437

Open mikecat opened 5 months ago

mikecat commented 5 months ago

Environment:

Example JavaScript code (test-dupe.js):

const { Document } = require("flexsearch");

const index = new Document({
  encode: (str) => str.split(" "),
  document: {
    id: "id",
    index: "data[]",
  },
});

index.add({ id: 0, data: ["test", "test hoge"] });
index.add({ id: 1, data: ["test", "hoge fuga test"] });
index.add({ id: 2, data: ["test", "hoge fuga foo"] });
index.add({ id: 3, data: ["bar", "test hoge"] });
index.add({ id: 4, data: ["meow", "hoge fuga test"] });

const res = index.search("test", { index: "data[]" });
console.log(res);

Running this code as node test-dupe.js resulted in:

[ { field: 'data[]', result: [ 0, 1, 2, 3, 1, 4 ] } ]

As you can see, id 1 appears twice in the search result.

I expect every document to appear at most once in a single search result.

This result also looks odd to me: id 0 does not appear twice, even though that document has test in two elements of data, just as the document with id 1 does.

The documents with ids 2, 3, and 4 are there to verify that each element on its own produces a hit in the search.
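For anyone hitting the same behavior, a possible stopgap is to remove repeated ids from the result after the search returns. The sketch below is only a workaround on the caller's side, not a fix inside FlexSearch. dedupeResults is a hypothetical helper name, and the input is the result shape copied from the output reported above, so the snippet runs without the library.

```javascript
// Hypothetical helper (not part of FlexSearch): removes repeated ids from
// each field's result list, keeping the first occurrence of every id.
function dedupeResults(fieldResults) {
  return fieldResults.map((entry) => ({
    ...entry,
    // A Set keeps insertion order, so the original ranking order is preserved.
    result: [...new Set(entry.result)],
  }));
}

// Result shape copied from the output reported in this issue.
const res = [{ field: "data[]", result: [0, 1, 2, 3, 1, 4] }];

const deduped = dedupeResults(res);
console.log(deduped);
// prints [ { field: 'data[]', result: [ 0, 1, 2, 3, 4 ] } ]
```

This leaves the original res array untouched and only hides the symptom. It does not explain why id 1 is duplicated while id 0 is not.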

My questions:

baterrey commented 1 week ago

Hello, any updates on it?