lucaong / minisearch

Tiny and powerful JavaScript full-text search engine for browser and Node
https://lucaong.github.io/minisearch/
MIT License

A simple query is not returning any results #267

Closed: amir20 closed this issue 4 months ago

amir20 commented 4 months ago

I have a few dozen containers that I am trying to search. I am searching by name and host fields.

import MiniSearch from "minisearch";
const documents = [
  {
    id: "2ccd0d8d390d",
    created: "2024-07-07T21:00:20.000Z",
    name: "clashleaders_worker.3",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "a5add677febc",
    created: "2024-07-07T19:00:56.000Z",
    name: "clashleaders_cron.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "2e57e7177434",
    created: "2024-07-07T19:00:56.000Z",
    name: "clashleaders_mongo.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "400fd1cce19b",
    created: "2024-07-07T19:00:57.000Z",
    name: "clashleaders_web.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "81e4ebe41c69",
    created: "2024-07-07T19:00:57.000Z",
    name: "clashleaders_imgproxy.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "f8e46128f172",
    created: "2024-07-07T20:35:11.000Z",
    name: "clashleaders_rq_calculation_worker.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "d5ebaf1db23e",
    created: "2024-07-07T21:03:33.000Z",
    name: "clashleaders_worker.2",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "503c3920dc23",
    created: "2024-07-07T19:00:57.000Z",
    name: "clashleaders_imgproxy-cache.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "a5c70018e31c",
    created: "2024-07-07T19:00:56.000Z",
    name: "clashleaders_redis.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "1bd89fe6ea46",
    created: "2024-07-07T19:00:56.000Z",
    name: "dozzle_dozzle.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "709a529ef23c",
    created: "2024-07-07T20:46:49.000Z",
    name: "clashleaders_rq_war_worker.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "07230ba74fc7",
    created: "2024-07-07T19:00:56.000Z",
    name: "traefik_traefik.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "59327e4f06b1",
    created: "2024-07-05T20:37:14.000Z",
    name: "funny_dirac",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "9174709bfdfd",
    created: "2024-07-07T21:07:46.000Z",
    name: "clashleaders_worker.1",
    state: "running",
    host: "clashleaders.com",
    type: "container",
  },
  {
    id: "clashleaders_worker",
    created: "2024-07-07T21:07:46.000Z",
    name: "clashleaders_worker",
    state: "running",
    type: "service",
  },
  {
    id: "clashleaders_cron",
    created: "2024-07-07T19:00:56.000Z",
    name: "clashleaders_cron",
    state: "running",
    type: "service",
  },
  {
    id: "clashleaders_mongo",
    created: "2024-07-07T19:00:56.000Z",
    name: "clashleaders_mongo",
    state: "running",
    type: "service",
  },
  {
    id: "clashleaders_web",
    created: "2024-07-07T19:00:57.000Z",
    name: "clashleaders_web",
    state: "running",
    type: "service",
  },
  {
    id: "clashleaders_imgproxy",
    created: "2024-07-07T19:00:57.000Z",
    name: "clashleaders_imgproxy",
    state: "running",
    type: "service",
  },
  {
    id: "clashleaders_rq_calculation_worker",
    created: "2024-07-07T20:35:11.000Z",
    name: "clashleaders_rq_calculation_worker",
    state: "running",
    type: "service",
  },
];

const index = new MiniSearch({
  fields: ["name", "host"],
  storeFields: ["name", "host"],
});

index.addAll(documents);

console.log(index.search("work"));

I would expect this to return all documents whose name contains work, but it returns [].

My package.json

{
  "name": "minitest",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "description": "",
  "dependencies": {
    "minisearch": "^6.3.0"
  }
}

lucaong commented 4 months ago

Hi @amir20, MiniSearch provides exact match, prefix match, and fuzzy match. By default a query term only matches indexed terms exactly, so work does not match worker. To match work in worker, you want to use prefix match, which is enabled by the prefix search option:

index.search('work', { prefix: true })

Which correctly returns:

[
  {
    id: 'clashleaders_worker',
    score: 0.5736936071097007,
    terms: [ 'worker' ],
    queryTerms: [ 'work' ],
    match: { worker: [Array] },
    name: 'clashleaders_worker'
  },
  {
    id: '2ccd0d8d390d',
    score: 0.5219485808434806,
    terms: [ 'worker' ],
    queryTerms: [ 'work' ],
    match: { worker: [Array] },
    name: 'clashleaders_worker.3',
    host: 'clashleaders.com'
  },
  {
    id: 'd5ebaf1db23e',
    score: 0.5219485808434806,
    terms: [ 'worker' ],
    queryTerms: [ 'work' ],
    match: { worker: [Array] },
    name: 'clashleaders_worker.2',
    host: 'clashleaders.com'
  },
  {
    id: '9174709bfdfd',
    score: 0.5219485808434806,
    terms: [ 'worker' ],
    queryTerms: [ 'work' ],
    match: { worker: [Array] },
    name: 'clashleaders_worker.1',
    host: 'clashleaders.com'
  },
  {
    id: 'clashleaders_rq_calculation_worker',
    score: 0.4821054773767198,
    terms: [ 'worker' ],
    queryTerms: [ 'work' ],
    match: { worker: [Array] },
    name: 'clashleaders_rq_calculation_worker'
  },
  {
    id: 'f8e46128f172',
    score: 0.45048148169779756,
    terms: [ 'worker' ],
    queryTerms: [ 'work' ],
    match: { worker: [Array] },
    name: 'clashleaders_rq_calculation_worker.1',
    host: 'clashleaders.com'
  },
  {
    id: '709a529ef23c',
    score: 0.45048148169779756,
    terms: [ 'worker' ],
    queryTerms: [ 'work' ],
    match: { worker: [Array] },
    name: 'clashleaders_rq_war_worker.1',
    host: 'clashleaders.com'
  }
]
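
To make the difference between the two queries concrete, here is a toy model in plain JavaScript. It is not MiniSearch's implementation: the tokenize, exactMatch, and prefixMatch helpers are hypothetical, and the tokenizer is only assumed to split on whitespace and punctuation, which is consistent with the terms shown in the results above.

```javascript
// Toy model only: NOT MiniSearch's actual implementation.
// Assumption: terms are produced by splitting on whitespace and punctuation.
const tokenize = (text) =>
  text
    .toLowerCase()
    .split(/[\p{Z}\p{P}]+/u)
    .filter((term) => term.length > 0);

// Exact match: the query must be equal to a whole indexed term.
const exactMatch = (text, query) =>
  tokenize(text).includes(query.toLowerCase());

// Prefix match: the query must be the beginning of an indexed term.
const prefixMatch = (text, query) =>
  tokenize(text).some((term) => term.startsWith(query.toLowerCase()));

const containerName = "clashleaders_worker.3";

console.log(tokenize(containerName));               // [ 'clashleaders', 'worker', '3' ]
console.log(exactMatch(containerName, "work"));     // false
console.log(exactMatch(containerName, "worker"));   // true
console.log(prefixMatch(containerName, "work"));    // true
console.log(prefixMatch(containerName, "leaders")); // false
```

In this model, work is not a whole term, so the exact query finds nothing, while the prefix query matches worker. The last line shows the remaining limitation: leaders sits in the middle of clashleaders, so a prefix query does not reach it.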

Note that you still won't be able to match at arbitrary positions inside a term, like leaders in clashleaders. If you need that too, one possibility is to do what is explained here: https://github.com/lucaong/minisearch/issues/194#issuecomment-1369229601

It basically amounts to indexing all suffixes of a field, then applying prefix search. This will result in a larger index, but will solve your use case in a performant way, and it is a reasonable solution for short fields.
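
A minimal sketch of that idea follows; it is not necessarily the exact code from the linked comment. The suffixes helper and its minimum length of 3 are hypothetical choices, and the sketch assumes that the installed MiniSearch version accepts an array of terms returned from processTerm and a separate processTerm under searchOptions.

```javascript
// Hypothetical helper: every suffix of a term that has at least
// `minLength` characters, lowercased. The value 3 is an arbitrary choice.
function suffixes(term, minLength = 3) {
  const lower = term.toLowerCase();
  const result = [];
  for (let i = 0; i + minLength <= lower.length; i++) {
    result.push(lower.slice(i));
  }
  // Terms shorter than minLength are kept as they are.
  return result.length > 0 ? result : [lower];
}

// Options object meant to be passed to `new MiniSearch(options)`.
const options = {
  fields: ["name", "host"],
  storeFields: ["name", "host"],
  // Index time: expand each term into its suffixes
  // (assumes processTerm is allowed to return an array of terms).
  processTerm: (term) => suffixes(term),
  searchOptions: {
    // Query time: do not expand the query, only lowercase it,
    // then match it as a prefix of the indexed suffixes.
    processTerm: (term) => term.toLowerCase(),
    prefix: true,
  },
};

// Library-free check of the idea: a prefix match against the suffixes
// of a term behaves like a substring match against the term itself.
const indexed = suffixes("clashleaders");
console.log(indexed.some((s) => s.startsWith("leaders"))); // true
```

With an index built from these options, a query such as leaders should be able to match clashleaders, at the cost of storing several terms for each original one. Raising the minimum length keeps the index smaller, but queries shorter than that length will no longer match inside a term.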

I hope this helps

amir20 commented 4 months ago

Missed the response. Thanks.

That makes sense. I was coming from fuse.js. I don't think this was super clear to me reading the documentation.