lucaong / minisearch

Tiny and powerful JavaScript full-text search engine for browser and Node
https://lucaong.github.io/minisearch/
MIT License

How to index a nested field whose value is an array #238

Closed jimbung closed 9 months ago

jimbung commented 9 months ago

My data looks like this:

[{ 
    "id": 1,
    "access": "acc23 15698",
    "category": "Lab",
    "actions": [
        {
            "comment": "9/12/23 BY GSM.~ONE DAY~ACTION WILL BE REPORTED",
            "resultTime": 1694502000,
            "status": "F"
        },
        {
            "comment": "10/12/23 BY GSM.~TWO DAY~ACTION WILL BE REPORTED",
            "resultTime": 1697532400,
            "status": "P"
        }
    ],
    "subcat": "client"
}]

How can I index the "comment" field inside the "actions" array? And what if the structure is nested more deeply, like the "game" field inside the "suscp" array below?

[{ 
    "id": 1,
    "access": "acc23 15698",
    "category": "Lab",
    "actions": [
        {
            "comment": "9/12/23 BY GSM.~ONE DAY~ACTION WILL BE REPORTED",
            "resultTime": 1694502000,
            "status": "F",
            "suscp": [{
                "game": "the game starts on 15:00",
                "wol": "W"
            }]
        },
        {
            "comment": "10/12/23 BY GSM.~TWO DAY~ACTION WILL BE REPORTED",
            "resultTime": 1697532400,
            "status": "P",
            "suscp": [{
                "game": "the game starts on 13:20",
                "wol": "L"
            }]
        }
    ],
    "subcat": "client"
}]

Thanks for any comments!

lucaong commented 9 months ago

Hi @jimbung, you can achieve what you want by specifying a custom extractField option. One question is how you want to aggregate the comment text from multiple elements of the array. If you simply want to concatenate them, here is how you could do that:

const miniSearch = new MiniSearch({
  // Specify the list of fields to index, as if 'comment' and 'game' were not nested
  fields: ['access', 'category', 'comment', 'game', 'subcat'],

  // Specify a custom logic to extract fields from the document
  extractField: (doc, fieldName) => {
    if (fieldName === 'comment') {
      return (doc.actions || []).map((action) => action.comment || '').join(' ')
    } else if (fieldName === 'game') {
      const suscp = (doc.actions || []).flatMap((action) => action.suscp || {})
      return suscp.map((s) => s.game || '').join(' ')
    } else {
      return doc[fieldName]
    }
  }
})

You can then search as if comment and game were normal "flat" fields.
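When a document has several nested fields, the per-field branches above could be generalized into a path-based lookup. The following is only a minimal sketch under stated assumptions: `collectValues` and `nestedPaths` are hypothetical names introduced for illustration and are not part of MiniSearch. The resulting function has the same `(doc, fieldName)` shape as the `extractField` callback shown above, and the sample document is taken from the question.

```javascript
// Hypothetical helper (not part of MiniSearch): collect every value found at a
// list of keys, descending into arrays along the way.
function collectValues(value, keys) {
  if (value == null) return []
  if (Array.isArray(value)) return value.flatMap((item) => collectValues(item, keys))
  if (keys.length === 0) return [value]
  if (typeof value !== 'object') return []
  return collectValues(value[keys[0]], keys.slice(1))
}

// Hypothetical mapping from the flat field names to their nested paths
const nestedPaths = {
  comment: 'actions.comment',
  game: 'actions.suscp.game'
}

// Same shape as the extractField callback: (document, fieldName) => value
const extractField = (doc, fieldName) => {
  const path = nestedPaths[fieldName]
  if (path === undefined) return doc[fieldName]
  // Concatenate all values found at the nested path
  return collectValues(doc, path.split('.')).join(' ')
}

// Sample document from the question
const doc = {
  id: 1,
  access: 'acc23 15698',
  category: 'Lab',
  actions: [
    {
      comment: '9/12/23 BY GSM.~ONE DAY~ACTION WILL BE REPORTED',
      resultTime: 1694502000,
      status: 'F',
      suscp: [{ game: 'the game starts on 15:00', wol: 'W' }]
    },
    {
      comment: '10/12/23 BY GSM.~TWO DAY~ACTION WILL BE REPORTED',
      resultTime: 1697532400,
      status: 'P',
      suscp: [{ game: 'the game starts on 13:20', wol: 'L' }]
    }
  ],
  subcat: 'client'
}

const commentText = extractField(doc, 'comment')
const gameText = extractField(doc, 'game')
console.log(commentText)
console.log(gameText)
```

Under these assumptions, the function could be passed as the `extractField` option in place of the hand-written branches, and a new nested field would only need one more entry in `nestedPaths`.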

I hope this solves your issue!

jimbung commented 9 months ago

Hi @lucaong, the code works like a charm. Thank you very much!

jimbung commented 9 months ago

Hi @lucaong,

const miniSearch = new MiniSearch({
  // Specify the list of fields to index, as if 'comment' and 'game' were not nested
  fields: ['access', 'category', 'comment', 'game', 'subcat'],

  // Specify a custom logic to extract fields from the document
  extractField: (doc, fieldName) => {
    if (fieldName === 'comment') {
      return (doc.actions || []).map((action) => action.comment || '').join(' ')
    } else if (fieldName === 'game') {
      const suscp = (doc.actions || []).flatMap((action) => action.suscp || {})
      return suscp.map((s) => s.game || '').join(' ')
    } else {
      return doc[fieldName]
    }
  }
})

Should "action.suscp || {}" be "action.suscp || []", since suscp is an array? I tried both "action.suscp || {}" and "action.suscp || []", and both work.

lucaong commented 9 months ago

@jimbung in this case it won't matter, because we are using flatMap: with the {} fallback, the callback returns a single empty object (so calling s.game later returns undefined and defaults to ''), while with the [] fallback the empty array is flattened to nothing by flatMap. I do agree that the version with an array looks cleaner, though.
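The difference between the two fallbacks can be checked in isolation. The snippet below is a small sketch using plain `Array.prototype.flatMap`, with a made-up `actions` array in which the second element has no `suscp` field; the `joinGames` helper is a hypothetical name used only here.

```javascript
// Illustrative data: the second action has no "suscp" field
const actions = [
  { suscp: [{ game: 'the game starts on 15:00', wol: 'W' }] },
  { comment: 'no suscp here' }
]

// Fallback to {}: the missing entry contributes one empty object
const withObject = actions.flatMap((action) => action.suscp || {})

// Fallback to []: the missing entry is flattened away entirely
const withArray = actions.flatMap((action) => action.suscp || [])

// Same aggregation as in the extractField example
const joinGames = (items) => items.map((s) => s.game || '').join(' ')

console.log(withObject.length) // 2
console.log(withArray.length) // 1
console.log(JSON.stringify(joinGames(withObject))) // "the game starts on 15:00 "
console.log(JSON.stringify(joinGames(withArray))) // "the game starts on 15:00"
```

The extracted text differs only by a trailing space, which is consistent with both versions giving the same search results in the thread.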

jimbung commented 9 months ago

Hi @lucaong, thanks a lot for your detailed explanation! All is clear now.

lucaong commented 9 months ago

You are welcome, @jimbung! I will close the issue, but feel free to comment further if more info is needed.