nextcloud / fulltextsearch_elasticsearch

🔍 Use Elasticsearch to index the content of your Nextcloud
GNU Affero General Public License v3.0

http 500 on NC and 400 Bad Request to Elasticsearch when exceeding index.highlight.max_analyzed_offset #324

Open cue108 opened 11 months ago

cue108 commented 11 months ago

Error logging:

{\"type\":\"illegal_argument_exception\",\"reason\":\"The length [2049605] of field [content] in doc[724]\/index[nc_fulltextsearch] exceeds the [index.highlight.max_analyzed_offset] limit [1000000]. To avoid this error, set the query parameter [max_analyzed_offset] to a value less than index setting [1000000] and this will tolerate long field values by truncating them.\"}}},\"status\":400}"}

There are two ways to handle this:

  1. On the connector side (that is, "fulltextsearch_elasticsearch"), first determine the maximum content size:

    GET yourindexname/_search
    {
      "size": 0,
      "aggs": {
        "max_content_length": {
          "max": {
            "field": "attachment.content_length"
          }
        }
      }
    }

    and then increase highlight.max_analyzed_offset accordingly:

    PUT indexname/_settings
    {
      "index" : {
        "highlight.max_analyzed_offset" : 3000000
      }
    }

    This alone already mitigates the error.

  2. On the Elasticsearch side an index template will do:

    PUT /_index_template/indexname
    {
      "index_patterns" : ["indexname"],
      "priority" : 1,
      "template": {
        "settings" : {
          "highlight.max_analyzed_offset" : 3000000
        }
      }
    }
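
For anyone who would rather script option 1 than type the two requests by hand, here is a minimal Python sketch of the same idea: query the largest `attachment.content_length`, then raise `highlight.max_analyzed_offset` to cover it. Several things here are assumptions, not part of the report above: Elasticsearch reachable at `http://localhost:9200` without authentication, the index name `nc_fulltextsearch` (taken from the error message), the 10% headroom, and all helper names.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no auth/TLS
INDEX = "nc_fulltextsearch"       # index name as it appears in the error message


def max_length_query(field="attachment.content_length"):
    """Body of the max aggregation from option 1."""
    return {"size": 0, "aggs": {"max_content_length": {"max": {"field": field}}}}


def choose_offset(max_length, current_limit=1_000_000, headroom_percent=10):
    """Pick a limit that covers the longest document plus some headroom.

    The max aggregation returns null (None) when no document has the field;
    in that case the current limit is kept. The limit is never lowered.
    """
    if max_length is None:
        return current_limit
    padded = int(max_length) * (100 + headroom_percent) // 100
    return max(current_limit, padded)


def settings_body(limit):
    """Body of the PUT <index>/_settings request from option 1."""
    return {"index": {"highlight.max_analyzed_offset": limit}}


def es_request(method, path, body):
    """Send a JSON request to Elasticsearch and return the decoded response."""
    req = urllib.request.Request(
        f"{ES_URL}/{path}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def raise_highlight_limit():
    """Run both steps of option 1 against the live index."""
    result = es_request("POST", f"{INDEX}/_search", max_length_query())
    max_length = result["aggregations"]["max_content_length"]["value"]
    limit = choose_offset(max_length)
    es_request("PUT", f"{INDEX}/_settings", settings_body(limit))
    return limit
```

Calling `raise_highlight_limit()` performs both requests; nothing is sent on import. Note that a larger limit means more text is analyzed for highlighting on each search, so the headroom is kept deliberately small here.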
meonkeys commented 11 months ago

I'm also seeing that illegal_argument_exception error in my browser console for about half the XHR requests to https://mycloud.example.com/apps/fulltextsearch/v1/search.

I've got: