opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0

[BUG] null_pointer_exception when requesting with search_after on a null field #14825

Open TatianaNeuer opened 1 month ago

TatianaNeuer commented 1 month ago

Describe the bug

OpenSearch returns a 500 Internal Server Error when search_after contains a null value and "track_total_hits" is set to false. This is the error:

{
    "error": {
        "root_cause": [
            {
                "type": "null_pointer_exception",
                "reason": "Cannot read field \"bytes\" because \"other\" is null"
            }
        ],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
        "phase": "query",
        "grouped": true,
        "failed_shards": [
            {
                "shard": 0,
                "index": "sample_index_null",
                "node": "l3SwxdKCRpWEexkAj0IVVA",
                "reason": {
                    "type": "null_pointer_exception",
                    "reason": "Cannot read field \"bytes\" because \"other\" is null"
                }
            }
        ],
        "caused_by": {
            "type": "null_pointer_exception",
            "reason": "Cannot read field \"bytes\" because \"other\" is null",
            "caused_by": {
                "type": "null_pointer_exception",
                "reason": "Cannot read field \"bytes\" because \"other\" is null"
            }
        }
    },
    "status": 500
}

Related component

Search

To Reproduce

  1. Index some documents with null fields: POST /_bulk

    { "index": { "_index": "sample_index_null", "_id": "1" } }
    { "doc": "doc1", "name": "bob"}
    { "index": { "_index": "sample_index_null", "_id": "2" } }
    { "doc": "doc2", "name": null}
    { "index": { "_index": "sample_index_null", "_id": "3" } }
    { "doc": "doc3", "name": null}

  2. Search documents:

    {
      "size": 20,
      "track_total_hits": false,
      "sort": [
        { "name.keyword": { "order": "desc" } },
        { "doc.keyword": { "order": "asc" } }
      ],
      "search_after": [ null, "doc2" ]
    }

  3. The response is a 500 Internal Server Error. The same request with "track_total_hits": true returns the correct document with no error.
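The steps above can be scripted for quicker retesting. The following is a minimal sketch using only the Python standard library. The helper names (`build_bulk_body`, `build_search_body`, `post`, `reproduce`) are hypothetical, and the base URL passed to `reproduce` is an assumption: a default 2.15.0 Docker image may require HTTPS and credentials, which this sketch does not handle.

```python
import json
import urllib.error
import urllib.request

INDEX = "sample_index_null"  # index name taken from the report

# The three documents from step 1; None serializes to JSON null.
DOCS = [
    {"doc": "doc1", "name": "bob"},
    {"doc": "doc2", "name": None},
    {"doc": "doc3", "name": None},
]


def build_bulk_body(docs, index=INDEX):
    """Return the NDJSON payload for POST /_bulk (one action line per document)."""
    lines = []
    for i, doc in enumerate(docs, start=1):
        lines.append(json.dumps({"index": {"_index": index, "_id": str(i)}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk API requires a trailing newline


def build_search_body(track_total_hits):
    """Return the search request from step 2 with the given track_total_hits value."""
    return {
        "size": 20,
        "track_total_hits": track_total_hits,
        "sort": [
            {"name.keyword": {"order": "desc"}},
            {"doc.keyword": {"order": "asc"}},
        ],
        "search_after": [None, "doc2"],
    }


def post(url, body, content_type):
    """POST a string body; return (status, parsed JSON). HTTP errors are returned, not raised."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        return err.code, json.load(err)


def reproduce(base_url):
    """Index the documents, then search with and without total-hit tracking.

    Returns a dict mapping the track_total_hits flag to the HTTP status.
    """
    post(f"{base_url}/_bulk?refresh=true", build_bulk_body(DOCS), "application/x-ndjson")
    statuses = {}
    for flag in (False, True):
        status, _ = post(
            f"{base_url}/{INDEX}/_search",
            json.dumps(build_search_body(flag)),
            "application/json",
        )
        statuses[flag] = status
    return statuses
```

Calling `reproduce("http://localhost:9200")` against an affected node without security enabled should, per the report, yield a 500 status for `False` and a 200 status for `True`.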

Expected behavior

OpenSearch should return the documents.
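To make the expected result concrete, here is a small in-memory model of the request. It is only an illustration, not OpenSearch's implementation, and it assumes missing values sort last in both directions (the default `missing: _last` behavior). The function names and the `DIRECTIONS` tuple are hypothetical.

```python
from functools import cmp_to_key

DOCS = [
    {"doc": "doc1", "name": "bob"},
    {"doc": "doc2", "name": None},
    {"doc": "doc3", "name": None},
]

# True means descending: name.keyword desc, doc.keyword asc.
DIRECTIONS = (True, False)


def compare_values(a, b, descending):
    """Compare two sort values; None (missing) sorts last in either direction."""
    if a is None and b is None:
        return 0
    if a is None:
        return 1
    if b is None:
        return -1
    result = (a > b) - (a < b)
    return -result if descending else result


def compare_keys(key_a, key_b, directions):
    """Compare two multi-field sort keys, field by field."""
    for a, b, descending in zip(key_a, key_b, directions):
        result = compare_values(a, b, descending)
        if result != 0:
            return result
    return 0


def sort_key(doc):
    """Sort values a hit would carry: (name, doc)."""
    return (doc["name"], doc["doc"])


def search(docs, directions, search_after=None, size=20):
    """Sort the documents, then keep only those strictly after search_after."""
    ordered = sorted(
        docs,
        key=cmp_to_key(lambda x, y: compare_keys(sort_key(x), sort_key(y), directions)),
    )
    if search_after is not None:
        cursor = tuple(search_after)
        ordered = [d for d in ordered if compare_keys(sort_key(d), cursor, directions) > 0]
    return ordered[:size]
```

Under these assumptions, `search(DOCS, DIRECTIONS, search_after=[None, "doc2"])` returns only doc3, which is the page the failing request is expected to return.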

Additional Details

Host/Environment:
- OS: Windows 10 with WSL2 and docker
- Version : docker image: opensearchproject/opensearch:2.15.0
- 1 opensearch node run with the following docker compose file:

    version: '3'
    services:
      opensearch:
        image: opensearchproject/opensearch:2.15.0
        container_name: opensearch
        environment:

    volumes:
      opensearch:

    networks:
      opensearch-net:

dblock commented 1 month ago

Looks like a bug. Would you have a moment to try and write a YAML REST test for this?

https://github.com/opensearch-project/OpenSearch/blob/main/TESTING.md#testing-the-rest-layer

TatianaNeuer commented 1 month ago

I tried writing a YAML REST test, heavily inspired by https://github.com/opensearch-project/OpenSearch/blob/main/rest-api-spec/src/main/resources/rest-api-spec/test/search/90_search_after.yml. I did not run the test, so I hope the syntax is correct:

"null values":
  - do:
      indices.create:
          index:  test
  - do:
      bulk:
        refresh: true
        index: test
        body: |
          {"index":{}}
          { "doc": "doc1", "name": "bob"}
          {"index":{}}
          { "doc": "doc2", "name": null}
          {"index":{}}
          { "doc": "doc3", "name": null}

  # First page. Total hits are not tracked here, so hits.total is not asserted.
  - do:
      search:
        rest_total_hits_as_int: true
        index: test
        body:
          size: 2
          track_total_hits: false
          sort: [{ name.keyword: desc }, { doc.keyword: asc }]

  - length: {hits.hits: 2 }
  - match: {hits.hits.0._index: test }
  - match: {hits.hits.0._source.doc: doc1 }
  - match: {hits.hits.1._index: test }
  - match: {hits.hits.1._source.doc: doc2 }
  - match: {hits.hits.1.sort: [null, "doc2"] }

  # Second page: resume after the last hit of the first page.
  - do:
      search:
        rest_total_hits_as_int: true
        index: test
        body:
          size: 1
          track_total_hits: false
          sort: [{ name.keyword: desc }, { doc.keyword: asc }]
          search_after: [null, "doc2"]

  - length: {hits.hits: 1 }
  - match: {hits.hits.0._index: test }
  - match: {hits.hits.0._source.doc: doc3 }
  - match: {hits.hits.0.sort: [null, "doc3"] }

dblock commented 1 month ago

Try running it? :)