elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Aliases in indices_boost not allowed in combination with Point in Time #97693

Open jord-nijhuis opened 1 year ago

jord-nijhuis commented 1 year ago

Elasticsearch Version

8.8.1

Installed Plugins

No response

Java Version

bundled

OS Version

Docker image: elasticsearch:8.8.1, Docker version: 24.0.2, host: MacOS Ventura (M2)

Problem Description

It does not seem possible to use a point in time (PIT) ID together with an indices_boost entry that refers to an alias.

In my case, I have two indices (documents[X] and sections[X]) and two aliases (documents and sections). If I create a PIT based on these two aliases and then use that PIT in a search query together with an index boost for one of the two aliases, I get a 400.

This is a query with a PIT:

GET /_search
{
  "query": {
    "match": {
      "name": "1"
    }
  },
  "sort": [
    {
      "_score": {
        "order": "DESC"
      }
    },
    {
      "_shard_doc": {
        "order": "ASC"
      }
    }
  ],
  "indices_boost": [
    {
      "documents": 20
    }
  ],
  "pit": {
    "id": "i_vrAwIdc2VjdGlvbnNfMC4xLjBfMjAyMzA3MDQxNDIzMTIWbjR1RmR6NUxSWUs0VGluc1NiNFF0QQAWZU1oWUhhdHNSZFdfMGRqQkJQRV83dwAAAAAAAAAMshZNVjI3V0RiU1JSS1FBS3lxN19rTThnAB5kb2N1bWVudHNfMC4xLjBfMjAyMzA3MDQxNDIzMTEWUFhqNVVIWVBTdFdrQjhsYkJNX09CdwAWZU1oWUhhdHNSZFdfMGRqQkJQRV83dwAAAAAAAAAMsRZNVjI3V0RiU1JSS1FBS3lxN19rTThnAAIWUFhqNVVIWVBTdFdrQjhsYkJNX09CdwAAFm40dUZkejVMUllLNFRpbnNTYjRRdEEAAA==",
    "keep_alive": "60s"
  }
}

And this is the error response:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "The provided expression [documents] matches an alias, specify the corresponding concrete indices instead."
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "The provided expression [documents] matches an alias, specify the corresponding concrete indices instead."
  },
  "status": 400
}

If I remove the parameters related to the PIT:

GET /_search
{
  "query": {
    "match": {
      "name": "1"
    }
  },
  "sort": [
    {
      "_score": {
        "order": "DESC"
      }
    }
  ],
  "indices_boost": [
    {
      "documents": 1
    }
  ]
}

Everything works normally.

It may very well be that I am using the API incorrectly, but this looks like a bug to me (please correct me if I'm wrong).

Steps to Reproduce

  1. Create an index
  2. Create an alias
  3. Create a PIT for that alias
  4. Use that PIT in a search in combination with an indices_boost for that alias.

I've also created a small script that does this:

#!/usr/bin/env bash

# Delete previous index
echo -e "Delete previous index 'index0001'"
curl -X DELETE "localhost:9200/index0001"

# Create Index
echo -e "\n\nCreating index 'index0001'"
curl -XPUT localhost:9200/index0001

# Create Alias
echo -e "\n\nCreating alias 'index'"
curl -XPOST "localhost:9200/_aliases" -H 'Content-Type: application/json' -d'
{
  "actions": [
    {
      "add": {
        "index": "index0001",
        "alias": "index"
      }
    }
  ]
}
'

# Inserting a document
echo -e "\n\nInserting a document"
curl -XPOST "http://localhost:9200/index/_doc/1?refresh"  -H 'Content-Type: application/json' -d'
{
  "name": "1"
}
'

echo -e "\n\nRetrieving PIT ID"
# Get PIT
pit=$(curl -X POST "localhost:9200/index/_pit?keep_alive=1m" | jq -r '.id')

echo -e "PIT ID: $pit"

echo -e "\nSearching without PIT ID"
# Search without PIT
curl -XGET "localhost:9200/_search" -H "Content-Type: application/json" -d'
{
  "query": {
    "match": {
      "name": "1"
    }
  },
  "sort": [
    {
      "_score": {
        "order": "DESC"
      }
    }
  ],
  "indices_boost": [
    {
      "index": 20
    }
  ]
}'

echo -e "\n\nSearching with PIT ID"
# Search with PIT
curl -XGET "localhost:9200/_search" -H "Content-Type: application/json" -d'
{
  "query": {
    "match": {
      "name": "1"
    }
  },
  "sort": [
    {
      "_score": {
        "order": "DESC"
      }
    },
    {
      "_shard_doc": {
        "order": "ASC"
      }
    }
  ],
  "indices_boost": [
    {
      "index": 20
    }
  ],
  "pit": {
    "id": "'"$pit"'",
    "keep_alive": "60s"
  }
}'

echo -e "\n\nDone!"

When I run the script, I get the following output:

Delete previous index 'index0001'
{"acknowledged":true}

Creating index 'index0001'
{"acknowledged":true,"shards_acknowledged":true,"index":"index0001"}

Creating alias 'index'
{"acknowledged":true}

Inserting a document
{"_index":"index0001","_id":"1","_version":1,"result":"created","forced_refresh":true,"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

Retrieving PIT ID
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   173  100   173    0     0  43852      0 --:--:-- --:--:-- --:--:--  168k
PIT ID: x5btAwEJaW5kZXgwMDAxFkxEMkk1X1VJUTVXNXNhalBCNl9KUEEAFmVNaFlIYXRzUmRXXzBkakJCUEVfN3cAAAAAAAAABOgWTFphbVBDVEdROWVBeW5RaEpPQnRldwABFkxEMkk1X1VJUTVXNXNhalBCNl9KUEEAAA==

Searching without PIT ID
{"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":3.6464314,"hits":[{"_index":"index0001","_id":"1","_score":3.6464314,"_source":
{
  "name": "1"
}
}]}}

Searching with PIT ID
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The provided expression [index] matches an alias, specify the corresponding concrete indices instead."}],"type":"illegal_argument_exception","reason":"The provided expression [index] matches an alias, specify the corresponding concrete indices instead."},"status":400}

Done!

Logs (if relevant)

No response

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-search (Team:Search)

jord-nijhuis commented 1 year ago

I've done some digging in the code, and it appears that if I set the ignoreAliases parameter at RestSearchAction.java:408 to false, my script runs:

  final IndicesOptions stricterIndicesOptions = IndicesOptions.fromOptions(
            indicesOptions.ignoreUnavailable(),
            indicesOptions.allowNoIndices(),
            false,
            false,
            false,
            true,
-           true,
+           false, // ignoreAliases
            indicesOptions.ignoreThrottled()
        );

Output:

Delete previous index 'index0001'
{"acknowledged":true}

Creating index 'index0001'
{"acknowledged":true,"shards_acknowledged":true,"index":"index0001"}

Creating alias 'index'
{"acknowledged":true}

Inserting a document
{"_index":"index0001","_id":"1","_version":2,"result":"updated","forced_refresh":true,"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":1}

Retrieving PIT ID
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   173  100   173    0     0  27876      0 --:--:-- --:--:-- --:--:-- 57666
PIT ID: x-aGBAEJaW5kZXgwMDAxFjNPWEU4dzVNU1lTQ0k2TF9VM1g0SmcAFkJQVW9HSVh0Ul9DVW5qWEVTcFdBeUEAAAAAAAAAABcWa1NVYTN6QnVTSGFfQlp5M0lwbmlZQQABFjNPWEU4dzVNU1lTQ0k2TF9VM1g0SmcAAA==

Searching without PIT ID
{"took":1,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":3.6464314,"hits":[{"_index":"index0001","_id":"1","_score":3.6464314,"_source":
{
  "name": "1"
}
}]}}

Searching with PIT ID
{"pit_id":"x-aGBAEJaW5kZXgwMDAxFjNPWEU4dzVNU1lTQ0k2TF9VM1g0SmcAFkJQVW9HSVh0Ul9DVW5qWEVTcFdBeUEAAAAAAAAAABcWa1NVYTN6QnVTSGFfQlp5M0lwbmlZQQABFjNPWEU4dzVNU1lTQ0k2TF9VM1g0SmcAAA==","took":0,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":1,"relation":"eq"},"max_score":null,"hits":[{"_index":"index0001","_id":"1","_score":3.6464314,"_source":
{
  "name": "1"
}
,"sort":[3.6464314,1]}]}}

Done!

That being said, I have no idea why the value was true before, and I have no clue what the implications of setting ignoreAliases to false are. Of course, it is also possible that there is a good reason why PIT and indices_boost cannot be used together (I am very much a novice when it comes to Elasticsearch), but the documentation does not mention this.
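
In the meantime, a possible workaround is to do what the error message asks: put the concrete indices behind the alias into indices_boost instead of the alias itself. Below is a minimal sketch of that idea, not a verified fix. The helper name build_indices_boost is hypothetical, and the sketch assumes that every index behind the alias should receive the same boost and that index names need no JSON escaping. The commented-out curl call assumes a cluster at localhost:9200 and uses the cat aliases API to list the indices.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print an indices_boost JSON array that gives every
# listed concrete index the same boost.
# Usage: build_indices_boost <boost> <index> [<index> ...]
build_indices_boost() {
  local boost="$1"
  shift
  local out="["
  local sep=""
  local idx
  for idx in "$@"; do
    # Index names cannot contain quotes or spaces, so no escaping is done here.
    out="${out}${sep}{\"${idx}\":${boost}}"
    sep=","
  done
  printf '%s]' "$out"
}

# Example with the index from the script above; prints [{"index0001":20}]
build_indices_boost 20 index0001
echo

# Against a live cluster (assumed to be at localhost:9200), the concrete
# indices behind the alias 'index' could be listed one per line and passed in:
#   indices=$(curl -s "localhost:9200/_cat/aliases/index?h=index")
#   build_indices_boost 20 $indices
```

The resulting array can be placed in the "indices_boost" field of the PIT search body in place of the alias entry. Whether this scores identically to boosting the alias in every setup is an assumption I have not verified.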

EDIT: I've updated the script to insert a document.

elasticsearchmachine commented 2 months ago

Pinging @elastic/es-search-foundations (Team:Search Foundations)

Arup-Chauhan commented 1 month ago

@javanna is anyone working on this issue? If not, I would like to take this up, thanks!

marstalk commented 1 month ago

Can't reproduce on the main branch.

vikasrajputin commented 2 weeks ago

Hey @javanna - can I pick this up?