oliver006 / elasticsearch-test-data

Generate and upload test data to Elasticsearch for performance and load testing
MIT License

Index options not picked up #29

Open BeardyC opened 3 years ago

BeardyC commented 3 years ago

Hi,

I'm trying to override the index config defaults with the command below:

docker run --rm -it --network host oliver006/es-test-data  --es_url=http://... --batch_size=1000 --num_of_shards=1 --num_of_replicas=2 --index_name=test_data4

However, it seems to ignore the options for shards and replicas. I asked for 2 replicas, but the index reports number_of_replicas as 1:

curl http://.../test_data4/  |  jq
{
  "test_data4": {
    "aliases": {},
    "mappings": {
      "properties": {
        "age": {
          "type": "long"
        },
        "last_updated": {
          "type": "long"
        },
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1621856375498",
        "number_of_shards": "1",
        "number_of_replicas": "1",
        "uuid": "_LBis-bSTmOqmEIi2lnJNQ",
        "version": {
          "created": "7040199"
        },
        "provided_name": "test_data4"
      }
    }
  }
}

Can you confirm that this is working? Am I missing something?
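
For anyone who wants to check this without reading the JSON by eye, here is a minimal sketch that compares the requested values against the index settings returned by the GET call above. The helper name settings_mismatches is hypothetical (it is not part of es-test-data), and the response text is a trimmed copy of the output shown in this comment.

```python
import json


def settings_mismatches(index_info, index_name, expected):
    """Return {setting: (expected, actual)} for every setting that differs.

    Hypothetical helper for illustration; not part of es-test-data.
    """
    actual = index_info[index_name]["settings"]["index"]
    mismatches = {}
    for key, want in expected.items():
        got = actual.get(key)
        # Elasticsearch reports these values as strings, so compare as strings.
        if got != str(want):
            mismatches[key] = (str(want), got)
    return mismatches


# Trimmed copy of the GET /test_data4 response shown above.
response_text = """
{"test_data4": {"settings": {"index": {
    "number_of_shards": "1",
    "number_of_replicas": "1",
    "provided_name": "test_data4"}}}}
"""

result = settings_mismatches(
    json.loads(response_text),
    "test_data4",
    {"number_of_shards": 1, "number_of_replicas": 2},
)
print(result)  # {'number_of_replicas': ('2', '1')}
```

Only number_of_replicas is reported, because the requested shard count (1) happens to match the default that was applied.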

oliver006 commented 3 years ago

Thanks for the question. There's a good chance that the ES options have changed over time and this repo hasn't caught up with how things are done these days.

kbiernat commented 3 years ago

It works fine on old ES (I tried 5.5 first). On newer ES (7.2), the index does not exist yet, so the script should create it. Instead the output is as follows (I added a print of the HTTP error exception, hence the HTTP 400 line):

[I 210603 11:33:30 es_test_data:55] Trying to create index http://localhost:9200/test_data
HTTP 400: Bad Request
[I 210603 11:33:30 es_test_data:61] Looks like the index exists already
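
The log suggests that any HTTP error from the create call is reported as "index exists already", which hides the real cause of the 400. Below is a minimal sketch of how the two cases could be told apart by looking at the error type in the response body. The function name index_already_exists is hypothetical, the sample bodies are illustrative rather than captured from a real cluster, and the error type names are the ones Elasticsearch uses to my knowledge (treat them as an assumption and check them against your version).

```python
import json

# Error types that mean "the index is already there" (assumed names).
ALREADY_EXISTS_TYPES = {
    "resource_already_exists_exception",  # ES 6.x / 7.x
    "index_already_exists_exception",     # ES 5.x and earlier
}


def index_already_exists(status_code, body_text):
    """Return True only if a failed create call reports an existing index.

    Hypothetical helper for illustration; not part of es-test-data.
    """
    if status_code != 400:
        return False
    try:
        error = json.loads(body_text).get("error", {})
    except (ValueError, AttributeError):
        # Body is not JSON, or not a JSON object.
        return False
    if not isinstance(error, dict):
        return False
    return error.get("type") in ALREADY_EXISTS_TYPES


# Illustrative sample bodies, not real cluster output.
exists_body = (
    '{"error": {"type": "resource_already_exists_exception", '
    '"reason": "index already exists"}, "status": 400}'
)
bad_request_body = (
    '{"error": {"type": "parse_exception", '
    '"reason": "unknown key in create index request"}, "status": 400}'
)

print(index_already_exists(400, exists_body))       # True
print(index_already_exists(400, bad_request_body))  # False
```

With a check like this, a malformed request body would be logged as a real error instead of being mistaken for an existing index.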

tuapuikia commented 3 years ago

The refresh option is not available in ES 7, and the schema requires a new index field.

diff --git a/es_test_data.py b/es_test_data.py
index 403d3f2..294ac20 100755
--- a/es_test_data.py
+++ b/es_test_data.py
@@ -43,10 +43,11 @@ def delete_index(idx_name):
 def create_index(idx_name):
     schema = {
         "settings": {
-            "number_of_shards":   tornado.options.options.num_of_shards,
-            "number_of_replicas": tornado.options.options.num_of_replicas
-        },
-        "refresh": True
+            "index": {
+                "number_of_shards":   tornado.options.options.num_of_shards,
+                "number_of_replicas": tornado.options.options.num_of_replicas
+            }
+        }
     }

     body = json.dumps(schema)
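
For reference, the request body that results from the patch above can be sketched as a standalone function. This is only an illustration: build_create_index_body is a hypothetical name, and plain arguments stand in for the tornado.options values used in es_test_data.py.

```python
import json


def build_create_index_body(num_of_shards, num_of_replicas):
    """Build the create-index request body in the shape used by the patch.

    Hypothetical helper; the arguments stand in for
    tornado.options.options.num_of_shards / num_of_replicas.
    """
    schema = {
        "settings": {
            "index": {
                "number_of_shards": num_of_shards,
                "number_of_replicas": num_of_replicas,
            }
        }
        # Note: no top-level "refresh" key any more.
    }
    return json.dumps(schema)


body = build_create_index_body(1, 2)
print(body)
```

The printed body contains only a settings object, with the shard and replica counts nested under index.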

oliver006 commented 3 years ago

That's a great find @tuapuikia - could you open a PR?

ViggoC commented 3 years ago

@tuapuikia According to the Create index API reference, the index field is optional in both the 6.x and 7.x versions.
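
To illustrate the point, here is a small sketch showing that the flat form and the nested form describe the same settings once the optional index wrapper is removed. The helper flatten_index_settings is hypothetical and only mimics the equivalence on the client side; it does not call Elasticsearch.

```python
def flatten_index_settings(settings):
    """Return a copy of settings with any "index" wrapper object removed.

    Hypothetical helper for illustration only.
    """
    flat = {}
    for key, value in settings.items():
        if key == "index" and isinstance(value, dict):
            # Lift the nested settings up one level.
            flat.update(flatten_index_settings(value))
        else:
            flat[key] = value
    return flat


flat_form = {"number_of_shards": 1, "number_of_replicas": 2}
nested_form = {"index": {"number_of_shards": 1, "number_of_replicas": 2}}

print(flatten_index_settings(flat_form) == flatten_index_settings(nested_form))  # True
```

If the two forms really are interchangeable, the part of the patch that matters would be the removal of the refresh key rather than the added index wrapper, though that is an inference from this thread and not something confirmed here.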