elastic / elasticsearch-py

Official Python client for Elasticsearch
https://ela.st/es-python
Apache License 2.0
4.22k stars 1.18k forks source link

Deprecation warnings in 7.15.0 pre-releases #1698

Closed sethmlarson closed 2 years ago

sethmlarson commented 3 years ago

If you're seeing this you likely received a deprecation warning from a 7.15.0 pre-release. Thanks for trying out in-development software

The v8.0.0 roadmap includes a list of breaking changes that will be implemented to make the Python Elasticsearch client easier to use and more discoverable. To make the upgrade from the 7.x client to the 8.0.0 client as smooth as possible for developers we're deprecating options and usages that will be removed in either 8.0 or 9.0.

This also means that you'll get an early preview for the great things to come in the 8.0.0 client starting in 7.x, which we're pretty excited about!

Which APIs are effected?

All APIs will emit deprecation warnings for positional argument use in 7.15.0.

The following APIs will start emitting deprecation warnings regarding the body parameters. This list may change leading up to the 7.15.0 release.

The following APIs will start emitting deprecation warnings regarding doc_type parameters.

What is being deprecated?

Starting in 7.15.0 the following features will be deprecated and are scheduled for removal in 9.0.0:

Positional arguments for APIs are deprecated

Using keyword arguments has always been recommended when using the client but now starting in 7.15.0 using any positional arguments will emit a deprecation warning.

# ✅ Supported usage:
es.search(index="index", ...)

# ❌ Deprecated usage:
es.search("index", ...)

Update: the body parameter for APIs is no longer deprecated as of elasticsearch-py 8.12

For JSON requests each field within the top-level JSON object will become it's own parameter of the API with full type hinting

# ✅ New usage:
es.search(query={...})

# ✅ Accepted usage:
es.search(body={"query": {...}})

body can also be bytes as long as they're valid JSON, which can be useful as an optimization to avoid serialization by elasticsearch-py.

You should not mix parameters with body:

# ❌ Deprecated usage:
es.search(body={"query": {...}}, from_=10)
# ✅ Correct usages:
es.search(query={...}, from_=10)
es.search(body={"query": {...}, "from": 10})

Or use aliased names in body:

# ❌ Deprecated usage:
es.search(body={"query": {...}, "from_": 10})
# ✅ Correct usage:
es.search(body={"query": {...}, "from": 10})

For non-JSON requests or requests where the JSON body is itself an addressable object (like a document) each API will have the parameter renamed to a more semantic name:

# ✅ New usage:
es.index(document={...})

# ✅ Accepted usage:
es.index(body={...})

However, you can't use both document and body.

The doc_type parameter for non-Index APIs

Using doc_type for APIs that aren't related to indices or search is deprecated. Instead you should use the type parameter. See https://github.com/elastic/elasticsearch-py/pull/1713 for more context for this change.

For APIs that are related to indices or search the doc_type parameter isn't deprecated client-side, however mapping types are deprecated in Elasticsearch and will be removed in 8.0.

# ✅ New usage:
es.nodes.hot_threads(type="cpu")

# ❌ Deprecated usage:
es.nodes.hot_threads(doc_type="cpu")
sethmlarson commented 2 years ago

Updated this document with the body parameter's new schedule for removal in 9.0 instead of 8.0.

joshdevins commented 2 years ago

For index create, the deprecation of body is actually quite inconvenient. It's pretty frequently the case that a config file used to create an index just contains everything needed. E.g.

{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "id": {
        "ignore_above": 1024,
        "type": "keyword"
      },
      "body": {
        "type": "text",
        "analyzer": "english"
      }
    }
  }
}

This allows me to have a single file per index and just load the config file straight into create.

es.indices.create(index=index, body=load_json(config_fname))

With this change, I need to break that config file up and use a file per top-level attribute, or split the body into its' constituent parts after loading it, just to pass it into create.

config = load_json(config_fname)
es.indices.create(index=index, settings=config["settings"], mappings=config["mappings"])

This seems like an unwanted side-effect for create at least, and I wonder if we should leave body in create.

sethmlarson commented 2 years ago

@joshdevins The case of "make a JSON body an API request" is likely to be applicable elsewhere too (ingest.put_pipeline, transform.put_transform, search?) A raw JSON HTTP body is a transport-layer concept compared to "this is the index's settings" and "this is the index's mapping" being API-layer concepts.

The goal of moving away from params and body is for the client to become more than "named URLs": being able to catch issues with your request before it's sent to Elasticsearch (required parameters, parameter types, using API structures and enums) which means encoding Elasticsearch API information into function signatures.

I know usage of body is pervasive so I'm keeping the option to use body and params for almost all APIs in 8.x (search_template uses the params keyword) and when using these options the individual values will be picked out and expanded so they go through the same processing as if you'd passed the parameters directly:

client.indices.create(index="...", body={"settings": ..., "mappings": ...})
# in 8.0 will effectively be the same as:
client.indices.create(index="...", settings=..., mappings=...)

The downside of above is a DeprecationWarning will be issued for using body=....

If you have a JSON body and the API encodes parameters into the body you can also do:

client.indices.create(index="...", **raw_json)

This is effectively what the rewriting of body does for you, except here you're doing it yourself and avoiding a DeprecationWarning. By doing this you're basically blessing the JSON that you have as describing a request to Elasticsearch rather than a raw JSON body (since you can add other parameters to the JSON too like wait_for_active_shards, etc).

What are your thoughts on this?

joshdevins commented 2 years ago

Hey @sethmlarson thanks for the detailed response! I think this makes sense and I'm glad we can document the discussion here as well. The splat [1] is probably the best advice for going forward in the case where I have a whole config file ready to go 👍

[1]

client.indices.create(index="...", **raw_json)
nkadochn commented 2 years ago

I am struggling to understand how to convert bool and agg queries to the new API format.

Bool example:

body = {
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "title": "city"
                    }
                },
                {
                    "match": {
                        "title": "chicago"
                    }
                }
            ]
        }
    },
       "sort": {
        "crawled_date": {
            "order": "desc"
        }
    }
}

res = es.search(index=es_index_name, body=body)

Agg example:

body = {
    "size": 0,
    "aggs": {
        "my_agg_name": {
            "terms": {
            "size": 100,
                "field": "language"
            }
        }
    }
}

res = es.search(index=es_index_name, body=body)

Get indexes example:

for index in es.indices.get('*'):
    if index[0:1] != '.':
        print(index)
buratinopy commented 2 years ago

I am struggling to understand how to convert bool and agg queries to the new API format.

Bool example:

body = {
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "title": "city"
                    }
                },
                {
                    "match": {
                        "title": "chicago"
                    }
                }
            ]
        }
    },
       "sort": {
        "crawled_date": {
            "order": "desc"
        }
    }
}

res = es.search(index=es_index_name, body=body)

Agg example:

body = {
    "size": 0,
    "aggs": {
        "my_agg_name": {
            "terms": {
            "size": 100,
                "field": "language"
            }
        }
    }
}

res = es.search(index=es_index_name, body=body)

Get indexes example:

for index in es.indices.get('*'):
    if index[0:1] != '.':
        print(index)

https://elasticsearch-py.readthedocs.io/en/v8.0.0a1/api.html#elasticsearch.Elasticsearch.search

sethmlarson commented 2 years ago

@nkadochn These examples should work:

es.search(
  index=es_index_name,
  query={"bool": {...}},
  sort={"crawled_date": {"order": "desc"}}
)
es.search(
  index=es_index_name,
  aggs={...},
  size=0
)
for index in es.indices.get(index="*"):
    ...
schettino72 commented 2 years ago

I am not able to use new API on mget when passing parameter ids

TypeError: mget() got an unexpected keyword argument 'ids'
sethmlarson commented 2 years ago

@schettino72 Can you open a separate issue including your elasticsearch-py package version?

schettino72 commented 2 years ago

@schettino72 Can you open a separate issue including your elasticsearch-py package version?

1856

Hoodwinker13 commented 2 years ago

I think I am missing something from what I have read. This is my current code, but it says I get an error at my print statement, and I am not sure how to fix it. params = { 'index' : index_name, 'body' : { 'settings' : { 'number_of_shards' : 5, }, 'mappings' : { 'mask_completion' : { 'properties' : { 'name' : {'type':'completion'}, } }, }, }, }

    `if self.es.indices.exists(index=index_name):
        self.es.indices.delete(index=index_name)
    print(self.es.indices.create(**params))`
kristall commented 2 years ago

I wonder why analyze will still accept the body param then?

kristall commented 2 years ago

@Hoodwinker13 your params still contain the body key ... try this:

params = {'settings': {'number_of_shards' : 5}, 'mappings': {'mask_completion': {'properties': {'name': {'type':'completion'},}},},},} and print(self.es.indices.create(index='index_name', **params))

Hoodwinker13 commented 2 years ago

@kristall Here is the code that I have fixed. But I am still getting an error. params = { 'settings': { 'number_of_shards' : 5 }, 'mappings': { 'mask_completion': { 'properties': { 'name': {'type':'completion'}, } }, }, } if self.es.indices.exists(index=index_name): self.es.indices.delete(index=index_name) print(self.es.indices.create(index = index_name, **params))

This is the error that I get. code_error

sethmlarson commented 2 years ago

@Hoodwinker13 Could you copy and paste the complete traceback as text into a GitHub comment? Can't see the full error in the picture.

kristall commented 2 years ago

Just got up, and not yet thinking straight, so I deleted my comments, sry.

kristall commented 2 years ago

is index_name a string? if not that one must be in qoutes...

print(self.es.indices.create(index='index_name', **params))

Hoodwinker13 commented 2 years ago

index_name is a string. I have tried print(self.es.indices.create('index' = index_name, **params)) but that gives me an error in the 'index' part.

Hoodwinker13 commented 2 years ago

@sethmlarson This is the whole picture of the error that I receive. code error2

kristall commented 2 years ago

yes, using ' around the paramname was a mistake of mine. sry for that.

kristall commented 2 years ago

@Hoodwinker13 And from your posted trackback it seems there is something wrong with your mapping: Root mapping definition has unsupported parameters

I'd build a simple working base and then add parameters until it breaks, so you know which...

kristall commented 2 years ago

I wonder why analyze will still accept the body param then?

@sethmlarson I'd still be happy about an answer.

And after an initial, "grrr more work", I actually like the changes a lot.

sethmlarson commented 2 years ago

@kristall From your screenshots it looks like you're using a document type (mask_data?) these are deprecated in 7.x and removed in 8.0 of Elasticsearch. Try structuring your mapping like so:

mapping={
  "properties": {
    "error_10": {"type": "keyword"},
    ...
  }
}
sethmlarson commented 2 years ago

Going to close and lock this issue as 8.0 is now released. Any issues should be opened separately.