elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
1.22k stars 24.85k forks source link

Yaml Integration tests use deprecated paths #45636

Open brusic opened 5 years ago

brusic commented 5 years ago

This issue revolves around the actual code against the master branch and not using Elasticsearch in general.

While working on a PR, I kept on experiencing errors during integration tests with tests involving action paths that were deprecated as part of the PR.

Here is an example of a deprecated path

{
  "get_script":{
    "url":{
      "paths":[
        {
          "path":"/_script/{id}",
          ...
        },
        {
          "path":"/_scripts/{id}",
          ...
          "deprecated":{
            "version":"8.0.0",
            "description":"The path is now singular"
          }
        }
      ]
    }
  }
}

Using LangPainlessClientYamlTestSuiteIT/painless/16_update2 as example. The test returns:

Failure at [painless/16_update2:12]: got unexpected warning header [
  299 Elasticsearch-8.0.0-SNAPSHOT-7610ef3242c5e63f8bd7545be5e2addb0feac688 "[GET /_scripts/{name}] is deprecated! Use [GET /_script/{id}] instead."
]

Removing the deprecated path in the json definition will fix the issue. Took me a while to figure out the issue. :) The deprecation definitions are useful and I would like to submit complete PRs with those definitions. Might be something I could fix, but Elastic PRs are never quick and I would like to finish the original PR first.

The JSON schema was updated today, but this issue occurred even before the change.

brusic commented 5 years ago

Perhaps not related, but the RestController is just screwy under integration tests.

"Create things in the cluster state that we'll validate are there after the upgrade":
  - do:
      put_script:
        id: test_search_template
        body:
          script:
            lang: mustache
            source:
              query:
                match:
                  f1: "{{f1}}"

results in:

WARNING: request [POST http://[::1]:55440/_script/test_search_template?error_trace=true] returned 1 warnings: [299 Elasticsearch-7.4.0-SNAPSHOT-a635eca5f83a7bbddc87f3640ad827294aabeeb0 "[types removal] Specifying types in document index requests is deprecated, use the typeless endpoints instead (/{index}/_doc/{id}, /{index}/_doc, or /{index}/_create/{id})."]

InvalidIndexNameException[Invalid index name [_script], must not start with '_', '-', or '+']

No issues with those endpoints running the server.

javanna commented 5 years ago

cc @elastic/es-clients

Howdy @brusic , what do you suggest we do about this problem you encountered with deprecated paths? Deprecated paths are still there and should be tested, what gets annoying is that we return deprecation warnings and in our tests we assert that there are no warnings, hence you need to kind of catch the warnings in your test. Would that work?

elasticmachine commented 5 years ago

Pinging @elastic/es-core-infra

brusic commented 5 years ago

I am in complete agreement that even deprecated paths should be tested, which is why I added the deprecation in the definition.

hence you need to kind of catch the warnings in your test

These are not my tests, but the existing tests in the elasticsearch codebase. The example above uses LangPainlessClientYamlTestSuiteIT. Any existing spec that puts a script will fail

  - do:
      put_script:
        id: "1"
        body: { "script": {"lang": "painless", "source": "_score * doc['myParent.weight'].value" } }
- match: { acknowledged: true }

I have not looked into the testing code or the way the REST endpoints are accumulated. When the specs look up the correct path for the appropriate action such as put_script, it seems multiple can be returned using the arguments provided. For id and body, it can be either /_script/{id} or /_scripts/{id}. If two or more are defined, drop the deprecated ones. If only one is defined and is deprecated, continue and catch the warning farther down, as it is occurring now.

brusic commented 5 years ago

Said I was not going to write a fix, but I'm doing it. The issue is mostly here

ClientYamlSuiteRestApi.Path path = RandomizedTest.randomFrom(bestPaths);

The best path should be among the non-deprecated ones. The integration tests are not attempting every path, just the "best" random one.

javanna commented 5 years ago

The integration tests are randomizing between all of the matching paths, based on the parameters that are provided. If we remove the deprecated ones from there, we end up not testing deprecated paths, it's like if they are not there?

brusic commented 5 years ago

Shouldn't there be isolated tests against the paths? In this case, the paths are not the focus of the tests, but the painless scripts are.

A better question would be: how can we can deprecate a path without breaking tests?

brusic commented 5 years ago

Will not create a PR since it sounds like the use of deprecated paths is wanted, but this makes the tests work

https://github.com/elastic/elasticsearch/compare/master...brusic:fix/best-non-deprecated-path?expand=1

Perhaps an allowed deprecated context can be added to the YAML Test client. Not going to work on this issue more for now.

brusic commented 5 years ago

At first I thought the issue was failing on tests whose code did not change. Since the GET/SCRIPT endpoints changed, they are used by multiple tests. But how come the removal of types do not trigger errors. It is because they are not marked as deprecated! In the original PR, I am fine with marking things as deprecated in the documentation and Javadoc, but leave the endpoints alone.

jaymode commented 4 years ago

The Core/Infra team discussed this issue in a team meeting. If the test client picks a deprecated path, then we need to ensure that the deprecation warning is asserted in the test. This validates that the path is actually deprecated and returning a warning while preventing developers from having to deal with issues like this one.

williamrandolph commented 4 years ago

I've taken another run at this issue and I think it needs another round of discussion. Let me lay out what I've learned.

Current Situation

In the YAML REST API tests, there may be multiple valid paths for a particular API. Most often, these multiple paths have different path parameters (or path "parts"): for example, /_nodes/hot_threads and /_nodes/{node_id}/hot_threads. The YAML test will choose a path based on how many path parameters the test provides and whether the names match. If multiple paths for an API have the same number of path parameters and same names for those parameters, then a path will be randomly chosen. Our YAML syntax doesn't offer a way to override the path for the API.

The troublesome edge case appears when an API has two paths with the same number of path parameters and the same names for those parameters, but one path is deprecated and the other one isn't. The Hot Threads API spec currently has an example of this problem on the 7.x branch. The non-deprecated /_nodes/hot_threads endpoint has no parameters, so it matches three deprecated endpoints:

This doesn't cause any test failures because there is no YAML rest test for this endpoint. This is probably because the hot threads endpoint returns plain text and not JSON, so the REST YAML setup can't do anything to verify the output. However, we can add a minimal test that triggers the issue with the current API spec definition:

"hot_threads test":
  - do:
      nodes.hot_threads: {}

Seventy-five percent of the time, this "test" will choose a deprecated endpoint and fail because of an unexpected warning.

There is a workaround if the developer doesn't want to change the rest API specification. Since #53139, a developer can add an allowed_warnings block to the test. A reliably passing test using this workaround would look like this:

setup:
  - skip:
      features: ["allowed_warnings"]
---
"hot_threads test":
  - do:
      nodes.hot_threads: {}

      allowed_warnings:
        - "[/_cluster/nodes/hot_threads] is a deprecated endpoint. Please use [/_nodes/hot_threads] instead."
        - "[/_cluster/nodes/hotthreads] is a deprecated endpoint. Please use [/_nodes/hot_threads] instead."
        - "[/_nodes/hotthreads] is a deprecated endpoint. Please use [/_nodes/hot_threads] instead."

That, then, is the status quo.

Why is this problem rare?

I reviewed all of our classes that define instances of either DeprecatedRoute or ReplacedRoute. In almost all cases, the deprecated path is no longer present in the REST API json spec. For example, when we removed the _xpack URI base from Machine Learning in #36315, we changed the path definitions rather than including new ones alongside the deprecated ones. When we renamed "data frame" to "transform", we treated the deprecated APIs as distinct from the new ones in the JSON spec. For example, GET /_data_frame/transforms is defined in data_frame_transform_deprecated.get_transform.json, while GET /_transform is defined in transform.get_transform.json. A test writer can then invoke these paths separately, though in point of fact I don't believe we do test the deprecated data frame APIs.

Are we able to test deprecated paths at all? Yes — it's almost trivial to do so by using Java rather than YAML, testing against the RestAction class directly by extending RestActionTestCase. But this tests deprecated paths as defined in the Java source rather than deprecated paths as defined in the REST API spec.

Is it important to define "replaced" paths in the REST API spec?

I can see the case for including API paths that are deprecated without replacement in our REST API spec. As long as the API is there, we should be able to test that it works.

But is there an advantage to having deprecated and replaced paths in the JSON spec? One would hope that a client generated from the REST spec would ignore the deprecated path in case of the replaced one. The only thing I can imagine is that we might want the path there for the sake of generated documentation.

Options

Here are some things we could do to resolve this issue.

  1. Nothing. Leave it alone and let allowed_warnings be the 80% solution. Test deprecated paths explicitly in Java code.
  2. Remove all deprecated-but-replaced paths from the JSON API spec, if they serve no purpose. Downside: if this is all we do, the problem could creep back in.
  3. Detect "mixed path deprecation cases" and react automatically. Reaction options: a. Throw an informative exception and fail any test that would trigger this issue b. Log a warning that points the dev towards the allowed_warnings workaround c. Always choose non-deprecated paths when possible, as proposed earlier in the thread.
  4. Choose non-deprecated paths by default, but add a flag that lets a test writer prefer deprecated paths in order to test warnings explicitly. I have tried to do this, and what I have is messier than I want it to be.

My preference is (3c): always prefer non-deprecated paths when choosing a best path. We could log a warning in this case. This lets us leave the "mixed deprecation" cases in the REST API JSON spec to support generated documentation.

However, if we decide that we should never have "mixed deprecations" in the JSON spec, I believe we should fail fast with an informative message.

williamrandolph commented 4 years ago

Discussed this in 27 May's team meeting. We are going to seek feedback from the Clients team, since their projects consume the REST API spec, and depending on their feedback we will either fail on the ambiguous edge case or log an informative warning.

williamrandolph commented 4 years ago

A quite important point that was raised in the meeting is that we can test arbitrary paths explicitly in our YAML tests, so we can still verify deprecation warnings for specific paths if we need to.

williamrandolph commented 4 years ago

I spoke with the Clients team and got some pushback on the idea of removing paths from the spec. They prefer to leave the deprecated-and-replaced paths in the spec as a kind of documentation.

Their code generators effectively do 3c:

We follow a similar fall through approach as proposed in the thread to pick the best non deprecated path in case of 2 urls with the same path parameters.

@Mpdreamz had the idea of a request block instead of a do block. This would be similar to the raw requests that are parsed out of our documentation examples. The docs tests use a special subclass to create "raw" API requests against a particular path, so it's not something we can support in our current rest test syntax — though I don't think it would be difficult to add it.

@jakelandis pointed out that testing deprecated paths may be important for compatibility mode. Whatever we do, we need to make sure that it works for the versioned API effort.

Since this question intersects with a few teams' interests, I will put it to a wider group and report back.

elasticsearchmachine commented 6 months ago

Pinging @elastic/es-delivery (Team:Delivery)