elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.06k stars 24.52k forks source link

Stop accessing System Indices directly in REST tests #62501

Open gwbrown opened 3 years ago

gwbrown commented 3 years ago

As part of formalizing System Indices in Elasticsearch (https://github.com/elastic/elasticsearch/issues/50251), we will be restricting access to System Indices via the REST interface, except by means of purpose-built APIs. For example, accessing .watches directly will not be allowed - all interaction with the .watches index via the REST API should use the purpose-built Watcher APIs.

If it is necessary to access System Indices via the REST API today, an API should be added to fulfill that need rather than using generic APIs to do so.

There are a number of places in our REST tests which access System Indices directly. These tests will be modified by https://github.com/elastic/elasticsearch/pull/60945 to avoid failures by consuming the deprecation warning as necessary, but that is not a long-term solution. The tests below should be evaluated to determine if there is a necessary API that is currently missing, and if so, add that API and convert the test to use it. If not, we should determine an alternate method for testing the functionality which does not directly access System Indices via the REST API.

Tests/test-related methods which currently access System Indices directly via the REST API:

Test Framework

Tasks

Transforms

ML

Watcher

Security

elasticmachine commented 3 years ago

Pinging @elastic/es-core-infra (:Core/Infra/Build)

elasticmachine commented 3 years ago

Pinging @elastic/es-security (:Security/Security)

elasticmachine commented 3 years ago

Pinging @elastic/es-core-features (:Core/Features/Watcher)

elasticmachine commented 3 years ago

Pinging @elastic/ml-core (:ml)

elasticmachine commented 3 years ago

Pinging @elastic/es-distributed (:Distributed/Task Management)

elasticmachine commented 3 years ago

Pinging @elastic/ml-core (:ml/Transform)

hendrikmuhs commented 3 years ago

As for some of the transform tests: Some tests challenge the robustness of the system (e.g. beEvilAndDeleteTheTransformIndex), another verifies mappings. I am not arguing that all transform tests should be kept as they are, but at least some have a good reason(the others should avoid using system indices).

It would be good to have a way to mark single tests as valid case, similar to @SuppressWarnings (that excludes yaml tests, I think that's fine).

Mpdreamz commented 3 years ago

cc @elastic/clients-team this will affect us as well. expand_wildcards won't be allowed to touch system indices anymore either so a new dedicated API to delete all system API's needs to be introduced so we can reliably run yaml rest spec setup/teardown.

gwbrown commented 3 years ago

Some tests challenge the robustness of the system (e.g. beEvilAndDeleteTheTransformIndex)

Validating the behavior of the system in the event of a user doing something they shouldn't like this is a valid use case for either using this parameter and/or ignoring deprecation warnings as necessary for as long as this behavior is allowed. However, our intent is to disable access to System Indices via REST entirely, so these tests would no longer be valid at that time (because the user can no longer be evil and delete the transform index). In that case, we should note that the test case can be removed when it is no longer possible to access System Indices via REST.

another verifies mappings

These cases still run into the same issue - eventually, it will not be possible to access these indices via the general-purpose index APIs. For these, I can think of a few options - we could either move the mapping verification into Elasticsearch and expose it as an endpoint (e.g. POST _ml/verify_index_mappings), or add a special purpose API that returns the mappings of these indices.

droberts195 commented 3 years ago

so we can reliably run yaml rest spec setup/teardown.

This raises a very important question. At the moment all indices are removed in between YAML tests. This is done in the test client. What is the plan for system indices in between YAML tests? Will they eventually not be cleaned up in between YAML tests like user-visible indices? Or will there be a special API to delete all system indices in between YAML tests? Basically, the problem Martijn outlined doesn't just affect the language clients maintained by the clients team but also the test harness used within the Elasticsearch repo, and the solution should be the same for both.

gwbrown commented 3 years ago

System Indices will be wiped between REST tests (both YAML and Java) at the same point they are today. Exactly how we're going to implement this when the allow_system_index_access flag is removed isn't fully nailed down yet, but it's likely to take the form of a special API to delete all system indices, sort of a "factory reset"-type API. In any event, we (the team implementing System Indices functionality, see assignees of https://github.com/elastic/elasticsearch/issues/50251) will handle any API changes necessary to Elasticsearch and the test infrastructure in the Elasticsearch repo to clean up System Indices as they are today and ensure the same mechanism is exposed for the Clients team.

Reworking the tests to not wipe system indices between tests would be a large undertaking, and has not been something we've even seriously considered thus far.

hendrikmuhs commented 3 years ago

I think a "factory reset" API should be added as a must to the meta issue before eventually locking the access to system indices. We see users doing a manual reset by deleting the indices very often in ML. A challenge to that: For a factory reset any jobs(ml/transform/rollup/...) must be stopped. You also might not want to reset everything but only parts. I think it makes sense to have a reset per feature and a global reset that calls all of them.

droberts195 commented 3 years ago

I like the idea of a "factory reset" API. That would make our cleanup in between tests as simple as "call the factory reset API". At the moment we have lots of bits of cleanup code that are written differently for internal cluster tests, REST tests and integration tests that use the node client. If that logic was moved into Elasticsearch server-side then the client-side cleanup code would be trivial. This would also help the language clients in the long run (after the initial pain of switching over). At the moment they have to try to mimic the various bits of cleanup code we run in every language.

A challenge to that: For a factory reset any jobs(ml/transform/rollup/...) must be stopped.

Yes agreed.

To properly "factory reset" would involve more than simply deleting all the system indices. In fact, having a documented API that just deletes all system indices could cause problems. For ML if system indices are deleted before persistent tasks are cancelled then those persistent tasks become hard to deal with. Also, ML has a mix of hidden and system indices, and deleting the system indices while leaving the hidden indices behind will cause weird effects if ML is used again afterwards.

So a factory reset would look something like:

  1. Cancel all persistent tasks
  2. Wait for all pending tasks to complete (so the side effects of cancelling the persistent tasks don't interfere with the index deletion)
  3. Delete all hidden indices whose names begin with .
  4. Delete all system indices
gwbrown commented 3 years ago

@hendrikmuhs @droberts195 I've opened https://github.com/elastic/elasticsearch/issues/62778 to cover that issue, and linked it from the ESRestTestCase.wipeAllIndices point above. Let's continue discussion over there. I'll also add this as a topic in this week's System Indices sync.

mark-vieira commented 3 years ago

FYI, I'm removing the Delivery team label here since the focus here is on specific test implementations not testing infrastructure.

elasticsearchmachine commented 1 year ago

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-security (Team:Security)

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-distributed (Team:Distributed)

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-data-management (Team:Data Management)