elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[ML] Solutions: Request for mock ML job data for testing #76660

Open · alvarezmelissa87 opened this issue 3 years ago

alvarezmelissa87 commented 3 years ago

Request for mock ML job data for use in plugins integrating with ML.

As we work toward being more Solutions-oriented, more plugins are integrating with ML. It would be great to provide mock data for plugins to use when they create their own functional tests; there is no need for them to go through the whole ML job flow, as that is already covered by the ML plugin tests.

cc @pheyos

elasticmachine commented 3 years ago

Pinging @elastic/ml-ui (:ml)

jasonrhodes commented 3 years ago

Hello! When we want to test anomalies and other ML integrations in Observability, we often just want to make sure our graphs display correctly given expected data. To that end, it would be great to have access to fixtures of well-tested, maintained API response data for the various endpoints that we as solutions interact with, so we can use those instead of setting up real jobs, waiting for them to produce results, etc.
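
As a rough illustration of the first use case, here is a minimal sketch of such a response fixture being used in a jest test. The fixture shape, field names, and endpoint path are all hypothetical, not the actual ML plugin contract:

```ts
// Hypothetical fixture for an ML anomaly-results API response.
// All field names and values are illustrative.
export const anomalyResultsFixture = {
  records: [
    {
      job_id: 'logs-anomaly-test',
      timestamp: 1598918400000,
      record_score: 86.4,
      partition_field_value: 'host-01',
    },
  ],
};

it('renders anomalies from fixture data', async () => {
  // Stub the HTTP layer so the test never talks to a live cluster.
  const http = { post: jest.fn().mockResolvedValue(anomalyResultsFixture) };
  const response = await http.post('/internal/ml/anomaly_results'); // hypothetical path
  expect(response.records).toHaveLength(1);
});
```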

In addition, it would be nice to have document templates of some kind, or some other way to write data into the right indices so that these API endpoints produce the desired results, for times when we want to set up a cluster that is already in a given state rather than hoping and/or waiting for ML jobs to produce that state reliably.
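
And a hedged sketch of the second use case: writing documents straight into an index with the Elasticsearch JS client. The index name and document shape mirror ML anomaly record results, but both are owned by the ML plugin and should be verified against the target stack version:

```ts
import { Client } from '@elastic/elasticsearch';

async function injectAnomalyRecord() {
  const client = new Client({ node: 'http://localhost:9200' });

  // v7-style client call; v8 clients take `document` instead of `body`.
  await client.index({
    index: '.ml-anomalies-shared', // assumed results index; verify per version
    body: {
      job_id: 'logs-anomaly-test',
      result_type: 'record',
      timestamp: Date.now(),
      bucket_span: 900,
      detector_index: 0,
      is_interim: false,
      probability: 0.000013,
      record_score: 86.4,
      initial_record_score: 86.4,
    },
  });
}
```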

I think these are two different use cases and I'm happy to provide more information on both! Thanks!!

jasonrhodes commented 3 years ago

Update: since there doesn't appear to be an API for retrieving anomalies, the mock data is probably not going to be helpful for our anomaly cases. We appear to be using the mlAnomalySearch method, which looks like it is mostly a thin wrapper around the ES client, so we are probably going to need document templates instead, so that we can fill ES with anomaly data.

weltenwort commented 3 years ago

For added context, we are already able to inject anomaly data by manually loading documents into the results index. It would be nice, though, to somehow generate them such that they are guaranteed to be consistent.
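
One possible shape for such a generator, sketched here with hypothetical names: consistency comes from snapping timestamps to bucket boundaries and deriving the score fields from a single input, so hand-written documents can't drift out of sync:

```ts
// Sketch of a generator for internally consistent anomaly records.
// Field names mirror ML record documents but are not an official contract.
interface AnomalyRecordInput {
  jobId: string;
  time: number;         // epoch ms; snapped to a bucket boundary below
  bucketSpanSec: number;
  recordScore: number;  // clamped to 0..100
}

function makeAnomalyRecord({ jobId, time, bucketSpanSec, recordScore }: AnomalyRecordInput) {
  const bucketSpanMs = bucketSpanSec * 1000;
  const bucketStart = Math.floor(time / bucketSpanMs) * bucketSpanMs;
  const score = Math.min(100, Math.max(0, recordScore));
  return {
    job_id: jobId,
    result_type: 'record',
    timestamp: bucketStart,
    bucket_span: bucketSpanSec,
    detector_index: 0,
    is_interim: false,
    record_score: score,
    initial_record_score: score,
  };
}
```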

An even bigger problem is getting the jobs themselves into the desired state for testing. This includes, but is not limited to, …

jgowdyelastic commented 3 years ago

cc @pheyos

sophiec20 commented 3 years ago

> It would be nice, though, to somehow generate them such that they are guaranteed to be consistent.

For any ML mock data exercise to have a chance of being effective, we would need mock data coming out of the agent(s) so we can close the circle. Do we know if this is available?

weltenwort commented 3 years ago

In the case of logs we can generate documents or load them from a fixture.
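
For example, a small hypothetical generator for synthetic log documents (field names follow ECS conventions, but everything here is illustrative):

```ts
// Generate evenly spaced synthetic log documents for a job to consume.
function makeLogDocs(count: number, startMs: number, intervalMs: number) {
  return Array.from({ length: count }, (_, i) => ({
    '@timestamp': new Date(startMs + i * intervalMs).toISOString(),
    message: `synthetic log line ${i}`,
    'event.dataset': 'test.logs',
    'log.level': i % 50 === 0 ? 'error' : 'info', // inject occasional errors
  }));
}
```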

pheyos commented 3 years ago

Thanks for the feedback! We had a few discussions around this topic; here's a quick summary:

jasonrhodes commented 3 years ago
> We don't want to mock API responses in end-to-end tests for a similar reason: the real API response could change, and the tests wouldn't catch that because they would still be running against the old mock data.

I understand this concern, but I wonder if it may be time to treat our Kibana APIs as more than hidden implementation details, and subject them to some amount of backwards-compatible versioning. I know that in Logs and Metrics we will need to run tests that don't rely on running ML jobs. The idea behind this ticket, from our end, was to avoid this very problem of stale data, because the mocks would be controlled by the ML team directly and updated as part of the overall process.