This is a follow-up to #18349 (Client side job validation for v6.3), where the general validation framework and initial checks were implemented. This issue tracks follow-up work for job validation.
Framework
[ ] Add a `skip` option to the API endpoint to skip certain checks (see the payload sketch after this list)
[x] Job validation should be more helpful in explaining why checks passed #19068
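A minimal sketch of what a request payload with such a skip option could look like; the `skipChecks` field name and the check ids are made up for illustration and are not part of the current API:

```ts
// Hypothetical payload shape for the validation endpoint; `skipChecks` and the
// check ids are illustrative only, not the actual API.
interface ValidateJobPayload {
  job: object;            // the job configuration to validate
  skipChecks?: string[];  // ids of checks the caller wants skipped
}

const payload: ValidateJobPayload = {
  job: {
    job_id: 'example-job',
    analysis_config: { bucket_span: '15m', detectors: [] },
  },
  skipChecks: ['cardinality', 'bucket_span_estimation'],
};
```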
Checks
[ ] Revisit the cardinality evaluation code and improve the logic that determines the cardinality of `by`/`partition`/`over` fields
[ ] Check the cardinality of fields, including a model memory estimation (see the sketch after this list)
[ ] Check if the bucket span differs significantly from the estimated bucket span
[ ] Check the sparseness of the data
  - if sparse, suggest a sparse-aware function (overlaps with the bucket span estimator, which would suggest a longer bucket span)
[x] If using scripted fields, don't report them as not being aggregatable #21205
[ ] If using scripted fields, warn that it is not possible to display the anomaly charts
[ ] Check for a `summary_count_field`
  - if the metric is a non-zero integer and a `sum` function is used, then perhaps this is actually a `summary_count_field`
[ ] Check for a mix of detectors
  - if the job contains both rare and metric detectors, warn that you might get better results by splitting it into two jobs (tbc, analysis pending)
  - if the job has many different `over` fields, warn that you might get better results by splitting into multiple jobs
[ ] Check if the selected timespan contains any data and/or if there's additional data outside the selected timespan
[ ] Check if index names are suitable for ML analysis (e.g. prefix wildcards)
[ ] Check if summary count field is numeric, see #19114
[ ] Check if both `categorization_filters` and a `categorization_analyzer` are configured. If so, the message could be "Categorization filters are not permitted with a categorization analyzer. Instead add a char_filter within the categorization_analyzer." (see the sketch after this list)
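For the cardinality items above, a rough sketch of how the check could gather per-field cardinality via an aggregation; the `search` callback and the threshold are assumptions, not the existing implementation:

```ts
// Sketch only: evaluate the cardinality of the detectors' split fields.
// `search` stands in for a wrapper around the Elasticsearch search API and the
// threshold is a placeholder, not a real model memory limit.
type SearchFn = (params: { index: string; body: object }) => Promise<any>;

export async function getHighCardinalityFields(
  search: SearchFn,
  index: string,
  splitFields: string[], // by/partition/over fields taken from the detectors
  threshold = 10000      // hypothetical warning threshold
): Promise<string[]> {
  if (splitFields.length === 0) {
    return [];
  }

  // One cardinality aggregation per split field, no hits needed.
  const aggs: Record<string, object> = {};
  splitFields.forEach((field) => {
    aggs[field] = { cardinality: { field } };
  });

  const resp = await search({ index, body: { size: 0, aggs } });

  // Flag fields whose approximate cardinality exceeds the threshold, since high
  // cardinality drives up the number of models and the memory they need.
  return splitFields.filter(
    (field) => (resp.aggregations?.[field]?.value ?? 0) > threshold
  );
}
```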
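For the last item, a minimal sketch of the proposed categorization check; the config and message shapes are assumed for illustration:

```ts
// Sketch only: warn when both categorization_filters and a
// categorization_analyzer are configured. Field names follow the ML job
// schema; the message shape is an assumption.
interface AnalysisConfig {
  categorization_filters?: string[];
  categorization_analyzer?: object;
}

interface ValidationMessage {
  id: string;
  status: 'error' | 'warning' | 'success';
  text: string;
}

export function checkCategorizationConfig(config: AnalysisConfig): ValidationMessage[] {
  const messages: ValidationMessage[] = [];
  if (
    config.categorization_filters &&
    config.categorization_filters.length > 0 &&
    config.categorization_analyzer !== undefined
  ) {
    messages.push({
      id: 'categorization_filters_with_analyzer',
      status: 'error',
      text:
        'Categorization filters are not permitted with a categorization analyzer. ' +
        'Instead add a char_filter within the categorization_analyzer.',
    });
  }
  return messages;
}
```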
Additional Checks (take over from #18074)
[ ] Estimate resource usage (see the sketch below)
  - would be good to do, although it will only be an estimate based on the data seen
  - if there is high cardinality, a low bucket_span, many detectors/influencers, and depending on the function used, we can warn that we expect the job to be a resource intensive one
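A rough sketch of the kind of heuristic this could use; the thresholds below are placeholders rather than a real model memory estimate:

```ts
// Sketch only: flag configurations that are likely to be resource intensive.
// The thresholds are placeholder values, not derived from real measurements.
interface ResourceEstimateInput {
  maxSplitFieldCardinality: number; // highest cardinality among by/partition/over fields
  bucketSpanSeconds: number;
  detectorCount: number;
  influencerCount: number;
}

export function isLikelyResourceIntensive(input: ResourceEstimateInput): boolean {
  const { maxSplitFieldCardinality, bucketSpanSeconds, detectorCount, influencerCount } = input;

  // Roughly one model per split field value and per detector; influencers add
  // further bookkeeping. Short bucket spans mean more buckets to process.
  const approxModelCount =
    Math.max(1, maxSplitFieldCardinality) * Math.max(1, detectorCount) + influencerCount;
  const bucketsPerDay = (24 * 60 * 60) / bucketSpanSeconds;

  return approxModelCount > 10000 || (approxModelCount > 1000 && bucketsPerDay > 288);
}
```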
Finally, we could provide an example of the sort of results to expect. This is already somewhat covered by the simple job wizards but is lacking from the advanced job configuration. We can provide both pictorial and language descriptions of the analysis.
e.g. language descriptions (pseudo config); see the sketch after these examples:

- Models the sum(bytes) for each Host
- Detects unusual behavior for a Host compared to its own past behavior
- Gives greater significance if many Hosts are unusual together

or

- Models the sum(bytes) for the population of Hosts
- Detects unusual behavior for a Host compared to the past behavior of the population
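A small sketch of how such descriptions could be generated from a detector configuration; the detector fields follow the ML job schema, but the function itself and its phrasing are illustrative:

```ts
// Sketch only: turn a detector configuration into the kind of language
// description shown above. The exact phrasing is illustrative.
interface Detector {
  function: string;
  field_name?: string;
  by_field_name?: string;
  over_field_name?: string;
}

export function describeDetector(d: Detector): string[] {
  const metric = d.field_name ? `${d.function}(${d.field_name})` : d.function;

  if (d.over_field_name) {
    return [
      `Models the ${metric} for the population of ${d.over_field_name} values`,
      `Detects unusual behavior for a ${d.over_field_name} compared to the past behavior of the population`,
    ];
  }
  if (d.by_field_name) {
    return [
      `Models the ${metric} for each ${d.by_field_name}`,
      `Detects unusual behavior for a ${d.by_field_name} compared to its own past behavior`,
    ];
  }
  return [`Models the ${metric} over all documents`];
}

// Example:
// describeDetector({ function: 'sum', field_name: 'bytes', by_field_name: 'host' })
// -> ['Models the sum(bytes) for each host',
//     'Detects unusual behavior for a host compared to its own past behavior']
```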