If an anomaly detection job with a categorizer hits its hard memory limit, it cannot create new categories.
If the categorizer is configured with stop_on_warn: true, the events that would have formed a new category are not passed downstream to be analysed.
Therefore, the event count in the final bucket result does not include the events skipped by the categorizer.
The Elasticsearch delayed data detector then identifies these bucket results as missing data, because the bucket event count does not match the number of documents submitted to the process.
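
The mismatch described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the real categorizer: `ToyCategorizer`, `accept` and `bucket_event_count` are hypothetical names, categories are reduced to exact string keys, and the behaviour with stop_on_warn: false (events are still passed downstream) is an assumption made for contrast.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCategorizer:
    """Hypothetical stand-in for the categorizer; not the real implementation."""

    stop_on_warn: bool
    hard_limit: bool = False
    categories: set = field(default_factory=set)

    def accept(self, key: str) -> bool:
        """Return True if the event is passed downstream for analysis."""
        if key in self.categories:
            return True
        if self.hard_limit:
            # In hard limit no new category can be created. With
            # stop_on_warn the event is skipped; otherwise this toy
            # model assumes it is still passed downstream.
            return not self.stop_on_warn
        self.categories.add(key)
        return True


def bucket_event_count(categorizer: ToyCategorizer, events: list) -> int:
    """Count only the events that reach the downstream analysis."""
    return sum(1 for key in events if categorizer.accept(key))


# Two categories already exist when the job hits its hard limit.
categorizer = ToyCategorizer(
    stop_on_warn=True, hard_limit=True, categories={"login", "logout"}
)
events = ["login", "logout", "timeout", "login", "disk_full"]

submitted = len(events)
counted = bucket_event_count(categorizer, events)
print(submitted, counted)  # 5 3
```

In this toy run five documents are submitted but only three are counted in the bucket, which is the kind of gap a check comparing the two counts would report as missing data.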
Whenever a job hits its hard limit, that should be addressed first. The mismatch in counts is a symptom of the hard limit and is resolved by giving the job more memory. However, the mismatched counts are confusing, and users may not realise that new categories cannot be created.