Open adriansr opened 2 years ago
The ingest processor's failed metric is incremented for processors that have ignore_failure set.
This can lead to misleading reports / alerts:
Actual processor stats:
{
  "metric_type": "processor",
  "pipeline": "filebeat-7.17.3-sophos-xg-firewall",
  "count": 13597705,
  "time": "16.1m",
  "time_in_millis": 967520,
  "current": 0,
  "failed": 6887511,
  "calculated": {
    "execution_time_avg_ns": 71153.18357031573,
    "failed_pct": 0.5065201076211022
  },
  "processor_index": 76,
  "processor_type": "lowercase",
  "processor_name": "77_lowercase",
  "stat_name": "lowercase",
  "conditional": false,
  "definition": "{\"field\":\"network.protocol\",\"ignore_failure\":true}", # <- here
}
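For scale, the values under "calculated" appear to follow directly from the raw counters. The snippet below is only a sketch of that arithmetic; the formulas are an assumption inferred from the numbers, not taken from the diagnostics tool's source.

```python
# Raw counters copied from the processor stats above.
count = 13597705
failed = 6887511
time_in_millis = 967520

# Assumed derivations (inferred from the reported values, not confirmed):
# share of executions counted as failed, and mean execution time in ns.
failed_pct = failed / count
execution_time_avg_ns = time_in_millis * 1_000_000 / count

print(f"failed_pct={failed_pct:.4f}")
print(f"execution_time_avg_ns={execution_time_avg_ns:.2f}")
```

Under this reading, roughly half of all executions of this processor are counted as failures even though every one of them is ignored by design.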
Alert from diagnostics tool:
pipeline_stats:Ingest pipeline is reporting over 1000 failures.
I'd like to suggest not incrementing this metric for processors where the failure is either ignored or handled by the processor's own on_failure.
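To make the suggestion concrete, here is a minimal sketch of the proposed counting rule. It is a toy model, not Elasticsearch's implementation: `Processor`, `ProcessorStats`, and `lowercase_protocol` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

Doc = dict
Step = Callable[[Doc], None]


@dataclass
class ProcessorStats:
    count: int = 0
    failed: int = 0


@dataclass
class Processor:
    """Hypothetical stand-in for an ingest processor wrapper."""
    run: Step
    ignore_failure: bool = False
    on_failure: Optional[List[Step]] = None
    stats: ProcessorStats = field(default_factory=ProcessorStats)

    def execute(self, doc: Doc) -> None:
        self.stats.count += 1
        try:
            self.run(doc)
        except Exception:
            if self.ignore_failure:
                return  # failure is ignored: do not count it
            if self.on_failure:
                try:
                    for handler in self.on_failure:
                        handler(doc)
                except Exception:
                    # The handler itself failed: this is a real failure.
                    self.stats.failed += 1
                    raise
                return  # failure was handled: do not count it
            self.stats.failed += 1  # unhandled failure: count and propagate
            raise


def lowercase_protocol(doc: Doc) -> None:
    # Raises KeyError when the field is missing, mimicking a failing processor.
    doc["network.protocol"] = doc["network.protocol"].lower()


ignored = Processor(lowercase_protocol, ignore_failure=True)
ignored.execute({})  # field missing, failure ignored

strict = Processor(lowercase_protocol)
try:
    strict.execute({})  # field missing, failure propagates
except KeyError:
    pass
```

With this rule, `ignored.stats.failed` stays at 0 while `strict.stats.failed` becomes 1, so an alert keyed on the failed counter would fire only for failures that nothing handled.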
Pinging @elastic/es-data-management (Team:Data Management)