Open miltonhultgren opened 1 year ago
Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)
If Elastic were to formulate a policy around this my suggestion would be, "Elastic Observability does not under any circumstances support dropping the event.dataset
field from events."
I added this to an internal epic where we're researching how best to respond to missing data, since we should do something other than displaying a blank screen.
The flexibility of Elasticsearch makes using it a hostile environment (by that I mean we can't rely on any field really being there, even if it's in the mapping, etc.) and we have to have defensive code that responds gracefully to unforeseen circumstances. A blank screen is not acceptable but "some of these documents have no event.dataset and this UI is not going to work" is fine.
Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Based on a recent support escalation, we found a situation where a customer had been dropping a key field (
event.dataset
) which the Metrics UI uses to decide which metrics to show in the Node Details page, causing the UI to simply break but without any errors.The flow looks like this:
/metadata
endpoint for which features are "active" (for what are we collecting metrics data)event.dataset
to inform us of which metricsets are being collected and thus which metrics we should have availableIn the case where
event.dataset
is absent, no features are reported (though in this case all the metric fields still existed)./metadata
returns a200 OK
with an empty features list, which is translated to a request for no metrics, which also returns a200 OK
and the UI is simply empty.I can see two ways to solve this, re-work the feature check to search for individual metric fields instead of the metricset. This is probably a lot of work, and might break some of the domain logic the Metrics UI tries to encapsulate. The other way is to treat no features as an error (assuming there is data), which should be a much smaller effort.
AC
/metadata
throw an error if no features are found when there is metrics data present