elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Disabling telemetry should disable collectors #80706

Open tylersmalley opened 3 years ago

tylersmalley commented 3 years ago

This was originally brought to my attention on Discuss but additional conversations have been had on Slack since.

Per @Bamieh, Beats is currently using the collectors, so they cannot be disabled. APM uses a separate task with its own config, which makes it possible to disable.

I feel it's important to disable this background work when the data is not being used, as it consumes resources from the Kibana event loop, Elasticsearch, and saved objects.

elasticmachine commented 3 years ago

Pinging @elastic/kibana-telemetry (Team:KibanaTelemetry)

Bamieh commented 3 years ago

Telemetry usage collectors are only responsible for "collecting" the data and sending it to the consumer.

Collectors are only called when needed (telemetry, beats, example flyout, etc). The issue here is with the tasks registered to store usage data for the telemetry usage collectors to use when called.

We can address these by:

  1. Manually going over every usage task and only enabling them if telemetry is enabled.
  2. Telemetry team provides tools for teams to store usage rather than just methods to send them.
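
A minimal sketch of what option 1 could look like, assuming each usage task is registered through some central registry. `UsageTask`, `UsageTaskRegistry`, and `isTelemetryEnabled` are hypothetical names used for illustration only; they are not Kibana APIs.

```typescript
// Hypothetical types for illustration; none of these names exist in Kibana.
interface UsageTask {
  id: string;
  run: () => void;
}

class UsageTaskRegistry {
  private readonly tasks = new Map<string, UsageTask>();
  private readonly isTelemetryEnabled: () => boolean;

  constructor(isTelemetryEnabled: () => boolean) {
    this.isTelemetryEnabled = isTelemetryEnabled;
  }

  // Option 1: refuse to register a usage task while telemetry is disabled,
  // so no background work is scheduled for it.
  register(task: UsageTask): boolean {
    if (!this.isTelemetryEnabled()) {
      return false;
    }
    this.tasks.set(task.id, task);
    return true;
  }

  // Runs every registered task and returns how many were run.
  runAll(): number {
    this.tasks.forEach((task) => task.run());
    return this.tasks.size;
  }
}

// Usage: with telemetry disabled, nothing is registered and nothing runs.
const registry = new UsageTaskRegistry(() => false);
registry.register({ id: 'example_usage_task', run: () => {} });
console.log(registry.runAll());
```

The check lives in the registry rather than in each task, which is closer to option 2 in spirit: teams would not need to repeat the telemetry check themselves.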

The telemetry team also keeps the UI_metrics and application_usage working even if telemetry is not enabled. We can modify this logic as well to reduce footprint when telemetry is disabled.

TinaHeiligers commented 3 years ago

We actually discussed disabling data collection a while back and it looks like it's time to dig into it again.

With Telemetry Next on the horizon, the possibility of sending data more frequently and dynamic mapping, we have more options than we did before. @Bamieh we should dust off the whole 'channel' idea we had earlier this year when Pulse was still on the cards and see as a team what we could do with some of those ideas again.

In the meantime, I'm sure teams wouldn't mind if the Telemetry team provided tools to store usage data too 😉

Bamieh commented 3 years ago

Might be related: https://github.com/elastic/kibana/issues/77214

We can use this to monitor usage of telemetry for each plugin

alexfrancoeur commented 3 years ago

There is a very early initiative as part of the getting started project around suggestions. I think the collection of telemetry (or at least some of it) could play a role here. As we fine tune requirements, I'd like to propose we pause prioritization of this effort. cc: @alexh97 @joshdover @TinaHeiligers

joshdover commented 3 years ago

@afharo are there any significant performance or resource problems we should address if we're not going to work on this in the near-term?

afharo commented 3 years ago

I don't think there are. If telemetry collection is not triggered, the only used resources that could be released are:

Overall, I'd say the resource impact is acceptable. The main reasons for opening this issue were to reduce noise, to stop users being annoyed that telemetry-related messages still appeared in the logs despite telemetry being disabled, and to avoid confusion in support cases.

joshdover commented 3 years ago

> Saved Objects: All the saved objects used to store the results from the 2 items above so the collection fetchers can retrieve them when requested.

I know in the past we've had issues with large numbers of SOs being used for this data. Is this still a problem in newer releases?

afharo commented 3 years ago

I don't think it is at the moment. The culprit was Application Usage, and https://github.com/elastic/kibana/pull/77610 introduced a 30-minute rollup process that aggregates the transactional entries into daily entries. The daily entries are in turn rolled up into historic total documents after 90 days. This makes the worst-case total number of documents equal to NumberOfApps * ( 1_totalDoc + 90_dailyDocs + (10_transactionalDocs * numberOfConcurrentUsers) ), assuming all the apps are used by all the users every 3 minutes (the transactional reporting period).

Following the recent implementation of UI Counters, we can revisit the logic above to follow a similar approach: UI Counters stores and increments daily-aggregated counters, and keeps them for only 3 days. The 7-day, 30-day, 90-day, and total aggregations will happen later on in the Remote Telemetry Service. This would eventually reduce the total number of documents to NumberOfApps * 3_dailyDocs. I would delay these changes, though, until the Remote Telemetry Service devs can help with the transition in the way we report.
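
To make the two estimates above concrete, here is a small worked example. The constants come from the formulas in this comment; the function names and the sample values of 50 apps and 100 concurrent users are assumptions chosen only for illustration.

```typescript
// Constants taken from the formulas above.
const TOTAL_DOCS_PER_APP = 1;
const DAILY_DOCS_PER_APP = 90;
// 30-minute rollup / 3-minute reporting period = 10 transactional docs per user.
const TRANSACTIONAL_DOCS_PER_USER = 10;
const UI_COUNTERS_DAYS_KEPT = 3;

// Worst case for Application Usage today (hypothetical helper name).
function applicationUsageWorstCaseDocs(
  numberOfApps: number,
  numberOfConcurrentUsers: number
): number {
  return (
    numberOfApps *
    (TOTAL_DOCS_PER_APP +
      DAILY_DOCS_PER_APP +
      TRANSACTIONAL_DOCS_PER_USER * numberOfConcurrentUsers)
  );
}

// Estimate if Application Usage followed the UI Counters approach (hypothetical helper name).
function uiCountersStyleDocs(numberOfApps: number): number {
  return numberOfApps * UI_COUNTERS_DAYS_KEPT;
}

// Illustrative inputs only: 50 apps, 100 concurrent users.
const currentWorstCase = applicationUsageWorstCaseDocs(50, 100); // 50 * (1 + 90 + 1000) = 54550
const proposed = uiCountersStyleDocs(50); // 50 * 3 = 150
console.log({ currentWorstCase, proposed });
```

Under these sample inputs the transactional term dominates, which is why the number of concurrent users matters far more than the 90 daily documents.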

elasticmachine commented 3 years ago

Pinging @elastic/kibana-core (Team:Core)