Open tylersmalley opened 3 years ago
Pinging @elastic/kibana-telemetry (Team:KibanaTelemetry)
Telemetry usage collectors are only responsible for "collecting" the data and handing it to the consumer.
Collectors are only called when needed (telemetry, beats, example flyout, etc). The issue here is with the tasks registered to store usage data for the telemetry usage collectors to use when called.
We can address these by:
The telemetry team also keeps the UI_metrics and application_usage working even if telemetry is not enabled. We can modify this logic as well to reduce footprint when telemetry is disabled.
We actually discussed disabling data collection a while back and it looks like it's time to dig into it again.
With Telemetry Next on the horizon, the possibility of sending data more frequently and dynamic mapping, we have more options than we did before. @Bamieh we should dust off the whole 'channel' idea we had earlier this year when Pulse was still on the cards and see as a team what we could do with some of those ideas again.
In the meantime, I'm sure teams wouldn't mind if the Telemetry team provided tools to store usage data too 😉
Might be related: https://github.com/elastic/kibana/issues/77214
We can use this to monitor usage of telemetry for each plugin
There is a very early initiative as part of the getting started project around suggestions. I think the collection of telemetry (or at least some of it) could play a role here. As we fine tune requirements, I'd like to propose we pause prioritization of this effort. cc: @alexh97 @joshdover @TinaHeiligers
@afharo are there any significant performance or resource problems we should address if we're not going to work on this in the near-term?
I don't think there are. If telemetry collection is not triggered, the only resources in use that could be released are:
/api/lens/stats
/api/ui_metric/report
Saved Objects: All the saved objects used to store the results from the 2 items above so the collection fetchers can retrieve them when requested.
Overall, I'd say the resource impact is acceptable. The main reason for opening this issue was to reduce noise, to stop users being annoyed that telemetry-related wording still appears in the logs after they have disabled telemetry, and to avoid confusion in support cases.
I know in the past we've had issues with large numbers of SOs being used for this data. Is this still a problem in newer releases?
I don't think there is at the moment. The culprit was Application Usage, and https://github.com/elastic/kibana/pull/77610 introduced a 30-minute rollup process that aggregates the transactional entries into daily entries. The daily entries are in turn rolled up after 90 days into historic total documents. This makes the worst-case total number of documents equal to NumberOfApps * ( 1_totalDoc + 90_dailyDocs + (10_transactionalDocs * numberOfConcurrentUsers) ), assuming all the apps are used by all the users every 3 minutes (the transactional reporting period).
Following the recent implementation of UI Counters, we can revisit the logic above to follow a similar approach: UI Counters stores and increments daily-aggregated counters, and keeps them for only 3 days. The 7/30/90-day and total aggregations will happen later on in the Remote Telemetry Service. This would eventually reduce the total number of documents to NumberOfApps * 3_dailyDocs. I would delay these changes, though, until the Remote Telemetry Service devs can help with the transition in the way we report.
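To get a feel for the difference between the two storage models, here is a rough sketch of the arithmetic. The function names and the example deployment size (50 apps, 20 concurrent users) are made up for illustration; the constants come from the formulas in this comment, and the "10 transactional docs" figure presumably reflects one document per 3-minute report within a 30-minute rollup window.

```typescript
// Worst-case Application Usage saved objects under the current rollup model:
// NumberOfApps * (1 total doc + 90 daily docs + 10 transactional docs per concurrent user).
function worstCaseAppUsageDocs(numberOfApps: number, concurrentUsers: number): number {
  const totalDocs = 1; // historic "total" document per app
  const dailyDocs = 90; // daily documents kept for 90 days
  const transactionalDocsPerUser = 10; // assumed: 30-minute rollup / 3-minute reporting period
  return numberOfApps * (totalDocs + dailyDocs + transactionalDocsPerUser * concurrentUsers);
}

// Document count if Application Usage followed the UI Counters approach:
// only 3 daily documents are kept per app.
function uiCountersStyleDocs(numberOfApps: number): number {
  const dailyDocsKept = 3;
  return numberOfApps * dailyDocsKept;
}

// Hypothetical deployment: 50 apps, 20 concurrent users.
console.log(worstCaseAppUsageDocs(50, 20)); // 50 * (1 + 90 + 200) = 14550
console.log(uiCountersStyleDocs(50)); // 50 * 3 = 150
```

Note that the second count no longer depends on the number of concurrent users, because counters are incremented in place rather than stored per report.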
Pinging @elastic/kibana-core (Team:Core)
This was originally brought to my attention on Discuss but additional conversations have been had on Slack since.
Per @Bamieh, Beats is currently using the collectors, so they cannot be disabled. APM uses a separate task with its own config, which provides the ability to disable it.
I feel it's important to disable this background work if the data is not being used, as it consumes resources on the Kibana event loop, in Elasticsearch, and in saved objects.
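As a rough illustration of the idea, the sketch below starts background usage tasks only while some consumer actually needs their data, and stops them otherwise. Everything here is hypothetical: `UsageTaskRegistry`, `DataConsumers` and the task shape are invented for this example and are not Kibana's task or usage collection API.

```typescript
// Hypothetical sketch; none of these names exist in Kibana.
interface UsageTask {
  id: string;
  intervalMs: number;
  run: () => void;
}

// Who might consume the stored usage data (assumed flags).
interface DataConsumers {
  telemetryOptedIn: boolean;
  monitoringEnabled: boolean; // e.g. collectors still read by Beats/monitoring
}

class UsageTaskRegistry {
  private readonly tasks: UsageTask[] = [];
  private timers: { [id: string]: ReturnType<typeof setInterval> } = {};

  register(task: UsageTask): void {
    this.tasks.push(task);
  }

  // Start every task if any consumer needs the data; otherwise stop them all.
  sync(consumers: DataConsumers): void {
    const needed = consumers.telemetryOptedIn || consumers.monitoringEnabled;
    for (const task of this.tasks) {
      const timer = this.timers[task.id];
      if (needed && timer === undefined) {
        this.timers[task.id] = setInterval(task.run, task.intervalMs);
      } else if (!needed && timer !== undefined) {
        clearInterval(timer);
        delete this.timers[task.id];
      }
    }
  }

  runningTaskIds(): string[] {
    return Object.keys(this.timers);
  }
}
```

Calling `sync` again whenever the opt-in setting changes would let the tasks follow the user's choice without a restart. A real implementation would also have to decide what happens to saved objects that were already stored.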