Open iansltx opened 1 month ago
@iansltx Thanks for filing this! Since this involves changes in the statistics we report, I want to run it through the drafting board before prioritizing to the release board.
@noahtalerman All of the information Ian is proposing will be readily available, so this should be a small effort.
num scripts run due to a policy failure, num installs initiated due to a policy failure
@iansltx how could collecting the number of runs help us?
I'm assuming this is an absolute count over the entire history of a Fleet instance.
User story |
---|
As a Fleet engineer, |
I want visibility into how often policy automations are used for installs and script runs |
so that I can optimize performance and UX to support customer use cases. |
This smallish task will help ensure we build policy automations in a way that gives customers using Fleet Premium the best experience with those features.
For telemetry stats, include counts for:
Once #22424 is implemented, all of the above should be trivially queryable from existing database tables.
ℹ️ Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".
@noahtalerman Yep, absolute count.
This would tell us whether, for example, customers are using policy automations for patch management or other activities where failures are expected as part of bringing a fleet of machines into spec. That would have implications for prioritizing #22920 or other UX improvements where policy failures are routine/expected rather than exceptional.
We could implement policy failure count telemetry on a rolling "since X days ago" basis as well and get similar information. Just a matter of matching how the telemetry is aggregated/displayed, and I'm thinking that running totals would be easier to manage from a metrics display perspective (but a bit heavier on the DB for collection of the data).
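To make the difference between the two aggregation options concrete, here is a rough sketch in Python rather than Fleet's own code. The event tuples, the `"script"`/`"install"` labels, and the function names `absolute_counts` and `rolling_counts` are all made up for illustration; none of this reflects the actual schema or telemetry code.

```python
from datetime import datetime, timedelta

# Hypothetical event records: (timestamp, kind), where kind is "script" or
# "install". These names are illustrative only, not Fleet's real schema.

def absolute_counts(events):
    """Running totals over the entire history of the instance."""
    counts = {"script": 0, "install": 0}
    for _, kind in events:
        counts[kind] += 1
    return counts

def rolling_counts(events, now, days):
    """Counts restricted to events in the last `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    counts = {"script": 0, "install": 0}
    for ts, kind in events:
        if cutoff <= ts <= now:
            counts[kind] += 1
    return counts

# Sample data, invented for the example.
now = datetime(2024, 11, 1)
events = [
    (datetime(2024, 8, 15), "script"),
    (datetime(2024, 10, 20), "install"),
    (datetime(2024, 10, 30), "script"),
]

print(absolute_counts(events))          # totals over all three events
print(rolling_counts(events, now, 30))  # only the two October events
```

The absolute version has to scan (or maintain a counter over) the full history, which is the "heavier on the DB" part, while the rolling version only touches recent rows but reports a number that shrinks as old events age out.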