ManageIQ / manageiq

ManageIQ Open-Source Management Platform
https://manageiq.org
Apache License 2.0
Metrics Capture Region Configuration is Confusing #21511

Open agrare opened 2 years ago

agrare commented 2 years ago

By default, all C&U collection is disabled until the user goes to the Configuration / Settings / ManageIQ Region "Region 0 [0]" page, opens the C&U Collection tab, and flips Collect for All Clusters from the default No to Yes.

While this is in the documentation, it is extremely unintuitive and results in a great many "why don't I see metrics" bugs being raised.

The saner default would seem to be to capture everything and let users fine-tune if they know what they're doing, rather than requiring that they know what they're doing before they can capture anything.

Fryguy commented 2 years ago

:100: this has always bugged me.

agrare commented 2 years ago

This is handled by tagging the region with "capture_enabled" (https://github.com/ManageIQ/manageiq/blob/master/app/models/miq_region.rb#L268), so it's not as simple as just switching a default from false to true...
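To illustrate why a tag-based flag is opt-in by nature, here is a minimal sketch in plain Ruby. It is not the code in `miq_region.rb`; `SketchRegion` and its methods are hypothetical stand-ins, and the only thing taken from the thread is the "capture_enabled" tag name.

```ruby
# Hypothetical, simplified model of a tag-driven flag.
# This is NOT the actual MiqRegion implementation.
class SketchRegion
  attr_reader :tags

  def initialize
    @tags = [] # a fresh region carries no tags
  end

  # Opt-in: capture is on only when the tag is present,
  # so "absent" and "off" are indistinguishable.
  def capture_enabled?
    tags.include?("capture_enabled")
  end
end

region = SketchRegion.new
region.capture_enabled?           # false on a fresh region
region.tags << "capture_enabled"
region.capture_enabled?           # true only after tagging
```

Because the absence of the tag is what means "off", there is no stored default value to flip; changing the default would mean changing what absence means.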

Fryguy commented 2 years ago

Wonder if we can drop that tag entirely or perhaps invert it?

agrare commented 2 years ago

Yeah, I'm thinking inverting it would also help reduce the number of tags on the clusters/hosts (assuming the default is capture_enabled).

(NOTE: this method is also involved: https://github.com/ManageIQ/manageiq/blob/master/app/models/metric/ci_mixin/targets.rb#L2-L16)
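A rough sketch of what inverting the tag could look like, in plain Ruby. Everything here is an assumption for illustration: the `capture_disabled` tag name, the `SketchTarget` class, and the target names are hypothetical and do not come from the ManageIQ codebase.

```ruby
require "set"

# Hypothetical model of an inverted (opt-out) tag.
# The "capture_disabled" tag name is an assumption, not an existing ManageIQ tag.
class SketchTarget
  attr_reader :name, :tags

  def initialize(name, tags = [])
    @name = name
    @tags = Set.new(tags)
  end

  # Opt-out: capture is on unless the target is explicitly excluded.
  def capture_enabled?
    !tags.include?("capture_disabled")
  end
end

targets = [
  SketchTarget.new("cluster-a"),
  SketchTarget.new("cluster-b", ["capture_disabled"]),
  SketchTarget.new("host-1"),
]

# Untagged targets are captured by default.
capture_names = targets.select(&:capture_enabled?).map(&:name)

# Only the excluded target needs a tag at all.
tag_count = targets.sum { |t| t.tags.size }
```

With this shape, only the exceptions carry a tag, which is the reduction in tags on clusters/hosts mentioned above.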

chessbyte commented 2 years ago

This stems from the early days, when ManageIQ would get indigestion processing too many metrics. I don't believe that is the case with today's code, so 👍 on defaulting to capturing all the metrics when metric capture is enabled.

kbrock commented 2 years ago

It would be nice to change our EMS fetching queries so they don't need so many includes() when their sole purpose is to answer whether a VM or EMS is tagged to collect metrics.

Or it would be nice to move away from our current model and just fetch all metrics in a single go.

Fryguy commented 2 years ago

Adding this to the Roadmap... I think we should do this for the next release.

miq-bot commented 1 year ago

This issue has been automatically marked as stale because it has not been updated for at least 3 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.