cilium / design-cfps

Repo to store Cilium CFP design docs
Apache License 2.0
21 stars 19 forks source link

Add CFP-30788 Managing Hubble Metrics cardinality #29

Closed vakalapa closed 2 months ago

vakalapa commented 3 months ago

This proposal aims at adding a new dynamic mechanism to manage Hubble metric cardinality.

chancez commented 3 months ago

Overall idea makes sense to me. We've got a few people on the hubble team who are on holiday this week; so hopefully next week they can take a look and we can move forward.

chancez commented 2 months ago

@kaworu: @chancez While I agree conceptually, disabled: false is difficult to read and would be inconsistent with the Hubble flow exporter.

Fair enough, so then should we just remove the option from the dynamic metrics config entirely? Then to disable a given metric option it just needs to not be configure in the list all? Eg to have the feature enabled, but no metrics configured:

       hubbleMetrics:
            metrics: [] # no metrics enabled currently. 
vakalapa commented 2 months ago

@kaworu: @chancez While I agree conceptually, disabled: false is difficult to read and would be inconsistent with the Hubble flow exporter.

Fair enough, so then should we just remove the option from the dynamic metrics config entirely? Then to disable a given metric option it just needs to not be configure in the list all? Eg to have the feature enabled, but no metrics configured:

       hubbleMetrics:
            metrics: [] # no metrics enabled currently. 

That sounds reasonable, check for the presence of this new configmap and/or the lenght of metrics to determine if the feature is enabled. Removing the enabled/disabled section from CFP.

vakalapa commented 2 months ago

@kaworu Thank you for the review,

  1. Did you consider a CRD instead of a ConfigMap to configure metrics? The question was discussed for the Hubble exporter, and I feel like an explanation of why the CFP propose one or the other (or something else) would make sense in the "Implementation Alternatives" section.

Ack, i want building on the previous precedence of config maps, but open to discussion on this. I believe configmaps are better solution compared to CRDs as lined out in @chancez 's comment. Added my views to alternatives section.

  1. It's unclear to me whether the CFP plan to deprecate/remove the cilium's config hubble-metrics. IMO we don't want to maintain the two long-term, and thus we're missing a migration path (or at least suggestion) from the current hubble-metrics mechanism to the proposed hubble-metrics-config mechanism in the "Impacts / Key Questions". Are we going to ship / document hubble-metrics-config for the existing dashboards?

That is a great point, we should deprecate the older path, but i am not familiar about the deprecation policy/process of cilium? If there are any pointers on how this is normally handled, i am happy to add it to "Future Milestones". My assumption is that it will take atleast one or two minor version upgrades for existing behavior get deprecated. I.e. if this support lands in 1.16 cilium, by 1.18 existing configuration via cilium-config will be deprecated?

The second question goes back to the first where if the new system is CRD driven we could have a CRD per dashboard in addition to user defined metrics CRD, although maybe we would generate too many metrics and be back to square one, hence the request to clarify in the CFP.

As Chance previously discussed, this kind of functionality can be easily build on top of the configmap approach where a operator model can be created on top of this configmap to handle such advanced scenarios ?

vakalapa commented 2 months ago

This is a fair assumption, I think a couple of version should be enough to validate the configMap approach and implementation, and at that point we could start on deprecating the cilium-config metrics configuration.

thank for the review, I added the same at the bottom of the proposal stating, we will re-visit deprecation in 2 versions' time.

xmulligan commented 2 months ago

Last thing before merging, can you create and put this in a /hubble folder rather than the /cilium folder?

chancez commented 2 months ago

Good idea. @xmulligan @vakalapa once it's moved I'll merge the CFP.

vakalapa commented 2 months ago

Done, moved CFP to hubble folder. @chancez

joestringer commented 2 months ago

Heads up that I moved this under cilium/ to match the documented process: https://github.com/cilium/design-cfps/pull/35 .

joestringer commented 2 months ago

Ah @xmulligan I didn't see your request there. If we're going to split PRs up like that, then it'll make sense to update the README.md to describe exactly how we expect developers to make those decisions so we can consistently apply the rule :)

xmulligan commented 2 months ago

ahh you are right sorry about that. I guess I expected the issue to be in the Hubble repo. Thanks for changing