nutanix-cloud-native / cluster-api-runtime-extensions-nutanix

https://nutanix-cloud-native.github.io/cluster-api-runtime-extensions-nutanix/
Apache License 2.0
7 stars 4 forks source link

build(deps): Upgrade CCM Version to v0.4.0 #836

Closed thunderboltsid closed 1 month ago

thunderboltsid commented 1 month ago

https://github.com/nutanix-cloud-native/cloud-provider-nutanix/releases/tag/v0.4.0

dkoshkin commented 1 month ago

How confident are we with this version? I'm a bit concerned about possible regressions.

thunderboltsid commented 1 month ago

How confident are we with this version? I'm a bit concerned about possible regressions.

I'm more confident with this version than without it. This contains changes for the long lived client cache w/ session auth. Without this we might end up in the same issue we did with CAPX DOS-ing PC IAM APIs.

thunderboltsid commented 1 month ago

I am a little skeptical to introduce this at this late stage. We should not change at this time if its not broken till we can observe it. Given CCM only runs on single cluster, and till we have prometheus counters to prove the count of APIs are high, I agree that we should atleast fix the part of using token instead of basic auth in general. this fix also adds cache changes which technically will have only one cluster in it.

Also we need to identify the tests we need to do in order for ccm to get tested after every upgrade thru caren.

I agree that it's coming in later than it should have. With that said, I'd like to make a case for why it is needed. The current helm chart version (v0.3.4) points to a released image that is over 1y old. The main functional change this bump brings about is introducing session auth changes to make sure we don't DOS PC with Basic auth requests. While we can be pedantic about correlating with instrumented metrics, in their absence, the best effort testing we did a couple of months with session auth changes on CAPX still showed meaningful difference in PC request metrics. Now, as for testing: