Open dmeytin opened 3 years ago
WDYT @zroubalik @jeffhollan @ahmelsayed @anirudhgarg ?
controller-runtime support for multi-cluster operators is still in progress. The latest versions have some basics, but it is probably not ready for full use yet. I would very much avoid spending time on integrating with kubefed tooling, given that its level of community uptake has remained extremely low, and last I heard the SIG was focusing on a kubefed2 rebuild.
That is correct, but the link below is to the kubefed2 project. Still, I believe it could be a great showcase for expansion across multiple clusters. Queue workers are a significant part of modern workloads, but they are left without appropriate treatment in many of a cluster's day-two operations.
The overall direction of controller-runtime is for multi-cluster operators to handle things directly (i.e. talk to the API of every cluster) rather than use a secondary federation backend. Not required, of course, but we're explicitly trying to support that use case.
Sounds great. Can you please share the link to the PR so that I can follow it?
There's a lot of little pieces. https://github.com/kubernetes-sigs/controller-runtime/pull/1075 has already happened (there's now a Cluster struct distinct from Manager) and https://github.com/kubernetes-sigs/controller-runtime/pull/1192/files will allow multi-instantiation which is part of the same overarching use case in the end.
Awesome! I would recommend opening a parent feature that aggregates all the pieces, for simplicity of tracking. I'm willing to be a beta tester for this feature. In general, the solution will not be complete without a multi-cluster ingress controller and a DNS service. For a full reference implementation we should add these components from existing 3rd-party projects. WDYT?
Maybe? KEDA usually scales some kind of task worker or consumer system. These are not usually the same components as the web services; instead they communicate through some kind of broker, which KEDA monitors, creating or removing consumers as needed :) The web tier, 100%, would need the tools you describe if you want a federated approach. But the kinds of consumer pods/jobs/etc. that KEDA manages are usually separate from that (maybe needing some Service integration for Prometheus metrics discovery, but more often you would do that cluster-local and federate at the Prometheus level instead, since it already has powerful tooling for that).
There's work being done in the keda-http addon to look at request-based scaling for web services; that would also need this kind of thing, but it's still in the early phases, so let's get it working in a simpler setup first :D
That is absolutely correct. I agree: it's better to keep components loosely coupled and avoid unnecessary integrations. As an infra provider I will need to glue all the components together into a holistic solution, and it would be great to ensure that the integration goes smoothly.
Do we have any news on this issue?
Any thoughts on this @zroubalik?
I am happy to see any POC :)
I have a list of use cases for multi-cluster support:
All the use cases above could be satisfied by having the required distribution of the workload across clusters, plus the ability to fill the gap when other clusters cannot complete the request before the timeout.
Does it make sense?
Yeah, that does make sense.
What I'd love to see is an actual proposal for how we want to achieve this from a technical point of view.
We need to add a few operations:

1. Join cluster: configures the KEDA operator of the other clusters within the squad with the certificate of the current cluster.

2. Extra configuration on the scaling object, for example:

   ```
   capacity:
     cluster-eus: 20
     cluster-eus: 20
     cluster-wus: 60
   ```

   - Configuration done on any cluster is immediately synced with the squad (if the configurations differ, the last write wins).
   - The actual replica-size values are synced with the squad as well.
   - When one of the clusters has a discrepancy between its requested capacity and its actual size, the cluster's status is set to frozen and its required capacity is not increased.
   - When one of the clusters fails to communicate with the squad, it is set to unhealthy and its capacity is decreased to zero.
   - A frozen cluster is periodically tested to see whether it can scale again.
   - For cross-cluster data synchronization we can use HashiCorp Consul.

3. Leave cluster: notifies the squad that the cluster is leaving.
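The state handling in this proposal could be modeled roughly as follows. This is a hypothetical sketch: `Cluster`, `reconcile`, the cluster names, and the status values are all illustrative, not KEDA API.

```python
# Hypothetical model of the proposed squad bookkeeping: each cluster
# declares a capacity; a cluster whose actual size lags its requested
# capacity is marked frozen (no further growth), and an unreachable
# cluster is marked unhealthy and drained to zero capacity.
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    capacity: int        # requested share of the workload
    actual: int = 0      # replicas actually running
    reachable: bool = True
    status: str = "healthy"

def reconcile(clusters: list[Cluster]) -> None:
    for c in clusters:
        if not c.reachable:
            # Lost contact with the squad: drain it entirely.
            c.status, c.capacity = "unhealthy", 0
        elif c.actual < c.capacity:
            # Cannot reach the requested size: stop growing it.
            c.status = "frozen"
        else:
            c.status = "healthy"

squad = [
    Cluster("cluster-eus", capacity=20, actual=20),
    Cluster("cluster-wus", capacity=60, actual=45),        # lagging behind
    Cluster("cluster-neu", capacity=20, reachable=False),  # unreachable
]
reconcile(squad)
print([(c.name, c.status, c.capacity) for c in squad])
# [('cluster-eus', 'healthy', 20), ('cluster-wus', 'frozen', 60),
#  ('cluster-neu', 'unhealthy', 0)]
```

A real implementation would run this reconciliation periodically, which also covers the "frozen cluster is periodically retested" step: once `actual` catches up to `capacity`, the next pass marks it healthy again.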
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
@tomkerkhove, what will happen if KEDA is run jointly with Admiralty? Maybe that would achieve the requested functionality?
We run in a multi-cluster scenario (for the ability to destroy/recreate individual clusters without impact) and I don't know that this can be addressed outside of a full scheduling manager.
One option: if KEDA knew how many clusters were participating, it could scale the reported metrics by that proportion (e.g. with 2 clusters, report metrics at 50% of their value). This would put roughly half the instances on each cluster. KEDA would need to constantly check whether all participating clusters are "working", to know if one of them is down or unable to provision more instances. If one is down, update the cluster count to rescale the metrics, and load will be distributed to the other clusters. "Unable to scale instances" is harder to detect. A workload at 2/10 is below the threshold, but it's not clear why. Is the cluster hitting some quota that will never let that workload scale? Is it waiting for a cluster auto-scale event to get more worker nodes? Is it pulling an image, or just waiting for initial scheduling? Or are the new instances failing to start up? There doesn't seem to be a clear way to determine this in general.
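The metric-rescaling idea can be sketched as follows. This is a hypothetical model of how such a feature might behave, not existing KEDA behavior; `scaled_metric` and `replicas_for` are illustrative names.

```python
# Sketch of the first option: each KEDA instance reports only its
# cluster's share of the global metric, so the HPA on each cluster
# provisions roughly 1/N of the total replicas.
import math

def scaled_metric(raw_value: float, healthy_clusters: int) -> float:
    """Report only this cluster's proportional share of the metric."""
    if healthy_clusters < 1:
        raise ValueError("at least one cluster must be healthy")
    return raw_value / healthy_clusters

def replicas_for(metric: float, target_per_replica: float) -> int:
    """Mimic the HPA's ceil(metric / target) replica calculation."""
    return math.ceil(metric / target_per_replica)

queue_length = 1000
# With 2 healthy clusters, each reports 50% of the queue length:
per_cluster = scaled_metric(queue_length, healthy_clusters=2)   # 500.0
print(replicas_for(per_cluster, target_per_replica=100))        # 5 per cluster

# If one cluster is detected as down, the survivor rescales to 100%:
print(replicas_for(scaled_metric(queue_length, 1), 100))        # 10
```

The hard part, as noted above, is not this arithmetic but reliably deciding when `healthy_clusters` should change.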
The other option is to set your Trigger values based on expected cluster count and desired resilience. If you expect to have 4 clusters normally and want to be able to tolerate losing one cluster, then scale your Trigger config to 3x of what you want in total.
Example (normally 4 clusters, tolerate losing one):
- Desired global trigger value: 600
- Individual cluster trigger value: 600 * (4 - 1) = 600 * 3 = 1800
which results in
- Example raw metric value (from a source like queue length): 1800
- Perfectly efficient global instance count (same as if running on one cluster): 3
- Actual global instance count (when 4 clusters are running): 4 (one per cluster)
This will result in having an "extra" 33% of instances running when there are 4 clusters but:
Running multi-cluster implies a level of over-provisioning as the tradeoff for decreased risk. If your capacity is "efficiently" packed so you aren't over-provisioned, then you don't get the risk mitigation of multiple clusters (with two clusters, you need to be 100% over-provisioned to be able to absorb losing one of them).
Of the two methods above, the first (actively discovering and monitoring clusters and trying to infer whether they're "healthy" for scaling) seems like a very involved and nuanced problem that may be difficult to generalize. The second, over-provisioning method (scaling down trigger values) is achievable today. While more manual, it does provide reliability at the desired level of over-provisioning. It could be formalized by adding an optional multi-cluster section to triggers that configures the maximum participating clusters and desired resiliency (4 and 1 in the example above) and applies the above scaling to each cluster's HPA.
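The trigger arithmetic above generalizes beyond the 4-cluster example. A minimal sketch, with function names that are mine rather than KEDA's:

```python
# Generalizing the over-provisioning example: with N expected clusters
# and tolerance for losing t of them, each cluster's trigger target is
# the desired global target multiplied by (N - t).
import math

def per_cluster_target(global_target: float, clusters: int,
                       tolerated_losses: int) -> float:
    surviving = clusters - tolerated_losses
    if surviving < 1:
        raise ValueError("must keep at least one surviving cluster")
    return global_target * surviving

def global_instances(raw_metric: float, target: float, clusters: int) -> int:
    # Each cluster independently runs ceil(metric / target) replicas.
    return clusters * math.ceil(raw_metric / target)

# Numbers from the example above: 4 clusters, tolerate losing 1, global target 600.
target = per_cluster_target(600, clusters=4, tolerated_losses=1)
print(target)                               # 1800

raw = 1800
print(math.ceil(raw / 600))                 # 3 -> perfectly efficient count
print(global_instances(raw, target, 4))     # 4 -> one per cluster, ~33% extra
print(global_instances(raw, target, 3))     # 3 -> back to efficient after losing a cluster
```

Note that after losing a cluster, the remaining three clusters together run exactly the "efficient" count, which is the stated resilience goal.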
I tend to agree that this is more of a scheduling problem than an autoscaling problem. We are working on CloudEvents (#479) support that should give insight into how apps are autoscaling, but we can evaluate whether we can add more events to the list that could be helpful in this scenario.
However, most probably it will be related to scheduling again, which raises the same question: is this up to KEDA or not? It depends.
Any plans to PoC CloudEvents for multi-cloud support?
Not at the moment since CloudEvents is still being added but curious to hear what events you would like to have.
@tomkerkhove, check
Check for what? :)
Checking the status
Nothing was posted here so it's safe to assume no changes. I think we are still looking for solid use-cases and needs.
CloudEvent support is tracked in dedicated issues.
I have a set of 20 or so microservices that are replicated across three OpenShift clusters, one cluster for each availability zone. Each replicated microservice deployment has its own kafka topic. All three clusters/AZs pull from a single kafka cluster. I would like for the required pod count calculated by the Apache Kafka Scaler to be spread evenly across the currently up clusters/AZs.
Is this possible today? I've seen several fits and starts in this regard but they all seem to fade away.
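One way to apply the over-provisioning approach discussed earlier in this thread to a Kafka setup like this is to raise each cluster's `lagThreshold` by a factor of (clusters - tolerated losses). A hypothetical sketch of such a ScaledObject; the resource kind and trigger fields (`bootstrapServers`, `consumerGroup`, `topic`, `lagThreshold`) are real KEDA Kafka scaler fields, but all names and values here are illustrative:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-consumer          # placeholder name
spec:
  scaleTargetRef:
    name: orders-consumer        # placeholder Deployment
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.shared:9092   # placeholder address
        consumerGroup: orders
        topic: orders
        # Desired global lagThreshold of 50, 3 clusters/AZs, tolerate
        # losing 1 cluster: 50 * (3 - 1) = 100 per cluster.
        lagThreshold: "100"
```

Each of the three clusters applies the same manifest against the shared Kafka cluster; consumer-group partition assignment then spreads the actual work across whichever clusters' pods are up.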
Just found Karmada. How do I combine FederatedHPA with KEDA Apache Kafka Scaler?
> Just found Karmada. How do I combine FederatedHPA with KEDA Apache Kafka Scaler?
Did you find any way?
Support workload expansion on multiple clusters
Use-Case
HTTP/gRPC-based workloads have first-class support for multi-cluster expansion via several multi-cluster ingress controllers, but queue workers are more challenging for multi-cluster scheduling. The use cases where multi-cluster support is useful are as follows:
Specification
It would be great to have a prototype for Kubernetes Federation that enables ScaledObjects/Jobs integration with FederatedDeployments/Jobs.