ovh / public-cloud-roadmap

Agile roadmap for OVHcloud Public Cloud services. Discover the features our product teams are working on, comment and influence our backlog.
https://www.ovhcloud.com/en/public-cloud/

Multi-AZs clusters #22

Open mhurtrel opened 4 years ago

mhurtrel commented 4 years ago

As an MKS administrator, I want to spawn Kubernetes clusters distributed across multiple low-latency availability zones, so that I can spread worker nodes across regions and benefit from even better HA for my K8s control plane, with a contractual SLA.

Note: we are initially targeting France.

cambierr commented 3 years ago

We already forked the OpenStack CCM to implement multi-region support at OVH and are running it in production. Feel free to ask if you're interested.

mr-ssd commented 3 years ago

@cambierr I'm looking for that solution. I would appreciate it if you could share.

cambierr commented 3 years ago

It's available at https://hub.docker.com/repository/docker/alphanetworkstv/openstack-cloud-controller-manager-amd64. The only change compared to the upstream version is that you need to list the allowed regions in the config:

[Global]
username=...
password=...
auth-url=https://auth.cloud.ovh.net/v3
tenant-id=...
domain-id=default
region=GRA5
regions=GRA3,GRA5,GRA7

[Networking]
internal-network-name=...
ipv6-support-disabled=true
public-network-name=Ext-Net

[BlockStorage]
bs-version = v3

I still need to push the code somewhere to share the sources, by the way.

CSI is also available: https://hub.docker.com/repository/docker/alphanetworkstv/cinder-csi-plugin-amd64
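
For readers who want to sanity-check such a config before deploying, here is a minimal Python sketch. It only reads the `region` and `regions` keys shown above; the function name `allowed_regions` is hypothetical, and the rule that the default region must also appear in the list is an assumption, not something the fork is known to enforce.

```python
import configparser

# Trimmed copy of the config above, with credentials left out.
SAMPLE = """
[Global]
auth-url=https://auth.cloud.ovh.net/v3
domain-id=default
region=GRA5
regions=GRA3,GRA5,GRA7

[Networking]
ipv6-support-disabled=true
public-network-name=Ext-Net

[BlockStorage]
bs-version = v3
"""


def allowed_regions(conf_text: str) -> list:
    """Return the regions listed under [Global], default region first."""
    # interpolation=None so that a '%' in a password is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    section = parser["Global"]
    default = section["region"].strip()
    listed = [r.strip() for r in section.get("regions", "").split(",") if r.strip()]
    if default not in listed:
        # Assumption: the default region should be one of the allowed regions.
        raise ValueError(f"default region {default!r} is missing from regions={listed}")
    return [default] + [r for r in listed if r != default]


if __name__ == "__main__":
    print(allowed_regions(SAMPLE))
```

This is an illustration of the config shape only; it says nothing about how the forked controller parses the file internally.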

tanandy commented 3 years ago

Can we have something integrated in the console, so that we can easily deploy node pools in different regions?

zcourts commented 3 years ago

@mhurtrel has there been any movement on this? We're desperately in need of it because we're being affected by http://travaux.ovh.net/?do=details&id=50121& that is dependent on some upstream OpenStack fix - we recently had a 14hr outage because the OVH Volume wouldn't re-attach to any of our pods after a deployment. A multi-region cluster would avoid this.

@cambierr this looks interesting. I am not sure how to use it, but I assume I will need to use the OpenStack client to set it up? I'll see if I can get some help from an ops engineer. In the meantime, can you provide any resources or a guide on how to use these? I'd like to set up two test clusters to play with it.

cambierr commented 3 years ago

@zcourts what I built is a version of https://github.com/kubernetes/cloud-provider-openstack that supports multi-region clusters. It is not an extension of OVH's managed clusters.

If you are still interested, you can use your own cluster created with kubeadm, rke, or whatever tool you want, then deploy the OpenStack cloud controller in your cluster.

Basically, you can do exactly the same thing as described at https://github.com/kubernetes/cloud-provider-openstack, except that you list the allowed regions in the config (the regions= key shown in my earlier comment).

Based on that, the CCM will query all the provided regions for instance data instead of only the default one. The CSI (the Kubernetes/OpenStack "volume translator") works the same way and handles volumes in the right region for each instance.

Please be aware that volumes from region A cannot be mounted in region B!

Feel free to ask if you need any help.
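
The behaviour described in this comment can be sketched roughly as follows. This is a toy model, not the fork's code: the per-region lookup functions stand in for real OpenStack API calls, and the names `find_instance` and `can_attach` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for a per-region compute API: each region name maps to
# a function returning instance metadata, or None if the instance is unknown there.
Lookup = Callable[[str], Optional[dict]]


def find_instance(
    instance_id: str, regions: List[str], lookups: Dict[str, Lookup]
) -> Optional[Tuple[str, dict]]:
    """Query every allowed region in order and return (region, metadata) on the first hit."""
    for region in regions:
        found = lookups[region](instance_id)
        if found is not None:
            return region, found
    return None


def can_attach(volume_region: str, node_region: str) -> bool:
    """A volume can only be attached to a node in the same region."""
    return volume_region == node_region


if __name__ == "__main__":
    # Fake inventories, for illustration only.
    inventory = {
        "GRA5": {"vm-a": {"name": "worker-1"}},
        "GRA7": {"vm-b": {"name": "worker-2"}},
    }
    lookups = {region: vms.get for region, vms in inventory.items()}
    print(find_instance("vm-b", ["GRA5", "GRA7"], lookups))
    print(can_attach("GRA5", "GRA7"))
```

The second function captures the warning above: finding instances in several regions does not make volumes portable between them.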

tanandy commented 3 years ago

I've seen that Scaleway Kosmos provides easy multi-cluster integration. You can create a cluster there and use OVH nodes until we have something in the OVH console.

zcourts commented 3 years ago

@tanandy thanks, I didn't know about Scaleway's Kosmos (https://www.scaleway.com/fr/betas/#kuberneteskosmos). It is in private beta though (and requires an invite to access), so it won't be an option for our production environments right now.

@cambierr ahhhh, that's clearer now. A couple of questions come to mind:

  1. How do OVH resources (volumes etc.) get provisioned? Are they requested by the CCM or some sub-component and added automatically, or do you need to attach what you need outside k8s and then use it?
  2. How many control planes do you end up having? A single one or one per region?

We've been working on a design that uses Istio multi-cluster. In this setup we would have a control plane per region. The obvious benefit is that we can lose an entire region and continue operating; the challenge is the increased complexity of managing and controlling access to multiple control planes/clusters.

We've not gotten as far as doing test clusters with this yet as the issue only affected us 2 weeks ago but we're progressing along this route. I'll bring your links to our team's attention for them to consider as well.

How have you found doing it this way so far? Any common/obvious issues?

cambierr commented 3 years ago

"How do OVH resources (volumes etc) get provisioned, are those requested by the CCM or some sub-component and get auto added or you need to attach what you need outside k8s then use?"

The CCM and CSI are responsible for provisioning, as with the "official" providers. For instance, the scheduler allocates resources on a node, and the CSI talks to that node's OpenStack cluster to provision volumes as needed.

"How many control planes do you end up having? A single one or one per region?"

A single one. This is not a multi-cluster Kubernetes setup but a single cluster on top of multiple OpenStack regions.

So, in our setup we use regions from GRA, UK, and SBG in our cluster. It is therefore a multi-region Kubernetes cluster with "ultra high HA": three regions, each with its own infrastructure (power, network). This gives us the benefit of HA without the complexity of federation.
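
To actually benefit from three regions, workloads have to be spread across them. One common way to do this in Kubernetes is a topology spread constraint on the well-known `topology.kubernetes.io/region` node label. The Python sketch below builds such a Deployment manifest as JSON (which `kubectl apply -f` accepts). Whether this particular setup uses that mechanism is not stated in the thread; the function name and the example app name and image are hypothetical, and the sketch assumes the nodes carry the region label.

```python
import json


def spread_across_regions(name: str, image: str, replicas: int = 3) -> dict:
    """Build a Deployment whose pods are spread evenly across node regions."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "topologySpreadConstraints": [
                        {
                            # Allow at most one pod of difference between regions.
                            "maxSkew": 1,
                            "topologyKey": "topology.kubernetes.io/region",
                            "whenUnsatisfiable": "DoNotSchedule",
                            "labelSelector": {"matchLabels": labels},
                        }
                    ],
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical app name and image, for illustration only.
    print(json.dumps(spread_across_regions("web", "nginx:stable"), indent=2))
```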

tanandy commented 2 years ago

Why was this issue closed?

mhurtrel commented 2 years ago

Hi @tanandy, this was a mistake. I confirm we will work on this at a later stage.

Grounz commented 2 years ago

Hi,

What's the status of this issue?

qualitesys commented 2 years ago

Hi OVH, is there any news on multi-region clusters? As @zcourts said, this option exists with Scaleway Kosmos and works very well. Do you have a schedule on the roadmap?

mhurtrel commented 2 years ago

Hi @Grounz and @qualitesys, I confirm that we will develop a solution for this, but I can't share a public ETA with you yet. We are exploring options for a very rich multi-region, multicloud and multicluster experience. I will update this issue when possible.

lenglet-k commented 2 years ago

Hi @mhurtrel

Do you have any news?

mhurtrel commented 2 years ago

Our current ETA is early 2023

botylev commented 1 year ago

Hi @mhurtrel, any news on this feature?

mhurtrel commented 1 year ago

Hello @Spark3757, there has been a small delay in the availability of our IaaS pillars in RBX, which is needed to fully validate our plans. I should be able to give a new ETA soon. Sorry for the delay.

yctn commented 1 year ago

Hi @mhurtrel, any news on this feature?

mhurtrel commented 1 year ago

Unfortunately, we are not yet able to share an ETA, though it remains a priority. As soon as we have an ETA for the dependencies owned by our IaaS colleagues, I will update this issue.

mhurtrel commented 1 year ago

A small update on the matter: multi-region clusters will not be provided in Managed Kubernetes Service in the foreseeable future, but through a new product offering the ability to manage self-managed Kubernetes control planes by bringing your own nodes.

I refocused this issue on multi-AZ clusters, which will be offered in our multi-AZ regions, the first one being planned in France. We cannot give you an ETA yet but be assured it is identified as a priority.

lenglet-k commented 1 year ago

Hello @mhurtrel

Could we manage a multi-AZ cluster in different infrastructures like a PCI / HPC / HPC Secnumcloud mix?

What do you mean by "the ability to manage control planes ourselves"? Does that mean we will be able to add control planes and manage their configuration and updates? Will it be SecNumCloud compatible?

mhurtrel commented 1 year ago

Hi @lenglet-k

This issue (#22) will focus on multi-AZ (single-region) Managed Kubernetes Service (leveraging Public Cloud instances only). However, we will also offer a multicloud/multicluster solution (in private beta in the next few months) to build and manage self-managed clusters on any infrastructure (see https://github.com/ovh/public-cloud-roadmap/issues/467). At first, it will require the infrastructure to offer internet connectivity, but at a later stage it will support vRack-only connectivity. Yes, you will be able to manage the control plane, using a supported distribution (more details soon). It will not be SecNumCloud compatible at launch.

mhurtrel commented 1 year ago

Hi everyone! Though we of course still plan to support managed Kubernetes clusters based on 3-AZ regions, I also wanted to let you know that we have just released Managed Rancher Service in alpha (aka private beta). Among many other features, this product enables you to create and manage self-managed clusters on any infrastructure. You could, for example, spawn bare-metal machines or VMs in multiple regions (from OVHcloud, other cloud providers or even on-prem, provided the machines have internet access) to build an extremely highly available cluster.

Do not hesitate to consult this page to learn more, and fill in the short form to be one of the first users of this new managed service: https://labs.ovhcloud.com/en/managed-rancher-service/

cambierr commented 1 year ago

Rancher ? ouch !

Will the multi region MKS be based on it ?

mhurtrel commented 1 year ago

@cambierr nope, MKS and Managed Rancher Service are two different products. Multi-zone MKS will be made available shortly after the first multi-AZ OVHcloud Public Cloud region is available, and will not require Managed Rancher Service.

salimidruide commented 8 months ago

@mhurtrel thank you for the update. I would like to give you some honest feedback: if you want to keep your clients, you need to move from saying "we of course still plan to support 3AZ-regions-based managed Kubernetes clusters" to "this is the delivery deadline and we are meeting it".

MatthieuFin commented 2 months ago

Hello there,

I've just discovered this thread; I ran into this issue a couple of months ago. I followed the same approach as @cambierr and started with a PR on the cinder-csi-plugin part of the OpenStack cloud provider.

It was merged this summer and should be released soon, I guess, but you can build your own Docker image to try it before the release.

I'd welcome any feedback on this implementation.

Technically it is a multi-cloud implementation, not only multi-region: you can spread nodes across multiple OpenStack clusters.

This lets us spread a single Kubernetes cluster across 3 OVH regions and an on-premise OpenStack cluster, and consume PVCs on any node.

Obviously, the limitation is that a PVC created in one region cannot be consumed from another region, and you have to manage a storage class per region.

Personally, to spread a single STS across multiple regions, I pre-provision the PVCs before creating the STS (with one PVC per SC, I'm able to spread my pods across my different regions).
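
The pre-provisioning trick can be sketched as follows. A StatefulSet looks for PVCs named `<claim template>-<statefulset name>-<ordinal>` and reuses them if they already exist, so creating them up front with a different storage class each pins each replica to one region. This is a minimal Python sketch that emits the PVC manifests as JSON; the function name, the storage class names and the StatefulSet and claim names are hypothetical, and each storage class is assumed to already exist and to be bound to one region.

```python
import json
from typing import List


def preprovisioned_pvcs(
    sts_name: str, claim_template: str, storage_classes: List[str], size: str = "10Gi"
) -> List[dict]:
    """Build one PVC per StatefulSet ordinal, each using a different storage class."""
    pvcs = []
    for ordinal, storage_class in enumerate(storage_classes):
        pvcs.append(
            {
                "apiVersion": "v1",
                "kind": "PersistentVolumeClaim",
                # Naming convention the StatefulSet controller expects for replica N.
                "metadata": {"name": f"{claim_template}-{sts_name}-{ordinal}"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "storageClassName": storage_class,
                    "resources": {"requests": {"storage": size}},
                },
            }
        )
    return pvcs


if __name__ == "__main__":
    # Hypothetical per-region storage class names.
    classes = ["cinder-gra", "cinder-sbg", "cinder-onprem"]
    for pvc in preprovisioned_pvcs("db", "data", classes):
        print(json.dumps(pvc))
```

The StatefulSet would then be created with as many replicas as there are storage classes, and a volume claim template named `data` in this example.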

I don't use the MKS product (for a number of reasons). I don't know how compatible this is with that product, but technically it should be doable.

The next step on my side is to test the current implementation of the CCM in this multi-cloud environment, so that we can keep using the CCM in a multi-cloud cluster.