cncf / cluster

🖥🖥🖥🖥CNCF Community Cluster
https://cncf.io/cluster

Request access for kubespray #77

Closed riverzhang closed 11 months ago

riverzhang commented 5 years ago

If you are interested in filing a request for access to the CNCF CIL, please fill out the details below.

If you are just filing an issue, ignore/delete those fields and file your issue.

First Name

River

Last Name

Zhang

Email

rongzhang@alauda.io

Company/Organization

Alauda

Job Title

Software Engineer

Project Title

Kubespray

Briefly describe the project

Kubespray (formerly known as Kargo), a subproject of Kubernetes, is a tool for easily deploying production-ready clusters. It offers lots of options, including high availability, and support for multiple platforms. http://kubespray.io/

Which members of the CNCF community and/or end-users would benefit from your work?

Kubespray and Kubernetes users

Is the code that you’re going to run 100% open source? If so, what is the URL or URLs where it is located?

Yes, 100% open source. https://github.com/kubernetes-incubator/kubespray

What kind of machines and how many do you expect to use (see: https://www.packet.net/bare-metal/)?

t1.small.x86 - 1
x1.small.x86 - 4

What OS and networking are you planning to use (see: https://help.packet.net/technical/infrastructure/supported-operating-systems)?

Debian 8, CentOS/RHEL 7, OpenSuse 14.03, Ubuntu 16.04 LTS, Container Linux by CoreOS, Fedora/CentOS Atomic

Please state your contributions to the open source community and any other relevant initiatives

dankohn commented 5 years ago

+1

-- Dan Kohn dan@linuxfoundation.org Executive Director, Cloud Native Computing Foundation https://www.cncf.io +1-415-233-1000 https://www.dankohn.com

On Sun, Aug 5, 2018 at 6:23 AM, Rong Zhang notifications@github.com wrote:

Please state your contributions to the open source community and any other relevant initiatives

  • Switch to kubeadm deployment as the default method.
  • Support more provisioning and cloud providers (GCE, AWS, OpenStack, DigitalOcean, Azure, vSphere).
  • Kubespray API
    • Perform all actions through an API.
    • Store inventories / configurations of multiple clusters.
    • Make sure that the state of a cluster is completely saved in no more than one config file beyond the hosts inventory.

How will this testing advance cloud native computing (specifically containerization, orchestration, microservices or some combination)?

Kubespray is a Kubernetes incubator project that can create, configure, and manage clusters. It provides optional, additive functionality on top of core Kubernetes.

Any other relevant details we should know about while preparing the infrastructure?


taylorwaggoner commented 5 years ago

@riverzhang, I have sent you an invitation email from Packet. Please let us know if you have any questions!

riverzhang commented 5 years ago

Thanks a lot @dankohn @taylorwaggoner

Miouge1 commented 5 years ago

@taylorwaggoner how do we go about adding a couple of c1.small.x86 into this request to increase the CI capacity?

vielmetti commented 5 years ago

Hi @Miouge1 - EWR1 is a good place to deploy c1.small.x86 systems currently based on available inventory.

vielmetti commented 3 years ago

@Miouge1 -

Here's a current look at your inventory on this project:

Project,Hostname,Facility,Plan Name,Created at,Created by
Kubespray,runner-pu4ihlzh-c3-small-1618257951-32c015a3,nrt1,c3.medium.x86,2021-04-12T20:05:56Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1618252113-969683af,nrt1,c3.medium.x86,2021-04-12T18:28:38Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1617285584-19a3ff6e,nrt1,c3.medium.x86,2021-04-01T13:59:48Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1617285583-6933641d,nrt1,c3.medium.x86,2021-04-01T13:59:47Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1615246150-fb7b91db,fr2,c3.medium.x86,2021-03-08T23:30:06Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1615245067-01dbb170,fr2,c3.medium.x86,2021-03-08T23:11:23Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1615244542-a745caf7,fr2,c3.medium.x86,2021-03-08T23:05:59Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1615243721-877152f3,fr2,c3.medium.x86,2021-03-08T22:48:58Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1615243606-0c9e9fa5,fr2,c3.medium.x86,2021-03-08T22:47:50Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1615242282-0c424c3b,fr2,c3.medium.x86,2021-03-08T22:26:20Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1615241569-902d7691,fr2,c3.medium.x86,2021-03-08T22:13:34Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1615239107-0bd42c85,fr2,c3.medium.x86,2021-03-08T21:33:02Z,Maxime Guyot
Kubespray,docker-machine-runner,ams1,t1.small.x86,2020-07-17T09:37:28Z,Maxime Guyot
Kubespray,node06,ams1,m1.xlarge.x86,2019-06-11T09:07:14Z,Andreas Krüger
Kubespray,master02,ams1,c1.small.x86,2019-04-17T20:07:36Z,Andreas Krüger
Kubespray,node03,ams1,x1.small.x86,2019-01-09T08:59:25Z,Andreas Krüger
Kubespray,node02,ams1,x1.small.x86,2019-01-09T08:59:06Z,Andreas Krüger

This is quite a few more machines than was originally described - can you take a look and see if they are all in active use, and destroy any that are idle? Thanks!
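An audit like the one requested above can be scripted against the CSV export. The following is only a sketch: the three sample rows are copied from the listing above, while the `stale_runners` helper, the 24-hour threshold, and the audit time are hypothetical choices made for illustration.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Three rows copied from the inventory listing above.
INVENTORY_CSV = """\
Project,Hostname,Facility,Plan Name,Created at,Created by
Kubespray,runner-pu4ihlzh-c3-small-1618257951-32c015a3,nrt1,c3.medium.x86,2021-04-12T20:05:56Z,Maxime Guyot
Kubespray,runner-pu4ihlzh-c3-small-1615246150-fb7b91db,fr2,c3.medium.x86,2021-03-08T23:30:06Z,Maxime Guyot
Kubespray,node02,ams1,x1.small.x86,2019-01-09T08:59:06Z,Andreas Krüger
"""


def stale_runners(csv_text, now, max_age=timedelta(hours=24)):
    """Return hostnames of runner-* machines created more than max_age ago.

    The 24-hour default is an assumed threshold, not a project policy.
    """
    stale = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Only the autoscaled CI runners are expected to be short-lived.
        if not row["Hostname"].startswith("runner-"):
            continue
        created = datetime.strptime(
            row["Created at"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if now - created > max_age:
            stale.append(row["Hostname"])
    return stale


# Hypothetical audit time, chosen to fall shortly after the newest row.
audit_time = datetime(2021, 4, 13, 12, 0, tzinfo=timezone.utc)
print(stale_runners(INVENTORY_CSV, audit_time))
```

Long-lived hosts such as node02 are skipped on purpose, since only the runner-* machines are meant to be ephemeral; those still need a manual review.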

cc @idvoretskyi @caniszczyk

Miouge1 commented 3 years ago

Thank you for catching this @vielmetti !

The runner-* spot instances are supposed to be short-lived (GitLab autoscaling runners with the packet/equinix-metal docker-machine provider). It looks like sometimes there is a problem in the deletion/cleanup process:

Apr 12 21:08:04 localhost gitlab-runner[661212]: WARNING: This action will delete both local reference and remote instance.  name=runner-pu4ihlzh-c3-small-1618252113-969683af operation=remove
Apr 12 21:08:12 localhost gitlab-runner[661212]: Successfully removed runner-pu4ihlzh-c3-small-1618252113-969683af  name=runner-pu4ihlzh-c3-small-1618252113-969683af operation=remove

I think we can use the TerminationTime API option to set the max lifetime of the instances.
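A minimal sketch of that idea, which only builds a request body and does not call any API. The field names (`hostname`, `plan`, `facility`, `termination_time`) are assumptions based on my understanding of the device-creation request and should be checked against the Equinix Metal API reference; `build_device_request` and the 4-hour lifetime are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone


def build_device_request(hostname, plan, facility, max_lifetime, now=None):
    """Build a device-creation payload that carries a termination time.

    Assumption: the API accepts an ISO 8601 UTC timestamp in a
    "termination_time" field and removes the device at that time.
    """
    now = now or datetime.now(timezone.utc)
    terminate_at = now + max_lifetime
    return {
        "hostname": hostname,
        "plan": plan,
        "facility": facility,
        "termination_time": terminate_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


# Hypothetical runner with an assumed maximum lifetime of 4 hours.
payload = build_device_request(
    hostname="runner-example",
    plan="c3.medium.x86",
    facility="nrt1",
    max_lifetime=timedelta(hours=4),
    now=datetime(2021, 4, 12, 20, 5, 56, tzinfo=timezone.utc),
)
print(json.dumps(payload, indent=2))
```

With a deadline set at creation time, an instance whose cleanup step fails would still be removed by the platform instead of lingering in the project.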

I'm creating a GitHub issue to track this with more details: https://github.com/kubernetes-sigs/kubespray/issues/7501

node02 and node03 were already marked "SchedulingDisabled"; we had a task to remove them completely but kind of forgot about it. I removed them from the cluster and from the packet/equinix-metal portal.

vielmetti commented 3 years ago

Thanks @Miouge1 ! All set for now here, I'll close this and watch the issue upstream.

vielmetti commented 3 years ago

Upstream issue is https://github.com/equinix/docker-machine-driver-metal/pull/59 to add a TTL parameter so that machines can be created with short expected lifetimes.

vielmetti commented 2 years ago

Hello @riverzhang and @Miouge1 -

The project "kubespray" currently has one host deployed to our hkg1 data center, which is being taken offline.

If you could redeploy that system to a different data center that would be appreciated - I can help with specific locations if that's useful. It's likely that the nearest Asia data center with availability would be in Tokyo.

riverzhang commented 2 years ago

@vielmetti I will start the migration soon. Thanks!

github-actions[bot] commented 2 years ago

Checking if there are any updates on this issue

vielmetti commented 2 years ago

@riverzhang @Miouge1 -

Remember when I said that Tokyo would be a good place to deploy a server? It was, for about 10 months. Now we need that server back to fulfill a contract.

https://github.com/cncf/cluster/issues/77#issuecomment-889973827

I'll share guidance on alternatives - if you can do a migration of that system promptly once I find you a suitable alternative location I would appreciate that greatly.

Also and separately, the rest of the kubespray infra is deployed in our legacy (Packet) AMS1 data center, which is also being slowly but surely turned down as well in favor of new IBX (Equinix) data centers. After the Tokyo system is moved, let's plan a time to sync up and discuss plans to rehome everything else, identify which machines would be good to use etc.

cc @jeefy @hh

Miouge1 commented 2 years ago

@vielmetti thank you for the info. We are talking about server ID a95d5686-48d7-41c6-b78e-2a67fe71b063 right?

I did not know about this server. The UI says "created by @riverzhang", it has some pods on it "karmada-runner" and some kubevirt:

[root@kube-node2 ~]# kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
karmada-runner-j5gg8-575pj   2/2     Running   0          66d

@riverzhang can you explain, and could you clean up or migrate as requested?

vielmetti commented 2 years ago

This looks like Karmada https://github.com/karmada-io/karmada - it is a CNCF Sandbox project. I don't know much more than what it says about itself:

"Karmada (Kubernetes Armada) is a Kubernetes management system that enables you to run your cloud-native applications across multiple Kubernetes clusters and clouds, with no changes to your applications. By speaking Kubernetes-native APIs and providing advanced scheduling capabilities, Karmada enables truly open, multi-cloud Kubernetes."

cristicalin commented 2 years ago

Hello folks!

We recently hit an issue with the two-node Kubernetes cluster set up on Equinix Metal, which we use for the Kubespray CI. The cluster's current state is not easily recoverable (see https://github.com/kubernetes-sigs/kubespray/issues/8736), and we would like to ask permission to spin up a new cluster and move our CI over to it. This would require us to run the two setups side by side for a few days (2 weeks maybe) until we manage to get everything back up.

/cc @Miouge1 @floryut

vielmetti commented 2 years ago

Side-by-side setup is fine to make a transition. Please use gen3 instances (e.g. c3.small) instead of gen1 or gen2 (e.g. t1.small) for the new cluster. As mentioned above, @cristicalin, we are migrating away from our legacy data centers.
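The gen3 guidance above can be checked mechanically before deploying. This sketch assumes the generation is the digit in the first segment of the plan slug (as in c3.small vs. t1.small); that naming convention is inferred from the examples in this thread, and both helper functions are hypothetical.

```python
import re


def plan_generation(plan):
    """Return the generation number from a slug like 'c3.small.x86', or None."""
    match = re.match(r"^[a-z]+(\d+)\.", plan)
    return int(match.group(1)) if match else None


def legacy_plans(plans, minimum_generation=3):
    """List plans below the minimum generation.

    Slugs that cannot be parsed are treated as generation 0 and flagged too,
    so they get a manual look rather than slipping through.
    """
    return [p for p in plans if (plan_generation(p) or 0) < minimum_generation]


# Plan names taken from the inventory listed earlier in this thread.
plans = ["c3.medium.x86", "t1.small.x86", "m1.xlarge.x86", "c1.small.x86", "x1.small.x86"]
print(legacy_plans(plans))
```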

vielmetti commented 2 years ago

Thanks @Miouge1 .

@riverzhang - I don't know your working hours, and I am hopeful that we can get this promptly resolved.

vielmetti commented 2 years ago

The CI and server reclamation issues have been resolved, thanks @mattymo for sorting this out.

riverzhang commented 2 years ago


@vielmetti I did a demo for the Karmada community, borrowing the Kubespray server. I'm sorry I didn't delete it in time.

Miouge1 commented 2 years ago

@riverzhang thank you for the info. Maybe a separate account for Karmada or for community events would be suitable?

vielmetti commented 1 year ago

@Miouge1 @jeefy

Reopening for more data center migration issues.

As I mentioned back in April:

Also and separately, the rest of the kubespray infra is deployed in our legacy (Packet) AMS1 data center, which is also being slowly but surely turned down as well in favor of new IBX (Equinix) data centers. After the Tokyo system is moved, let's plan a time to sync up and discuss plans to rehome everything else, identify which machines would be good to use etc.

The time has come to do this migration, as our AMS1 center is closing. We have an equivalent alternative in our AM (Amsterdam) metro. There are three systems to be migrated (a Kubernetes cluster and a Docker machine runner). I see that the vast bulk of the on-demand testing is happening in our DA metro, so if you want to put the cluster there that's fine too.

vielmetti commented 11 months ago

The migration step described in https://github.com/cncf/cluster/issues/77#issuecomment-1332395191 has completed, closing this issue again.