Ability to add Equinix Metal machines to clusters based on a different infra provider

kubernetes-sigs / cluster-api-provider-packet

Cluster API Provider Packet (now Equinix Metal)

https://deploy.equinix.com/labs/cluster-api-provider-packet/

Apache License 2.0


Ability to add Equinix Metal machines to clusters based on a different infra provider #265

Open andrewrynhard opened 3 years ago

andrewrynhard commented 3 years ago

User Story

As a user, I would like to manage Packet machines with cluster-api-provider-packet, but join them to a control plane that is not in Equinix Metal. The idea is that I can use Sidero for bare metal, and then burst out to Equinix Metal when I need to.

Detailed Description

I want to create a MachineDeployment for PacketMachines, and have them join a non-PacketCluster cluster:

2021-07-07T18:26:21.403Z    DEBUG   controller-runtime.controller   Successfully Reconciled {"controller": "packetmachine", "name": "dc-general-amd64-b7jxp", "namespace": "default"}
2021-07-07T18:26:21.431Z    INFO    controllers.PacketMachine.infrastructure.cluster.x-k8s.io/v1alpha3  PacketCluster is not available yet  {"packetmachine": "default/dc-general-amd64-b7jxp", "machine": "dc-general-amd64-6fc64f5846-ntxn4", "cluster": "atl2"}

The atl2 cluster reference is a Sidero based cluster.

/kind feature

detiber commented 3 years ago

Ah, interesting. I don't see any reason we shouldn't support this use case.

I think the biggest blocker right now is that we only allow configuration of the ProjectID on the PacketCluster, if we allowed for overriding it on the PacketMachine, then I think we could successfully decouple the need for a PacketCluster.
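The override described above could look roughly like the following minimal sketch. The types here are simplified stand-ins rather than the provider's actual API types, and both the `ProjectID` field on `PacketMachineSpec` and the `ResolveProjectID` helper are hypothetical names introduced only for illustration.

```go
package main

import "errors"

// PacketClusterSpec is a simplified stand-in for the provider's cluster spec.
type PacketClusterSpec struct {
	ProjectID string
}

// PacketMachineSpec is a simplified stand-in for the provider's machine spec.
// ProjectID here is the hypothetical per-machine override discussed above.
type PacketMachineSpec struct {
	ProjectID string
}

// ErrNoProjectID is returned when neither object supplies a project ID.
var ErrNoProjectID = errors.New("no project ID set on PacketMachine or PacketCluster")

// ResolveProjectID (hypothetical helper) prefers the machine-level project ID
// and falls back to the cluster-level one. The cluster may be nil, which models
// a machine whose owning cluster is not a PacketCluster.
func ResolveProjectID(machine PacketMachineSpec, cluster *PacketClusterSpec) (string, error) {
	if machine.ProjectID != "" {
		return machine.ProjectID, nil
	}
	if cluster != nil && cluster.ProjectID != "" {
		return cluster.ProjectID, nil
	}
	return "", ErrNoProjectID
}
```

With a helper of this shape, a PacketMachine that carries its own project ID would no longer need a PacketCluster to be present, while existing clusters that set the project ID only on the PacketCluster would keep working.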

andrewrynhard commented 2 years ago

> Ah, interesting. I don't see any reason we shouldn't support this use case.
>
> I think the biggest blocker right now is that we only allow configuration of the ProjectID on the PacketCluster, if we allowed for overriding it on the PacketMachine, then I think we could successfully decouple the need for a PacketCluster.

Sounds reasonable to me!

andrewrynhard commented 2 years ago

@detiber Any idea when this could land? We could help and contribute the work if that helps. I would love to show this off at Kubecon!

detiber commented 2 years ago

@andrewrynhard biggest blocker right now is getting https://github.com/kubernetes-sigs/cluster-api-provider-packet/pull/269 across the line and wrapped up, which has been fighting me with continued edge cases creeping up and I'd like to avoid having to try to rebase additional changes in complicating it further.

More than happy to accept PRs for the changes needed against my fork/branch to include with that PR if you don't want to wait for it to merge first, though.

andrewrynhard commented 2 years ago

> @andrewrynhard biggest blocker right now is getting #269 across the line and wrapped up, which has been fighting me with continued edge cases creeping up and I'd like to avoid having to try to rebase additional changes in complicating it further.
>
> More than happy to accept PRs for the changes needed against my fork/branch to include with that PR if you don't want to wait for it to merge first, though.

We are working on this migration as well. No particular rush on our side to make this PR within the next couple of weeks. We can wait. Thanks!

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

hansbogert commented 2 years ago

isn't the requested functionality an explicit non-goal [1] of the cluster-api? Won't there be fundamental cluster-api impossibilities? Or am I misunderstanding the requested functionality?

[1]: https://github.com/kubernetes-sigs/cluster-api/blob/46df46a9b6d3efe44d88148105fe63380ac531bd/docs/scope-and-objectives.md -- "non-goal: To manage a single cluster spanning multiple infrastructure providers."

detiber commented 2 years ago

/lifecycle frozen

detiber commented 2 years ago

> isn't the requested functionality an explicit non-goal [1] of the cluster-api? Won't there be fundamental cluster-api impossibilities? Or am I misunderstanding the requested functionality?
>
> [1]: https://github.com/kubernetes-sigs/cluster-api/blob/46df46a9b6d3efe44d88148105fe63380ac531bd/docs/scope-and-objectives.md -- "non-goal: To manage a single cluster spanning multiple infrastructure providers."

It's complicated. It has been discussed upstream in various ways. While it's not something that the community would necessarily recommend without caveats, things should not be so tightly coupled that it should prevent one from doing so.

For example, consider the case of a pod-based control plane: its infrastructure provider would be quite different from the infrastructure provider one would use for the worker nodes.

displague commented 1 year ago

There is a similar ask in the Tinkerbell and Rancher community Slacks.

https://cloud-native.slack.com/archives/C01SRB41GMT/p1678838708369249?thread_ts=1678838708.369249&cid=C01SRB41GMT

> Is it on the CAP\<provider> roadmap to support provisioning of hybrid clusters where say control plane nodes are vms provisioned via vsphere and worker nodes are bare metal provisioned via tinkerbell?

@richardcase:

> this isn't on the \<provider> roadmap specifically. However, the ability to create mixed-provider clusters is being discussed more generally within CAPI and a feature group has been formed. The goal is that this will be supported at some point in the future across different providers.

https://mobile.twitter.com/fruit_case/status/1555554512529653761?s=61&t=zAZ8pZ78GnN59Qr_4a0cXA

richardcase commented 1 year ago

There are changes that could be made (i.e. removing the direct coupling between the machine reconcilers and a specific infra provider for the cluster) to enable this before the feature group makes recommendations/changes. This is what we did in our demo to the capmvm and capbyoh providers...but we didn't upstream these changes.
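One way to picture that kind of decoupling is a machine reconciler that only waits for a PacketCluster when the owning Cluster actually references one. The sketch below is an illustration under stated assumptions, not the change made in the demo or in this provider: `InfraRef`, `Action`, and `NextAction` are hypothetical names, and `InfraRef` is a trimmed-down stand-in for the Cluster's infrastructure reference.

```go
package main

// InfraRef is a simplified stand-in for the infrastructure reference
// carried by a Cluster API Cluster object.
type InfraRef struct {
	Kind string
	Name string
}

// Action describes what the machine reconciler should do next.
type Action int

const (
	// WaitForPacketCluster: requeue until the PacketCluster is ready
	// (the "PacketCluster is not available yet" case from the log above).
	WaitForPacketCluster Action = iota
	// ProceedWithPacketCluster: the cluster is a ready PacketCluster.
	ProceedWithPacketCluster
	// ProceedStandalone: the cluster belongs to another infra provider,
	// so the machine is provisioned without any PacketCluster.
	ProceedStandalone
)

// NextAction (hypothetical helper) decides whether a PacketMachine
// depends on a PacketCluster at all, based on the kind of the cluster's
// infrastructure reference.
func NextAction(ref *InfraRef, packetClusterReady bool) Action {
	if ref == nil || ref.Kind != "PacketCluster" {
		return ProceedStandalone
	}
	if packetClusterReady {
		return ProceedWithPacketCluster
	}
	return WaitForPacketCluster
}
```

Under this sketch, a PacketMachine owned by a cluster whose infrastructure reference points at another provider's kind would take the standalone path instead of requeueing indefinitely. It would then need its Equinix Metal settings, such as the project ID, from somewhere other than a PacketCluster.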