Open robinAwallace opened 1 year ago
Hi @robinAwallace, I think the cluster move is not validated for BYOH, and it certainly requires an agent restart to talk to the new management cluster.
"All resources are moved but the byohosts are not moved. Which is maybe not that strange"
This might be due to the permissions on ByoHost CRDs. Did you get any errors with clusterctl move?
"restart the byoh-agent with no success"
Can you share the agent output/errors too?
Hello :slightly_smiling_face:
No, there were no errors from clusterctl move. The only error I got was that the byoh controller could not find the byohosts.
But I got it to work. After doing clusterctl move I had to move the byohost objects manually from the first cluster to the new management cluster. To do this I had to delete the byoh webhook, which stops you from adding byoh objects.
Then I had to create new kubeconfigs with the correct cert and IP for the new control plane. I also had to create a new CSR to validate the byoh agent user. Finally I sent the new kubeconfig to the nodes at /.byoh/config and restarted the agent. Then everything worked fine :partying_face:
Awesome, there are still some UX gaps but it will be nice to have the above manual process captured in some doc. Would you like to create a PR for documentation of the steps that you have followed?
Tried to follow the same process, but byoh-agent got stuck with:
I0315 15:46:24.611558 10704 host_reconciler.go:91] "msg"="Machine ref not yet set"
I believe the reason is that the Status of the ByoHost is not copied to the destination cluster, and because the host already has an AttachedByoMachineLabel, the byoh infrastructure controller does not set it again. I tried deleting the AttachedByoMachineLabel label and restarting the byoh infrastructure controller, but it didn't help. The byoh infrastructure controller now says:
I0315 16:16:12.321312 1 byomachine_controller.go:270] "msg"="Attempting host reservation"
I0315 16:16:12.321493 1 byomachine_controller.go:519] "msg"="No hosts found, waiting.."
Hmm, I did not have this issue.
But yes, as you say, it does not copy over the ByoHosts when running the move command. So I had to copy them manually by running
kubectl get byohosts.infrastructure.cluster.x-k8s.io -n <namespace> <byohost> -oyaml
and saving the output to a file. But before I could apply it to the new management cluster I had to temporarily remove the webhook, validatingwebhookconfigurations.admissionregistration.k8s.io byoh-validating-webhook-configuration.
I hope you get it to work :slightly_smiling_face:
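For anyone following along, the manual copy described above could be scripted roughly like this. It is only a sketch: OLD_KUBECONFIG, NEW_KUBECONFIG and the function names are placeholders made up for the example, and restoring the webhook from a saved copy is an assumption about what "temporarily remove" meant, not something confirmed in this thread.

```shell
# Sketch only. OLD_KUBECONFIG / NEW_KUBECONFIG are hypothetical variables that
# point at the kubeconfig files of the old and new management clusters.
BYOHOST_CRD=byohosts.infrastructure.cluster.x-k8s.io
WEBHOOK_KIND=validatingwebhookconfigurations.admissionregistration.k8s.io
WEBHOOK_NAME=byoh-validating-webhook-configuration

# Print one ByoHost from the old management cluster as YAML.
# usage: export_byohost <namespace> <byohost>
export_byohost() {
  kubectl --kubeconfig "$OLD_KUBECONFIG" get "$BYOHOST_CRD" -n "$1" "$2" -o yaml
}

# Print the validating webhook of the new cluster as YAML (to keep a copy).
backup_webhook() {
  kubectl --kubeconfig "$NEW_KUBECONFIG" get "$WEBHOOK_KIND" "$WEBHOOK_NAME" -o yaml
}

# Delete the validating webhook on the new cluster.
remove_webhook() {
  kubectl --kubeconfig "$NEW_KUBECONFIG" delete "$WEBHOOK_KIND" "$WEBHOOK_NAME"
}

# Apply a saved manifest (a ByoHost or the webhook copy) to the new cluster.
# usage: apply_to_new <file>
apply_to_new() {
  kubectl --kubeconfig "$NEW_KUBECONFIG" apply -f "$1"
}

# Possible order of calls (not run automatically):
#   export_byohost <namespace> <byohost> > byohost.yaml
#   backup_webhook > webhook-backup.yaml
#   remove_webhook
#   apply_to_new byohost.yaml
#   apply_to_new webhook-backup.yaml
```

As reported elsewhere in this thread, the ByoHost Status does not come along with the copied object, and the agents still need a kubeconfig for the new cluster, so this covers only the object copy.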
I think my problem was that I skipped that part:
"Also I had to create a new csr to validate byoh agent user. Finally I sent the new kubeconfig to the nodes at /.byoh/config"
But I got it to work, though I had to add a little patch (nebius/cluster-api-provider-bringyourownhost#9)
This way the move process is very simple:
clusterctl move
I think I'll write some e2e tests and open a PR with them (plus some documentation about the move process).
What steps did you take and what happened:
Hello,
I have a BYOH cluster that I would like to move to a new management cluster using the clusterctl move command:
clusterctl move --kubeconfig <byoh-management-cluster> --to-kubeconfig <new-management-cluster>
All resources are moved but the byohosts are not moved, which is maybe not that strange. But I'm not able to register the machines to the new management cluster.
I have tried to generate new bootstrap-kubeconfigs for the new management cluster, send them to the machines, and restart the byoh-agent, with no success.
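A quick way to see which hosts were left behind is to compare the ByoHost names on both clusters after the move. The snippet below is a rough sketch of that check; OLD_KUBECONFIG, NEW_KUBECONFIG and the function names are hypothetical and only stand in for the two kubeconfig files used with clusterctl move.

```shell
# Sketch only. OLD_KUBECONFIG / NEW_KUBECONFIG are hypothetical variables that
# point at the kubeconfig files of the old and new management clusters.
BYOHOST_CRD=byohosts.infrastructure.cluster.x-k8s.io

# List ByoHost names (one per line) in a namespace of the given cluster.
# usage: list_byohosts <kubeconfig> <namespace>
list_byohosts() {
  kubectl --kubeconfig "$1" get "$BYOHOST_CRD" -n "$2" -o name
}

# Print every ByoHost that exists on the old cluster but not on the new one.
# usage: missing_byohosts <namespace>
missing_byohosts() {
  new_hosts=$(list_byohosts "$NEW_KUBECONFIG" "$1")
  list_byohosts "$OLD_KUBECONFIG" "$1" | while IFS= read -r host; do
    # -x matches the whole line, -F treats the name as a fixed string.
    printf '%s\n' "$new_hosts" | grep -qxF -- "$host" || printf '%s\n' "$host"
  done
}
```

Calling missing_byohosts with the namespace of the cluster prints nothing once every host is present on the new management cluster.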
What did you expect to happen:
After clusterctl move I would like to re-register the machines to the new management cluster.
Anything else you would like to add:
Environment:
Kubernetes version (kubectl version --short): v1.26.6
OS (/etc/os-release): Ubuntu-20.04