equinix / terraform-equinix-metal-openshift-on-baremetal

OpenShift 4.9 Installer for Equinix Metal
https://registry.terraform.io/modules/equinix/openshift-on-baremetal/metal/latest
Apache License 2.0

Error copying kubeconfig locally #30

Closed displague closed 5 months ago

displague commented 5 months ago

After an otherwise successful provision, the last step of the deployment copies the kubeconfig for the OpenShift cluster to the local machine. This step failed when run from my MBP (macOS Sonoma 14.5):

null_resource.get_kubeconfig: Provisioning with 'local-exec'...
null_resource.get_kubeconfig (local-exec): Executing: ["/bin/sh" "-c" "mkdir -p ./auth; scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /Users/mjohansson/.ssh/id_rsa_mos-mx94n root@147.75.45.97:/tmp/artifacts/install/auth/* ./auth/"]
null_resource.get_kubeconfig (local-exec): /bin/sh: line 1: 23198 Killed: 9               scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /Users/mjohansson/.ssh/id_rsa_mos-mx94n root@147.75.45.97:/tmp/artifacts/install/auth/* ./auth/
╷
│ Error: local-exec provisioner error
│ 
│   with null_resource.get_kubeconfig,
│   on main.tf line 162, in resource "null_resource" "get_kubeconfig":
│  162:   provisioner "local-exec" {
│ 
│ Error running command 'mkdir -p ./auth; scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /Users/mjohansson/.ssh/id_rsa_mos-mx94n root@147.75.45.97:/tmp/artifacts/install/auth/* ./auth/': exit status 137. Output: /bin/sh: line 1: 23198 Killed: 9               scp -o
│ StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /Users/mjohansson/.ssh/id_rsa_mos-mx94n root@147.75.45.97:/tmp/artifacts/install/auth/* ./auth/
│
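
For reading the error above: exit status 137 matches the "Killed: 9" message, because shells conventionally report a command terminated by signal N as 128+N. A minimal sketch of that arithmetic follows; the function name `describe_status` and the upper bound of 192 are assumptions chosen for illustration, not part of this module or of Terraform.

```shell
#!/bin/sh
# describe_status STATUS (hypothetical helper): interpret a shell exit
# status using the common 128+N convention for signal-terminated commands.
describe_status() {
  status=$1
  # Assumption: treat 129..192 as "killed by signal"; higher values such as
  # 255 (which ssh and scp use for connection errors) are plain exit codes.
  if [ "$status" -gt 128 ] && [ "$status" -le 192 ]; then
    printf 'killed by signal %d\n' "$((status - 128))"
  else
    printf 'exited with status %d\n' "$status"
  fi
}

describe_status 137   # killed by signal 9
describe_status 255   # exited with status 255
```

Signal 9 is SIGKILL, so the scp process was killed from outside rather than failing on its own.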

I tried running the same scp command locally from my shell, but it failed. I then ran ssh-add -K, as my agent was not initialized, but that did not help either. Escaping the wildcard in the scp arguments did help:

terraform-equinix-metal-openshift-on-baremetal % mkdir -p ./auth; scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /Users/mjohansson/.ssh/id_rsa_mos-mx94n root@147.75.45.97:/tmp/artifacts/install/auth/* ./auth/ 
zsh: no matches found: root@147.75.45.97:/tmp/artifacts/install/auth/*
terraform-equinix-metal-openshift-on-baremetal % mkdir -p ./auth; scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /Users/mjohansson/.ssh/id_rsa_mos-mx94n root@147.75.45.97:/tmp/artifacts/install/auth/\* ./auth/
Warning: Permanently added '147.75.45.97' (ED25519) to the list of known hosts.
kubeadmin-password                                                                                                                                                                                                                                                 100%   23     0.2KB/s   00:00    
kubeconfig                                                                    
terraform-equinix-metal-openshift-on-baremetal % KUBECONFIG=./auth/kubeconfig kubectl get nodes
NAME                   STATUS   ROLES                  AGE    VERSION
master-0.mos.example.com   Ready    control-plane,master   100m   v1.25.16+306a47e
master-1.mos.example.com   Ready    control-plane,master   100m   v1.25.16+306a47e
master-2.mos.example.com   Ready    control-plane,master   100m   v1.25.16+306a47e
worker-0.mos.example.com   Ready    worker                 88m    v1.25.16+306a47e
worker-1.mos.example.com   Ready    worker                 85m    v1.25.16+306a47e
Terraform v1.7.4
on darwin_arm64
+ provider registry.terraform.io/cloudflare/cloudflare v4.35.0
+ provider registry.terraform.io/equinix/equinix v1.36.4
+ provider registry.terraform.io/hashicorp/aws v3.76.1
+ provider registry.terraform.io/hashicorp/external v2.3.3
+ provider registry.terraform.io/hashicorp/local v2.5.1
+ provider registry.terraform.io/hashicorp/null v3.2.2
+ provider registry.terraform.io/hashicorp/random v3.6.2
+ provider registry.terraform.io/hashicorp/tls v4.0.5
+ provider registry.terraform.io/linode/linode v2.22.0

Your version of Terraform is out of date! The latest version
is 1.8.5. You can update by downloading from https://www.terraform.io/downloads.html
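
The two interactive scp attempts above differ only in whether the local shell gets to expand the `*`. zsh refuses to run a command whose glob matches nothing locally, while quoting or escaping the asterisk passes it through for the remote side to expand. The sketch below only prints the arguments and never contacts a host; the address, path, and the `show_args` helper are hypothetical.

```shell
#!/bin/sh
# Hypothetical host and path, for illustration only.
remote='root@203.0.113.10:/tmp/artifacts/install/auth'

# show_args (hypothetical helper): print each argument on its own line,
# exactly as the invoked program would receive it.
show_args() {
  for arg in "$@"; do
    printf '%s\n' "$arg"
  done
}

# Inside double quotes the asterisk is not expanded locally, so scp would
# receive it literally and the remote side would do the matching.
show_args scp -r "${remote}/*" ./auth/
```

Under /bin/sh, which Terraform's local-exec used in the log above, an unmatched glob is passed through unchanged even without quoting, which is consistent with the escaping not fixing the Terraform run.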

displague commented 5 months ago

Adding \\ to the local-exec command did not help. Replacing the command in local-exec with scp -r and removing the * did not help either. (This command is simpler and should be used in any case.)

displague commented 5 months ago

I also tried unset SSH_AUTH_SOCK, to no avail.

displague commented 5 months ago

│ Error: local-exec provisioner error
│ 
│   with null_resource.get_kubeconfig,
│   on main.tf line 163, in resource "null_resource" "get_kubeconfig":
│  163:   provisioner "local-exec" {
│ 
│ Error running command '      [[ -d ./auth ]] || mkdir -p ./auth
│       /usr/bin/scp -r -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /Users/mjohansson/.ssh/id_rsa_mos-mx94n root@147.75.45.97:/tmp/artifacts/install/auth/ ./auth/
│ ': exit status 255. Output: Warning: Permanently added '147.75.45.97' (ED25519) to the list of known hosts.
│ Load key "/Users/mjohansson/.ssh/id_rsa_mos-mx94n": invalid format
│ root@147.75.45.97: Permission denied (publickey).
│ /usr/bin/scp: Connection closed

The RSA key starts with -----BEGIN OPENSSH PRIVATE KEY-----.
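
A quick way to check which container format a private key file uses is to look at its first line, as done by hand above. This is a minimal sketch; the `key_format` helper and its output labels are made up for illustration, and it inspects only the header, so it cannot tell whether the rest of the file is well formed.

```shell
#!/bin/sh
# key_format FILE (hypothetical helper): classify a private key by its
# first line only. It does not validate the body of the key.
key_format() {
  header=$(head -n 1 "$1")
  case "$header" in
    '-----BEGIN OPENSSH PRIVATE KEY-----') echo openssh ;;
    '-----BEGIN RSA PRIVATE KEY-----')     echo pem-pkcs1 ;;
    '-----BEGIN PRIVATE KEY-----')         echo pem-pkcs8 ;;
    *)                                     echo unknown ;;
  esac
}

# Demo on a throwaway file that contains just a header line.
tmp=$(mktemp)
printf '%s\n' '-----BEGIN OPENSSH PRIVATE KEY-----' > "$tmp"
key_format "$tmp"   # prints: openssh
rm -f "$tmp"
```

A key that reports `openssh` here but still fails to load with "invalid format" has a correct header, so the problem lies elsewhere in the file.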

displague commented 5 months ago

Changing the local SSH key storage format to PEM (from OpenSSH) resolved the problem, and terraform apply completed successfully.

https://github.com/equinix/terraform-equinix-metal-openshift-on-baremetal/pull/31/commits/7d11b16e31c9a843113de961c89fef51cac8298d

It is worth noting that my other SSH keys, which were not created by Terraform, are in OpenSSH format and do not have this problem.