lae / ansible-role-proxmox

IaC for Proxmox VE clusters.
MIT License

SSH port must be 22 #221

Closed blake-hamm closed 1 year ago

blake-hamm commented 1 year ago

Based on Proxmox requirements, this role will not build a cluster with a pve_ssh_port other than 22. When I run it with a different port, it hangs on the "Add node to Proxmox cluster" task in pve_add_node.yml: the task runs indefinitely and never finishes. When I change the port back to 22, it completes without any issues.

I don't believe this is a limitation of this role, but rather a limitation of Proxmox clusters. I would recommend removing the pve_ssh_port option to prevent unnecessary troubleshooting for individuals (like myself) who try to use it.

lae commented 1 year ago

When it gets stuck at the Add node to Proxmox cluster step (after the SSH configuration steps), are you able to SSH without issue to the other node on the alternative port?

As far as I know, PVE just wraps the ssh client in its Perl scripts. Port 22 does not appear to be hardcoded anywhere there (grep -R 22 /usr/share/perl5/PVE), so I believe this should still work.

lae commented 1 year ago

Took a while to test since I kept running out of disk space, but I was able to reproduce this issue.

pvecm add was stalling because it was waiting for user input:

root@pve-2:~# /usr/bin/pvecm add 192.168.121.160 -use_ssh -link0 192.168.121.162
The authenticity of host '[192.168.121.160]:4649 ([192.168.121.160]:4649)' can't be established.
ECDSA key fingerprint is SHA256:MrXCXnk284A/qlmgzyGpLkIAzhwqq5mi1WaMNgLFakE.
Are you sure you want to continue connecting (yes/no/[fingerprint])? 

Turns out the known_hosts format expects the port to be included, written as [host]:port, when it's not the default. So this:

root@pve-2:~# cat .ssh/known_hosts
# BEGIN: cluster host key for joining
pve-3,pve-3,192.168.121.160 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIHYafICQkI1MZ69qRFodeEPPrtVujFKVq2yOGZYhVA4J
# END: cluster host key for joining

needs to change into this:

root@pve-2:~# cat .ssh/known_hosts
# BEGIN: cluster host key for joining
[pve-3]:4649,[pve-3]:4649,[192.168.121.160]:4649 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIHYafICQkI1MZ69qRFodeEPPrtVujFKVq2yOGZYhVA4J
# END: cluster host key for joining

I pushed a fix for this in 4ccc9a60ccbb0a61229c3a3cda408dd01b05df0b. Would you be able to test if this resolves your issue?

lae commented 1 year ago

Apparently, explicitly specifying port 22 in known_hosts doesn't work either, so I needed to add some more logic in 04f0b365a5d6c51fc0ba312fb9eba5feadcbbbe2.
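
To illustrate the two rules described above (bracketed [host]:port for a non-default port, plain host for port 22), here is a minimal Python sketch. It is not the role's actual implementation, which lives in Ansible tasks and templates; the function names known_hosts_pattern and known_hosts_line are hypothetical and exist only for this example.

```python
DEFAULT_SSH_PORT = 22


def known_hosts_pattern(host: str, port: int = DEFAULT_SSH_PORT) -> str:
    """Return a host as it should appear in the host field of known_hosts.

    Hypothetical helper: the default port uses the bare host name, while any
    other port uses the bracketed [host]:port form.
    """
    if port == DEFAULT_SSH_PORT:
        return host
    return f"[{host}]:{port}"


def known_hosts_line(hosts, port, key_type, key):
    """Build one known_hosts line covering every name/address of a node."""
    patterns = ",".join(known_hosts_pattern(h, port) for h in hosts)
    return f"{patterns} {key_type} {key}"


if __name__ == "__main__":
    key = "AAAAC3NzaC1lZDI1NTE5AAAAIHYafICQkI1MZ69qRFodeEPPrtVujFKVq2yOGZYhVA4J"
    names = ["pve-3", "192.168.121.160"]
    # Non-default port: every entry is bracketed and carries the port.
    print(known_hosts_line(names, 4649, "ssh-ed25519", key))
    # Default port: entries stay in the plain form, with no :22 suffix.
    print(known_hosts_line(names, 22, "ssh-ed25519", key))
```

Keeping the port check in a single helper means the default-port case and the custom-port case cannot drift apart when more host aliases are added.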

blake-hamm commented 1 year ago

Sorry I've been unhelpful on this! I'm glad you were able to reproduce it. I've been using geerlingguy's security role as well, so I've been trying to verify whether that's part of my issue.