hibari / clus

Cluster is a simple tool for installing, configuring, and bootstrapping a cluster of nodes - primarily Hibari nodes.

ssh-copy-id fail (?) #3

Open dewd opened 12 years ago

dewd commented 12 years ago

Probably a user or docs issue...

Context: setting up a three-node cluster on CentOS 5.7. The "admin" and root accounts are configured for password-less SSH to each node from an admin node (admin-vm). Node names are simple (vm1, vm2, vm3).
Network connectivity has been verified between all four nodes.

attempting to set up the runtime user ("hibari") with ./clus/priv.clus.sh

Running clus.sh from the "admin" account on the admin node successfully completes the init functions using SSH with the "admin" account:

I can verify each of the above steps on the target node. It then appears to complete the ssh-copy-id, after which it successfully locks the account (using the "admin" creds),

but it fails on the test (where it runs ssh $USER_NODE@$HOST_NODE echo $USER_NODE@$HOST_NODE).

I suspect that I have missed a step in the setup relating to configuration of the runtime user ("hibari"), but having re-read the dev guide (setting up a cluster) & the readme, I'm stumped as to what it is (ssh-agent or .ssh/config ?)
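The failing test step described above can be reproduced by hand. The following is a minimal sketch, not the code clus.sh itself runs: `check_login` is a hypothetical helper name, and the `BatchMode` and `ConnectTimeout` options are additions that make ssh fail quickly instead of prompting for a password.

```shell
# Hypothetical helper: try a non-interactive login as USER on each HOST
# and run the same echo that the test step uses.
check_login() {
    local user=$1 rc=0 host out
    shift
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if out=$(ssh -o BatchMode=yes -o ConnectTimeout=5 \
                     "$user@$host" echo "$user@$host" 2>&1); then
            echo "ok     $out"
        else
            echo "FAILED $user@$host: $out"
            rc=1
        fi
    done
    return $rc
}
```

For the setup in this report it would be called as `check_login hibari vm1 vm2 vm3` from the "admin" account on admin-vm. A non-zero return means at least one node rejected the key-based login.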

norton commented 12 years ago

Hmm ... not sure offhand.

Can you try running with the bash -x option?

$ bash -x ./clus/priv.clus.sh .....

Check for the failing command and then run it manually.

$ bash -x ssh-copy-id ....

Also check the permissions of your .ssh directories on the install node and the target nodes.
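A quick way to run that permission check is sketched below. It assumes GNU `stat` (as found on CentOS), an RSA key pair named id_rsa / id_rsa.pub, and the conventional modes 700 for the directory, 600 for the private key, and 644 for the public key. The function names are made up for illustration.

```shell
# Print the octal mode of a path (GNU stat syntax).
mode_of() {
    stat -c '%a' "$1"
}

# check_mode PATH EXPECTED: report whether PATH has the expected mode.
check_mode() {
    local path=$1 expected=$2 actual
    if [ ! -e "$path" ]; then
        echo "MISSING  $path"
        return 2
    fi
    actual=$(mode_of "$path")
    if [ "$actual" = "$expected" ]; then
        echo "OK       $path ($actual)"
    else
        echo "MISMATCH $path (is $actual, expected $expected)"
        return 1
    fi
}

# check_ssh_dir [DIR]: check the directory and the assumed RSA key pair.
check_ssh_dir() {
    local dir=${1:-$HOME/.ssh} rc=0
    check_mode "$dir" 700 || rc=1
    check_mode "$dir/id_rsa" 600 || rc=1
    check_mode "$dir/id_rsa.pub" 644 || rc=1
    return $rc
}
```

Run `check_ssh_dir` with no arguments as each account in question, on the install node and on each target node. Any MISMATCH or MISSING line is worth a closer look.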

dewd commented 12 years ago

ok -- solved. ssh-agent not running.

for diagnostic purposes...

Perms on ~/.ssh for the installing user ($USER="admin") and root are 700, on ~/.ssh/id_rsa are 600, and on ~/.ssh/id_rsa.pub are 644. Password-less ssh definitely works for the installer user on all nodes.

running just the ssh-copy-id (in bash -x) I see that ssh-copy-id is examining the identity files in the installer user ~/.ssh. --> "ERROR: No identities found"

ergo, my setup of ssh-agent is incomplete

So, logged in as the installing user ("admin") and in the home directory for that account, running:

$ ssh-add -L          <--- no connection to ssh-agent
$ eval `ssh-agent`    <--- get a pid
$ ssh-add             <--- adds the identity for the installing user

et voila, the init command runs successfully to completion.

Learning: step #2 in Setting Up Your User Privileges (section 2.6, Installing a Multi-Node Hibari Cluster) works as advertised, as long as ssh-agent is primed just before step #2 in Installing Hibari (same chapter), or is set to start automatically on reboot or login. I had restarted a few times and did not have ssh-agent in either /etc/profile or ~/.bash_profile to ensure it was loaded.
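One possible way to prime the agent at login is sketched below, for example in ~/.bash_profile of the installing user. This is not taken from the Hibari docs. `ensure_ssh_agent` is a hypothetical name, and the sketch relies on the documented `ssh-add` exit statuses (2 when no agent can be contacted, 1 when the command fails, which for `ssh-add -l` includes an agent that holds no identities); it is worth confirming these on an OpenSSH as old as the one shipped with CentOS 5.7.

```shell
# Start an ssh-agent for this shell unless one is already reachable,
# then load the default identity if the agent has none.
ensure_ssh_agent() {
    # Exit status 2: ssh-add could not contact an agent.
    ssh-add -l >/dev/null 2>&1
    if [ $? -eq 2 ]; then
        # ssh-agent -s prints Bourne-shell commands that export
        # SSH_AUTH_SOCK and SSH_AGENT_PID; eval applies them here.
        eval "$(ssh-agent -s)" >/dev/null
    fi
    # Exit status 1: agent reachable but no identities loaded.
    ssh-add -l >/dev/null 2>&1
    if [ $? -eq 1 ]; then
        ssh-add
    fi
}
```

Calling `ensure_ssh_agent` before running the init command covers the same ground as the manual `ssh-add -L`, eval, `ssh-add` sequence above. Note that each new login shell without an inherited SSH_AUTH_SOCK will start its own agent, so long-lived machines may accumulate agent processes unless they are cleaned up at logout.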