zhaifg / mysql-master-ha

Automatically exported from code.google.com/p/mysql-master-ha

check_ssh is wrongly claiming ssh tests are failing (user == root) #31

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
> What steps will reproduce the problem?
1. Configure passwordless ssh access on all servers with user 'root'
2. Use virtual IP on current master as server1
3. Run masterha_check_ssh on manager host

> What is the expected output? What do you see instead?

All SSH checks work when run manually, so the script should confirm this. 
Instead I get the output below:

Wed Jul 18 08:50:38 2012 - [warning] Global configuration file 
/etc/masterha_default.cnf not found. Skipping.
Wed Jul 18 08:50:38 2012 - [info] Reading application default configurations 
from /etc/app1.cnf..
Wed Jul 18 08:50:38 2012 - [info] Reading server configurations from 
/etc/app1.cnf..
Wed Jul 18 08:50:38 2012 - [info] Starting SSH connection tests..
Wed Jul 18 08:50:38 2012 - [debug] 
Wed Jul 18 08:50:38 2012 - [debug]  Connecting via SSH from 
root@10.0.0.50(10.0.0.50:22) to root@10.0.0.14(10.0.0.14:22)..
Wed Jul 18 08:50:38 2012 - [debug]   ok.
Wed Jul 18 08:50:38 2012 - [debug]  Connecting via SSH from 
root@10.0.0.50(10.0.0.50:22) to root@10.0.0.12(10.0.0.12:22)..
Wed Jul 18 08:50:38 2012 - [debug]   ok.
Wed Jul 18 08:50:39 2012 - [debug] 
Wed Jul 18 08:50:38 2012 - [debug]  Connecting via SSH from 
root@10.0.0.14(10.0.0.14:22) to root@10.0.0.50(10.0.0.50:22)..
Wed Jul 18 08:50:38 2012 - [debug]   ok.
Wed Jul 18 08:50:38 2012 - [debug]  Connecting via SSH from 
root@10.0.0.14(10.0.0.14:22) to root@10.0.0.12(10.0.0.12:22)..
Wed Jul 18 08:50:39 2012 - [debug]   ok.
Wed Jul 18 08:50:39 2012 - [error][/usr/share/perl5/MHA/SSHCheck.pm, ln63] 
Wed Jul 18 08:50:39 2012 - [debug]  Connecting via SSH from 
root@10.0.0.12(10.0.0.12:22) to root@10.0.0.50(10.0.0.50:22)..
Permission denied (publickey,password).
Wed Jul 18 08:50:39 2012 - [error][/usr/share/perl5/MHA/SSHCheck.pm, ln107] SSH 
connection from root@10.0.0.12(10.0.0.12:22) to root@10.0.0.50(10.0.0.50:22) 
failed!
SSH Configuration Check Failed!
 at /usr/bin/masterha_check_ssh line 44

(The following is a manual test run immediately after the failed check.)

root@staging:~# ssh -b 10.0.0.12 -l root 10.0.0.50
Linux live1 2.6.32-5-amd64 #1 SMP Thu Mar 22 17:26:33 UTC 2012 x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
You have new mail.
Last login: Wed Jul 18 14:03:09 2012 from staging

> What version of the product are you using? On what operating system?
I am using masterha 0.53 on Debian 6.

> Please provide any additional information below.
I strace'd the run and looked through the logs. The failure seems to be related 
to the temp file created in $workdir for each check. However, the script also 
removes these temp files, so I cannot tell what in the log for this check makes 
the script report a failed connection attempt.
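
The manual test above was started from an interactive shell, whereas the check log shows each connection being made from one cluster host to another. A closer reproduction is a nested, non-interactive ssh call. The sketch below only prints such a command. The function name `nested_ssh_check_cmd` is made up for illustration, and the `-o` options are an assumption about what a scripted check would need, not a copy of what MHA actually passes.

```shell
# Print a nested ssh command: manager -> source host -> destination host.
# BatchMode=yes makes ssh fail instead of prompting, which is what a script sees.
# NOTE: the option set below is an assumption, not taken from MHA's source.
nested_ssh_check_cmd() {
    user=$1   # login user on both hosts
    src=$2    # host the inner ssh is started from
    dst=$3    # host the inner ssh connects to
    opts="-o BatchMode=yes -o PasswordAuthentication=no -o StrictHostKeyChecking=no"
    printf 'ssh %s %s@%s "ssh %s %s@%s exit 0"\n' \
        "$opts" "$user" "$src" "$opts" "$user" "$dst"
}

# Example: print the command for the pair that failed in the log above.
nested_ssh_check_cmd root 10.0.0.12 10.0.0.50
```

Running the printed command from the manager host should show whether the failing pair also fails without a terminal, an ssh-agent, or a password prompt to fall back on.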

Original issue reported on code.google.com by chuxuzo...@gmail.com on 18 Jul 2012 at 1:06

GoogleCodeExporter commented 9 years ago
I get the same problem, but for all hosts throughout my cluster. I'm using a 
non-root user. The manual checks from any host to any host succeed, but MHA 
checks fail:

--------------------------------------------------------------
Mon Aug 27 18:30:22 2012 - [debug]  Connecting via SSH from 
ubuntu@10.248.86.135(10.248.86.135:22) to 
ubuntu@10.248.109.158(10.248.109.158:22)..
Permission denied (publickey).
Mon Aug 27 18:30:23 2012 - [error][/usr/share/perl5/MHA/SSHCheck.pm, ln107] SSH 
connection from ubuntu@10.248.86.135(10.248.86.135:22) to 
ubuntu@10.248.109.158(10.248.109.158:22) failed!
Mon Aug 27 18:30:24 2012 - [error][/usr/share/perl5/MHA/SSHCheck.pm, ln63] 
Mon Aug 27 18:30:22 2012 - [debug]  Connecting via SSH from 
ubuntu@10.248.109.158(10.248.109.158:22) to 
ubuntu@10.248.86.135(10.248.86.135:22)..
Permission denied (publickey).
Mon Aug 27 18:30:24 2012 - [error][/usr/share/perl5/MHA/SSHCheck.pm, ln107] SSH 
connection from ubuntu@10.248.109.158(10.248.109.158:22) to 
ubuntu@10.248.86.135(10.248.86.135:22) failed!
Mon Aug 27 18:30:25 2012 - [error][/usr/share/perl5/MHA/SSHCheck.pm, ln63] 
--------------------------------------------------------------

And, for example:

--------------------------------------------------------------
ubuntu@ip-10-248-109-158:~$ ssh ubuntu@10.248.86.135
Welcome to Ubuntu 12.04 LTS (GNU/Linux 3.2.0-25-virtual x86_64)
...(other successful login text snipped)...
--------------------------------------------------------------

I'm perplexed. I'll be digging in as well, and if I solve it, I'll post back 
here.

Original comment by timeless...@gmail.com on 27 Aug 2012 at 6:53

GoogleCodeExporter commented 9 years ago
Timeless...,

I think it is safe to assume that MHA is not ready for open source users. 
The author would probably prefer that you approach his commercial company and 
purchase support to help with this.

I moved on, and suggest you do too. Percona XtraDB Cluster (with Galera) is 
a very neat solution that is still open source, with commercial support 
available. Search for it and move to that. It offers master-slave if that is 
what you want, but it provides a better solution through clusters of n nodes.

Good luck.

Original comment by chuxuzo...@gmail.com on 27 Aug 2012 at 9:56

GoogleCodeExporter commented 9 years ago
Does the error occur every time, or do the checks sometimes succeed and 
sometimes fail? If they sometimes succeed, would you please try increasing the 
MaxStartups parameter in sshd_config on all hosts, restarting sshd, and trying again?
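
MaxStartups limits how many unauthenticated connections sshd accepts at the same time, and the timestamps in the logs above suggest the check opens several connections within about a second. A small sketch for seeing whether the parameter is set explicitly on a host follows. The function name `show_max_startups` is hypothetical, and the default path is the usual location of the file, which may differ on your system.

```shell
# Print any active (uncommented) MaxStartups line from an sshd_config file.
# If none is present, say so; sshd then uses its built-in default.
show_max_startups() {
    config_file=${1:-/etc/ssh/sshd_config}   # assumed default location
    grep -Ei '^[[:space:]]*MaxStartups[[:space:]]' "$config_file" \
        || echo "MaxStartups not set in $config_file"
}
```

Call `show_max_startups` on each host. After raising the value, `sshd -t` can be used to check the file for syntax errors before sshd is restarted.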

Original comment by Yoshinor...@gmail.com on 27 Aug 2012 at 10:27

GoogleCodeExporter commented 9 years ago
This seems to be exactly the same issue as the one I posted about here: 
http://code.google.com/p/mysql-master-ha/issues/detail?id=13#c9 - I needed to 
have my SSH key named "id_dsa", and then it worked.
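
The comment does not say why the file name mattered. One plausible reading is that, without an ssh-agent, an IdentityFile setting, or -i, ssh only tries keys stored under its default file names. The sketch below lists which default-named private keys exist in a directory. The function name `list_default_keys` is hypothetical, and the list of names is an assumption, since the defaults vary between OpenSSH versions.

```shell
# List the default-named private keys found in an ssh directory.
# NOTE: the set of default names depends on the OpenSSH version; this list
# is an assumption covering the common ones.
list_default_keys() {
    ssh_dir=${1:-$HOME/.ssh}
    for name in id_rsa id_dsa id_ecdsa id_ed25519; do
        [ -f "$ssh_dir/$name" ] && echo "$name"
    done
    return 0   # finding no keys is not treated as an error
}
```

If this prints nothing for the user that MHA logs in as, the key in use probably has a custom name that a non-interactive ssh call will not pick up on its own.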

Original comment by timeless...@gmail.com on 28 Aug 2012 at 12:17

GoogleCodeExporter commented 9 years ago
Hello,

I had the same error.

The solution:
each server must have its own public key in its own authorized_keys file.
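
A minimal sketch of that fix, to be run on each server, is shown below. The function name `add_own_key` is made up for illustration, and the key file name in the usage note is an assumption. The function appends the public key only if that exact line is not already present, so it is safe to run more than once.

```shell
# Append a public key to an authorized_keys file unless it is already listed.
# Both paths are parameters so the function can be tried on scratch files first.
add_own_key() {
    pub_key_file=$1     # e.g. "$HOME/.ssh/id_rsa.pub" (assumed key name)
    auth_keys_file=$2   # e.g. "$HOME/.ssh/authorized_keys"

    [ -r "$pub_key_file" ] || { echo "missing $pub_key_file" >&2; return 1; }

    # Make sure the file exists and is readable and writable by the owner only.
    touch "$auth_keys_file"
    chmod 600 "$auth_keys_file"

    # -x: whole-line match, -F: fixed strings, -f: read patterns from the key file.
    if ! grep -qxFf "$pub_key_file" "$auth_keys_file"; then
        cat "$pub_key_file" >> "$auth_keys_file"
    fi
}
```

For example: `add_own_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"`. This may matter in the original report in particular, because a virtual IP on the current master can lead to a host connecting to itself.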

Original comment by christop...@gmail.com on 30 Apr 2013 at 9:15