brianshumate / ansible-jepsen

:telephone_receiver: Ansible role for Jepsen
BSD 2-Clause "Simplified" License

Auth fail #1

Closed brianshumate closed 9 years ago

brianshumate commented 9 years ago

Despite adding host keys to known_hosts for the console node, setting the root user's password to root, installing the SSH public key in the root user's authorized_keys on all test nodes, adding the SSH config, and ensuring the node naming is correct per aphyr's lxc.md, a persistent error remains:

lein test
...
lein test mongodb.core-test
SLF4J: The following loggers will not work becasue they were created
SLF4J: during the default configuration phase of the underlying logging system.
SLF4J: See also http://www.slf4j.org/codes.html#substituteLogger
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh

lein test :only mongodb.core-test/document-cas-majority-test

ERROR in (document-cas-majority-test) (Session.java:512)
Uncaught exception, not in assertion.
expected: nil
  actual: com.jcraft.jsch.JSchException: Auth fail
 at com.jcraft.jsch.Session.connect (Session.java:512)
    com.jcraft.jsch.Session.connect (Session.java:183)
    clj_ssh.ssh$connect.invoke (ssh.clj:327)
    jepsen.control$session.invoke (control.clj:186)
    clojure.lang.AFn.applyToHelper (AFn.java:154)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.core$apply.invoke (core.clj:624)
    jepsen.core$fcatch$wrapper__2857.doInvoke (core.clj:39)
    clojure.lang.RestFn.invoke (RestFn.java:408)
    clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6466)
    clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:266)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    java.lang.Thread.run (Thread.java:745)

Ran 1 tests containing 1 assertions.
0 failures, 1 errors.
Tests failed.

brianshumate commented 9 years ago

For some reason, jsch is not using the keys and is falling back to the root user's password. From /var/log/auth.log on n1:

May  1 14:32:11 debian-800-jessie sshd[1646]: pam_unix(sshd:session): session opened for user root by (uid=0)
May  1 14:34:32 debian-800-jessie sshd[1656]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.168.122.10  user=root
May  1 14:34:34 debian-800-jessie sshd[1656]: Failed password for root from 192.168.122.10 port 45919 ssh2
May  1 14:34:34 debian-800-jessie sshd[1656]: error: Received disconnect from 192.168.122.10: 3: com.jcraft.jsch.JSchException: Auth fail [preauth]
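
One way to confirm that the client fell back to password authentication is to count the failed password attempts in the log. This is a minimal sketch, assuming the default Debian log path; the function name failed_password_attempts is hypothetical and not part of this role:

```shell
#!/bin/sh
# Sketch: count "Failed password" entries for a user in an sshd auth log.
# The function name and the default arguments are illustrative assumptions.
failed_password_attempts() {
    log="${1:-/var/log/auth.log}"   # log file to inspect
    user="${2:-root}"               # account the client tried to log in as
    # grep -c prints the number of matching lines (0 when there are none).
    grep -c "Failed password for $user from" "$log"
}
```

Running `failed_password_attempts /var/log/auth.log root` on a test node right after a failed `lein test` run should show whether sshd saw a password attempt rather than a public key attempt.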

Setting:

PermitRootLogin yes

in /etc/ssh/sshd_config on the test nodes solves this issue, but it might not be the best idea security-wise, since it allows password logins for root.

ZeroGraviti commented 9 years ago

/etc/sshd_config doesn't exist in any of the created n1..5 lxc containers. Did you create it manually, or did I miss something? If you can provide the steps in order for any prep needed before this sshd config fix, it would help. I am still not able to run a test (e.g., the aerospike one) successfully. Same old jsch auth fail error.

brianshumate commented 9 years ago

The file is actually /etc/ssh/sshd_config.

Change:

PermitRootLogin without-password

to

PermitRootLogin yes
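
If no editor is available on the nodes, the change can also be made with sed. This is a rough sketch only; the function name permit_root_login is hypothetical, and it assumes the file already contains an uncommented PermitRootLogin line:

```shell
#!/bin/sh
# Sketch: set PermitRootLogin to "yes" in an sshd_config-style file.
# The function name and the default path argument are illustrative assumptions.
permit_root_login() {
    config="${1:-/etc/ssh/sshd_config}"
    # Rewrite any uncommented PermitRootLogin line in place.
    # -i.bak leaves a backup copy next to the file as <file>.bak.
    sed -i.bak 's/^[[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin yes/' "$config"
}
```

Run `permit_root_login /etc/ssh/sshd_config` as root on each test node, then restart sshd (on Debian, `service ssh restart`) so the new setting takes effect.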

ZeroGraviti commented 9 years ago

Ah, ok. I missed that. How did you edit this file, since "vi" is not installed? With something like sed? Also, what does this setting do? I mean, do we also have to create a password for the root user on each of the nodes n1..5?

brianshumate commented 9 years ago

My jepsen_init.yml playbook sets the root user password.

If you're using this project with Vagrant, you can do the following to change it by hand:

vagrant ssh n0
sudo su -
passwd root
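
For a non-interactive variant, the password can be piped to chpasswd, which reads user:password lines on stdin. This is only a sketch; the helper name chpasswd_line is hypothetical, and the password root is the value mentioned earlier in this issue:

```shell
#!/bin/sh
# Sketch: build the "user:password" line that chpasswd expects on stdin.
# The helper name is an illustrative assumption, not part of this role.
chpasswd_line() {
    printf '%s:%s\n' "$1" "$2"
}

# Usage, as root on a test node (commented out so nothing changes by accident):
# chpasswd_line root root | chpasswd
```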

ZeroGraviti commented 9 years ago

On the main console node, I am currently stuck at this error: stat: cannot stat '/var/cache/apt/pkgcache.bin': No such file or directory