tobami / littlechef

Cook with Chef without a Chef Server
Apache License 2.0
472 stars 71 forks source link

Problems running #196

Closed adamdunkley closed 10 years ago

adamdunkley commented 10 years ago

I've been trying to wrack my brains over this one for a while, it's only just started happening. It seems fine when doing every other action except for synchronising the cookbooks etc.

Adams-MacBook-Pro-2:chef adam$ fix node:[REDACTED HOST]

== Configuring [REDACTED HOST] ==
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/main.py", line 743, in main
    *args, **kwargs
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/tasks.py", line 405, in execute
    results['<local-only>'] = task.run(*args, **new_kwargs)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/tasks.py", line 171, in run
    return self.wrapped(*args, **kwargs)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/littlechef/runner.py", line 131, in node
    execute(_node_runner)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/tasks.py", line 368, in execute
    multiprocessing
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/tasks.py", line 264, in _execute
    return task.run(*args, **kwargs)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/tasks.py", line 171, in run
    return self.wrapped(*args, **kwargs)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/littlechef/runner.py", line 156, in _node_runner
    chef.sync_node(node)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/littlechef/chef.py", line 82, in sync_node
    solo.configure(current_node)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/littlechef/solo.py", line 104, in configure
    mode=0400)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/contrib/files.py", line 148, in upload_template
    mode=mode
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/network.py", line 639, in host_prompting_wrapper
    return func(*args, **kwargs)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/operations.py", line 346, in put
    ftp = SFTP(env.host_string)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/fabric/sftp.py", line 30, in __init__
    self.ftp = connections[host_string].open_sftp()
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/paramiko/client.py", line 345, in open_sftp
    return self._transport.open_sftp_client()
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/paramiko/transport.py", line 711, in open_sftp_client
    return SFTPClient.from_transport(self)
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/paramiko/sftp_client.py", line 101, in from_transport
    chan.invoke_subsystem('sftp')
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/paramiko/channel.py", line 235, in invoke_subsystem
    self._wait_for_event()
  File "/usr/local/Cellar/python/2.7.6/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/paramiko/channel.py", line 1085, in _wait_for_event
    raise e
paramiko.ssh_exception.SSHException: Channel closed.
Disconnecting from [REDACTED HOST]... done.
Adams-MacBook-Pro-2:chef adam$ fix --version
LittleChef 1.6.1

Tried it with both a private key and a password that should both be right and the user has sudo rights.

Anything else I can provide for the problem? (host is Ubuntu 12.04)

tobami commented 10 years ago

.open_sftp() looks like a file sync problem caused by not having permissions, read-only file system or similar things.

What are the fabric and paramiko versions?

Something else you could try is fix node:[REDACTED HOST] ssh:"cat /etc/issue" If it works, it means login works, and it really is transferring/saving files that is failing.

adamdunkley commented 10 years ago

Seems to go ok:

Adams-MacBook-Pro-2:chef adam$ fix node:[REDACTED HOST] ssh:"cat /etc/issue"

Executing the command 'cat /etc/issue' on node [REDACTED HOST]...
Ubuntu 12.04.4 LTS \n \l

Done.
Disconnecting from [REDACTED HOST]... done.

As for the fabric/paramiko versions:

Adams-MacBook-Pro-2:chef adam$ pip list
argparse (1.1)
distribute (0.6.36)
Django (1.5.5)
ecdsa (0.11)
Fabric (1.8.3)
git-remote-helpers (0.1.0)
littlechef (1.6.1)
paramiko (1.12.3)
pip (1.5.4)
pycrypto (2.6.1)
setuptools (0.6c11)
simplejson (3.3.0)
vboxapi (1.0)
wsgiref (0.1.2)

So, permissions issues. I'll look at some env stuff with the ssh command …

adamdunkley commented 10 years ago

Hmm, sudo definitely seems to be working so I can't imagine what it might be unless the sudo stuff within littlechef isn't actually happening:

Adams-MacBook-Pro-2:chef adam$ fix node:[REDACTED HOST] ssh:"touch /test-34567876543"

Executing the command 'touch /test-34567876543' on node [REDACTED HOST]...
touch: cannot touch `/test-34567876543': Permission denied

Done.
Disconnecting from [REDACTED HOST]... done.
Adams-MacBook-Pro-2:chef adam$ fix node:[REDACTED HOST] ssh:"sudo touch /test-34567876543"

Executing the command 'sudo touch /test-34567876543' on node [REDACTED HOST]...

Done.
Disconnecting from [REDACTED HOST]... done.
tobami commented 10 years ago

And with sudo? fix node:[REDACTED HOST] ssh:"sudo cat /etc/issue"

Looking closer at the stack trace, it happens when updating /etc/chef/solo.rb. Can you have a look at the directory and its permissions? In any case we can add better error handling there: https://github.com/tobami/littlechef/blob/1.6.1/littlechef/solo.py#L105

I'll mark this as a bug.

adamdunkley commented 10 years ago
adam@[REDACTED HOST] [11:45:28] [~]
-> % ll /etc/chef
total 8.0K
-rw-r----- 1 root root   0 Mar 21 14:53 client.rb
-rw--w---- 1 root root 135 Mar 21 15:51 node.json
-r-------- 1 root root 234 Mar 21 15:51 solo.rb

That'll be it. Thanks a lot for helping me debug this, would you like me to raise a separate issue about the error handling?

adamdunkley commented 10 years ago

Ok, so I am no longer convinced this is permission related. I am still getting the problem with sensible permissions in that folder (and the parent directory). Also the error coming back from paramiko doesn't really marry with a permission error. Surely it would have more sensible error messages than "Channel closed." if it was some error with putting the information. It also doesn't really look like it gets very far in to the transport …

adamdunkley commented 10 years ago

Turns out the paramiko error was actually quite accurate:

Apr  5 12:44:04 [REDACTED HOST] sshd[6583]: Accepted publickey for adam from [REDACTED IP] port 53366 ssh2
Apr  5 12:44:04 [REDACTED HOST] sshd[6583]: pam_unix(sshd:session): session opened for user adam by (uid=0)
Apr  5 12:44:05 [REDACTED HOST] sudo:     adam : TTY=pts/1 ; PWD=/home/adam ; USER=root ; COMMAND=/bin/bash -l -c chown -R adam /tmp/chef-solo
Apr  5 12:44:05 [REDACTED HOST] sudo: pam_unix(sudo:session): session opened for user root by adam(uid=1000)
Apr  5 12:44:05 [REDACTED HOST] sudo: pam_unix(sudo:session): session closed for user root
Apr  5 12:44:05 [REDACTED HOST] sshd[6743]: subsystem request for sftp by user adam
Apr  5 12:44:05 [REDACTED HOST] sshd[6743]: subsystem request for sftp failed, subsystem not found
Apr  5 12:44:05 [REDACTED HOST] sshd[6583]: pam_unix(sshd:session): session closed for user adam

I can definitely fix this now I know what it is … sorry to have taken up your time!

adamdunkley commented 10 years ago

I am not the only person who has had this problem: https://gist.github.com/bradmontgomery/3954511

tobami commented 10 years ago

So how did you solve it in the end? And what do you think we can do on LittleChef's side so that users see a more meaningful message that helps them know what kind of error it is?

adamdunkley commented 10 years ago

So you need to have a SFTP subsystem enabled which, annoyingly, the opscode openssh doesn't enable by default, so you need to add the following attribute (or update it manually if you've not yet cooked and there's no subsystem enabled):

node.default['openssh']['server']['subsystem'] = 'sftp /usr/lib/openssh/sftp-server';

Which only solves it for ubuntu.

I am not sure how this can be made clearer what is happening in littlechef because the problem is quite a low level SSH problem that really is quite unrelated to littlechef. I think paramiko could do a better job at describing the circumstance that seems to be going wrong, but it could actually be that it cannot get access to why exactly its connection drops. Maybe just having this issue and it being searchable will be enough that if someone gets paramiko.ssh_exception.SSHException: Channel closed. they'll find their way here?

adamdunkley commented 10 years ago

Also, just want to give some props to @bradmontgomery. Wouldn't have solved it as quickly as I did without his gist being on google. :+1:

tobami commented 10 years ago

Great, thanks. I created an issue to implement the improved error handling.

bradmontgomery commented 10 years ago

:thumbsup: Happy that gist helped! :smile: