brightbox / libcloud

Mirror of Apache libcloud (incubating)
Apache License 2.0

support the deploy_node method #1

Open johnl opened 13 years ago

johnl commented 13 years ago

http://ci.apache.org/projects/libcloud/apidocs/libcloud.base.NodeDriver.html#deploy_node

Looks like this only needs overriding if we have a custom password interface, which we don't, so it should really just work already. But @ogrisel reported a problem with it on Twitter, so it needs investigating.

ogrisel commented 13 years ago

The exact error message is:

Traceback (most recent call last):
  File "tools/demo_ipython_cloud.py", line 24, in <module>
    node = conn.deploy_node(name='test', image=images[0], size=sizes[0], deploy=msd)
  File "/home/ogrisel/coding/libcloud/libcloud/compute/base.py", line 536, in deploy_node
    'deploy_node not implemented for this driver')
NotImplementedError: deploy_node not implemented for this driver

It seems that the deploy_node method inherited from the base class expects the authentication method for the nodes to be declared in a features dict for the create_node operation. By default it is empty. libcloud can use either a fixed account password, a password generated at create_node time, or an ssh public key.
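
For illustration, the check described above can be sketched roughly as follows. This is a simplified stand-in and not libcloud's actual code: the class names, the helper function, and the exact feature strings ("generates_password", "password", "ssh_key") are assumptions chosen to match the three authentication methods listed above.

```python
# Simplified illustration of the feature check; NOT libcloud's real code.
# All class and function names here are hypothetical.

# Assumed feature strings for the three authentication methods.
SUPPORTED_AUTH = ("generates_password", "password", "ssh_key")


class BaseDriver:
    # Authentication methods declared for create_node; empty by default.
    features = {"create_node": []}


class KeyAwareDriver(BaseDriver):
    # A driver that declares ssh public key support.
    features = {"create_node": ["ssh_key"]}


def check_deploy_supported(driver):
    """Return the declared auth methods, or raise if none is declared."""
    declared = driver.features.get("create_node", [])
    usable = [method for method in SUPPORTED_AUTH if method in declared]
    if not usable:
        raise NotImplementedError("deploy_node not implemented for this driver")
    return usable
```

With an empty features list the sketch raises the same NotImplementedError as in the traceback, which is consistent with a driver that never declared how its nodes authenticate.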

From scouting the brightbox docs, it seems that ssh key based authorization is supported for the first connection to the vm (which is good IMHO), but I can't find any reference to it in the BrightboxDriver, nor any place in the brightbox web management interface where the ssh public key is supposed to be provided to the brightbox infrastructure.

johnl commented 13 years ago

Ah ic. It's provided via the user model - the user sets their ssh key and then the servers they create from then on get the ssh key. Not sure if we're supporting that in libcloud yet - I'll get our libcloud dev to look into it asap (there are some other API updates pending too).

ogrisel commented 12 years ago

Any update on this issue?

NeilW commented 12 years ago

Logged this upstream, along with the fix, at https://issues.apache.org/jira/browse/LIBCLOUD-179

ogrisel commented 12 years ago

Thanks very much. I recently started to work on an experimental parallel machine learning toolbox and this fix will come in very handy to run on brightbox rather than EC2 and Rackspace cloud.

ogrisel commented 12 years ago

I tried the patch and I get a timeout while trying to deploy stuff. I basically use the same script as:

https://libcloud.apache.org/getting-started.html#example-bootstrapping-puppet-on-a-node

but with BRIGHTBOX instead of RACKSPACE (I checked that it works with RACKSPACE). Would the lack of public_ips in a newly deployed brightbox node be the cause of this issue?

NeilW commented 12 years ago

Looks like deploy_node has a pretty basic implementation that rips out the first public ip.

No apparent differentiation between v6 and v4 either. Nice.

Our auto allocated public ip address is an IPv6 address.

Looks like the deploy node will have to be improved as well.

ogrisel commented 12 years ago

I guess the v6 vs v4 differentiation can be done without changing the data model using a micro-parsing utility function when necessary. Not dropping the public IP from the node metadata would be nice though :) It seems that the same issue occurs with create_node. I don't think it's specific to deploy_node.
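
A minimal sketch of such a micro-parsing helper, assuming the node exposes its public addresses as a plain list of strings. The function names are hypothetical, and the ipaddress module used here is in the standard library only from Python 3.3 onwards.

```python
import ipaddress


def split_by_ip_version(addresses):
    """Split a list of address strings into (ipv4_list, ipv6_list)."""
    v4, v6 = [], []
    for addr in addresses:
        # ip_address() raises ValueError for strings that are not IP addresses.
        parsed = ipaddress.ip_address(addr)
        if parsed.version == 4:
            v4.append(addr)
        else:
            v6.append(addr)
    return v4, v6


def first_ipv4(addresses):
    """Return the first IPv4 address in the list, or None if there is none."""
    v4, _ = split_by_ip_version(addresses)
    return v4[0] if v4 else None
```

A caller that can only reach IPv4 hosts could use first_ipv4(...) instead of blindly taking element 0 of the list, and fall back to the IPv6 entries when it returns None.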

NeilW commented 12 years ago

I've brought this repo up to date with the upstream master and created a new branch 'full-node-info' which returns all the details supplied by the Brightbox api.

The public ip list now contains the automatically allocated IPv6 address - after any IPv4 cloud ips.

In theory with this branch deploy should now work - as long as you have IPv6.

NeilW commented 12 years ago

deploy_node requires a key without a password on it. deploy_node won't work with password protected keys or ssh-agent.

If you try it with one, the error messages you get back are misleading...
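
One way to catch this before calling deploy_node is to look at the key file's PEM headers. The sketch below is only a heuristic with a hypothetical function name: it recognises the traditional OpenSSL PEM header and the PKCS#8 armor line, and it cannot detect a passphrase on keys in the newer "BEGIN OPENSSH PRIVATE KEY" format, where that information sits inside the encoded payload.

```python
def pem_key_looks_encrypted(pem_text):
    """Heuristic: True if the PEM text is marked as passphrase protected."""
    # Traditional OpenSSL PEM keys carry a "Proc-Type: 4,ENCRYPTED" header.
    if "Proc-Type: 4,ENCRYPTED" in pem_text:
        return True
    # PKCS#8 encrypted keys use a dedicated armor line instead.
    return "-----BEGIN ENCRYPTED PRIVATE KEY-----" in pem_text
```

Reading the key file's text and passing it to this function gives an early warning, which is easier to act on than the paramiko error.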

ogrisel commented 12 years ago

Indeed, when running your branch I get the error below, even though I explicitly passed the ssh_key argument to deploy_node:

DeploymentError: <DeploymentError: node=srv-vv7n7, error=not a valid DSA private key file>

It works with the RACKSPACE provider though. Or maybe it is silently using the password based authentication instead of the provided ssh_key argument?

NeilW commented 12 years ago

Yeah.

That error message is wrong and led me right up the garden path. There's a fault in paramiko that gives the wrong error message out if your key is an RSA key and has a password on it.

Rackspace is an OpenStack derivative with the 'generates password' feature. According to the logic of the deploy_node implementation, that silently adds a password alongside the key settings in the deploy_node command.

ogrisel commented 12 years ago

Indeed... That's what I saw afterwards by reading the code of the deploy_node method. For the record and the googlability of this issue, here is the stacktrace of paramiko's failure:

In [14]: c = SSHClient(node.public_ip[0], key="/home/ogrisel/.ssh/id_rsa.pub")

In [15]: c.connect()
---------------------------------------------------------------------------
SSHException                              Traceback (most recent call last)
/home/ogrisel/coding/pyrallel/<ipython-input-15-fe870a4d911f> in <module>()
----> 1 c.connect()

/home/ogrisel/coding/libcloud/libcloud/compute/ssh.pyc in connect(self)
    145             conninfo['timeout'] = self.timeout
    146 
--> 147         self.client.connect(**conninfo)
    148         return True
    149 

/usr/lib/python2.7/dist-packages/paramiko/client.pyc in connect(self, hostname, port, username, password, pkey, key_filename, timeout, allow_agent, look_for_keys, compress)
    330         else:
    331             key_filenames = key_filename
--> 332         self._auth(username, password, pkey, key_filenames, allow_agent, look_for_keys)
    333 
    334     def close(self):

/usr/lib/python2.7/dist-packages/paramiko/client.pyc in _auth(self, username, password, pkey, key_filenames, allow_agent, look_for_keys)
    491         # if we got an auth-failed exception earlier, re-raise it
    492         if saved_exception is not None:
--> 493             raise saved_exception
    494         raise SSHException('No authentication methods available')
    495 

NeilW commented 12 years ago

I've pushed a branch 'ssh-agent' which activates the paramiko agent and key searching facilities.

Script deployment assumes /root for the scripts, so you need to be explicit about where they are written.

I used:

script = ScriptDeployment(script="sudo apt-get -y install puppet", name="./test_deploy.sh")

and it all appeared to work with this branch.
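
As a rough sketch of what activating the paramiko agent and key searching amounts to, the helper below builds the keyword arguments that get passed to paramiko's connect(). The function name and its defaults are hypothetical and this is not the code from the branch. The allow_agent and look_for_keys names come from the paramiko connect() signature shown in the traceback above.

```python
def build_conninfo(hostname, port=22, username="root", password=None,
                   key_filename=None, timeout=None, use_agent=True):
    """Build keyword arguments for paramiko's SSHClient.connect()."""
    conninfo = {
        "hostname": hostname,
        "port": port,
        "username": username,
        # Let paramiko try the running ssh-agent and the default key files.
        "allow_agent": use_agent,
        "look_for_keys": use_agent,
    }
    # Only include optional settings that were actually provided.
    if password is not None:
        conninfo["password"] = password
    if key_filename is not None:
        conninfo["key_filename"] = key_filename
    if timeout is not None:
        conninfo["timeout"] = timeout
    return conninfo
```

The resulting dict would be used as client.connect(**conninfo), the same call shape as in the traceback.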

ogrisel commented 12 years ago

I was making exactly the same changes in my local libcloud sandbox :) I really wonder why ssh agent support is disabled by default in libcloud while it's enabled by default in paramiko itself.

NeilW commented 12 years ago

Asked that precise question here: https://issues.apache.org/jira/browse/LIBCLOUD-182

ogrisel commented 12 years ago

Thanks very much for your help by the way. It seems to work for me with your branch.

NeilW commented 12 years ago

Glad to be of assistance.

I'll try to get this all upstream now.