jtriley / StarCluster

StarCluster is an open source cluster-computing toolkit for Amazon's Elastic Compute Cloud (EC2).
http://star.mit.edu/cluster
GNU Lesser General Public License v3.0

Error when starting first cluster #592

Closed apratim88 closed 7 years ago

apratim88 commented 7 years ago

Hi! I am new to StarCluster. I was following the online instructions, but I get this error when trying to launch my first cluster:

AttributeError: 'EntryPoint' object has no attribute 'resolve'

Please help!

Thanks ! :-)

vasisht commented 7 years ago

See https://github.com/jtriley/StarCluster/issues/590

xiongjie494 commented 7 years ago

I am a new user too. I used a more involved way to upgrade setuptools to 26.1.1 on Ubuntu. Maybe it can serve as a reference for other Linux distributions.

  1. Download the setuptools source code from https://github.com/pypa/setuptools and unzip it.
  2. Install the necessary packages: sudo apt-get install python-dev libffi-dev libssl-dev
  3. Install setuptools. cd into the setuptools source directory, then run:
     $ python bootstrap.py
     $ sudo python setup.py install

apratim88 commented 7 years ago

@xiongjie494

Hi

I did exactly what you said. But I still get the following error:

StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

Using default cluster template: smallcluster
Validating cluster template settings...
Cluster template settings are valid
Starting cluster...
Launching a 2-node cluster...
Creating security group @sc-cluster1...
Creating placement group @sc-cluster1...
Reservation:r-077a9a68a7003574f
Waiting for instances to propagate...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
Waiting for cluster to come up... (updating every 30s)
Waiting for all nodes to be in a 'running' state...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
Waiting for SSH to come up on all nodes...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
!!! ERROR - error occurred in job (id=master): 'EntryPoint' object has no attribute 'resolve'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/threadpool.py", line 48, in run
    job.run()
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/threadpool.py", line 75, in run
    r = self.method(*self.args, **self.kwargs)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/cluster.py", line 1429, in
    self.pool.map(lambda n: n.wait(interval=self.refresh_interval), nodes,
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/node.py", line 1019, in wait
    while not self.is_up():
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/node.py", line 1025, in is_up
    if not self.is_ssh_up():
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/node.py", line 1010, in is_ssh_up
    return self.ssh.transport is not None
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/node.py", line 1070, in ssh
    private_key=self.key_location)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/sshutils.py", line 78, in __init__
    self._pkey = self.load_private_key(private_key, private_key_pass)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/sshutils.py", line 95, in load_private_key
    pkey = self._load_rsa_key(private_key, private_key_pass)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/sshutils.py", line 184, in _load_rsa_key
    passphrase=private_key_pass)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/sshutils.py", line 816, in get_rsa_key
    password=passphrase)
  File "/usr/local/lib/python2.7/dist-packages/paramiko-2.0.2-py2.7.egg/paramiko/pkey.py", line 217, in from_private_key
    key = cls(file_obj=file_obj, password=password)
  File "/usr/local/lib/python2.7/dist-packages/paramiko-2.0.2-py2.7.egg/paramiko/rsakey.py", line 42, in __init__
    self._from_private_key(file_obj, password)
  File "/usr/local/lib/python2.7/dist-packages/paramiko-2.0.2-py2.7.egg/paramiko/rsakey.py", line 168, in _from_private_key
    self._decode_key(data)
  File "/usr/local/lib/python2.7/dist-packages/paramiko-2.0.2-py2.7.egg/paramiko/rsakey.py", line 173, in _decode_key
    data, password=None, backend=default_backend()
  File "/usr/local/lib/python2.7/dist-packages/cryptography-1.5-py2.7-linux-x86_64.egg/cryptography/hazmat/backends/__init__.py", line 35, in default_backend
    _default_backend = MultiBackend(_available_backends())
  File "/usr/local/lib/python2.7/dist-packages/cryptography-1.5-py2.7-linux-x86_64.egg/cryptography/hazmat/backends/__init__.py", line 22, in _available_backends
    "cryptography.backends"
AttributeError: 'EntryPoint' object has no attribute 'resolve'

error occurred in job (id=node001): 'EntryPoint' object has no attribute 'resolve'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/threadpool.py", line 48, in run
    job.run()
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/threadpool.py", line 75, in run
    r = self.method(*self.args, **self.kwargs)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/cluster.py", line 1429, in
    self.pool.map(lambda n: n.wait(interval=self.refresh_interval), nodes,
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/node.py", line 1019, in wait
    while not self.is_up():
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/node.py", line 1025, in is_up
    if not self.is_ssh_up():
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/node.py", line 1010, in is_ssh_up
    return self.ssh.transport is not None
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/node.py", line 1070, in ssh
    private_key=self.key_location)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/sshutils.py", line 78, in __init__
    self._pkey = self.load_private_key(private_key, private_key_pass)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/sshutils.py", line 95, in load_private_key
    pkey = self._load_rsa_key(private_key, private_key_pass)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/sshutils.py", line 184, in _load_rsa_key
    passphrase=private_key_pass)
  File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.95.6-py2.7.egg/starcluster/sshutils.py", line 816, in get_rsa_key
    password=passphrase)
  File "/usr/local/lib/python2.7/dist-packages/paramiko-2.0.2-py2.7.egg/paramiko/pkey.py", line 217, in from_private_key
    key = cls(file_obj=file_obj, password=password)
  File "/usr/local/lib/python2.7/dist-packages/paramiko-2.0.2-py2.7.egg/paramiko/rsakey.py", line 42, in __init__
    self._from_private_key(file_obj, password)
  File "/usr/local/lib/python2.7/dist-packages/paramiko-2.0.2-py2.7.egg/paramiko/rsakey.py", line 168, in _from_private_key
    self._decode_key(data)
  File "/usr/local/lib/python2.7/dist-packages/paramiko-2.0.2-py2.7.egg/paramiko/rsakey.py", line 173, in _decode_key
    data, password=None, backend=default_backend()
  File "/usr/local/lib/python2.7/dist-packages/cryptography-1.5-py2.7-linux-x86_64.egg/cryptography/hazmat/backends/__init__.py", line 35, in default_backend
    _default_backend = MultiBackend(_available_backends())
  File "/usr/local/lib/python2.7/dist-packages/cryptography-1.5-py2.7-linux-x86_64.egg/cryptography/hazmat/backends/__init__.py", line 22, in _available_backends
    "cryptography.backends"
AttributeError: 'EntryPoint' object has no attribute 'resolve'

!!! ERROR - Oops! Looks like you've found a bug in StarCluster


But when I run starcluster listclusters, it shows that cluster1 is running.

starcluster listclusters
StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

cluster1 (security group: @sc-cluster1)

Launch time: 2016-09-18 16:16:17
Uptime: 0 days, 00:00:58
VPC: vpc-87271ee3
Subnet: subnet-066f4262
Zone: us-west-2b
Keypair: *****
EBS volumes: N/A
Cluster nodes:
    master running i-06f5eb7abad383f64 ec2-54-244-202-14.us-west-2.compute.amazonaws.com
    node001 running i-049cb14711093834a ec2-54-149-173-27.us-west-2.compute.amazonaws.com
Total nodes: 2


What does this mean?

Any help is highly appreciated. Thanks in advance!

xiongjie494 commented 7 years ago

I guess you have hit the same problem I ran into before: the installed setuptools is not the newest version. You can upgrade it with the command 'sudo pip install -U setuptools', as described by vasisht in #590.

xiongjie494 commented 7 years ago

After I upgraded setuptools to 26.1.1, the problem was fixed.
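For anyone wanting to confirm the diagnosis before relaunching, here is a minimal sketch that reports the installed setuptools version and whether pkg_resources entry points offer the `resolve` method named in the traceback. The function names are my own and purely illustrative; this is not part of StarCluster, and it assumes the check is run with the same Python interpreter that runs starcluster.

```python
def installed_setuptools_version():
    """Return the setuptools version string, or None if it cannot be determined."""
    try:
        import setuptools
    except ImportError:
        return None
    return getattr(setuptools, "__version__", None)


def entry_point_supports_resolve():
    """Return True/False for EntryPoint.resolve, or None if pkg_resources is missing."""
    try:
        import pkg_resources
    except ImportError:
        return None
    # The AttributeError in this thread is raised when this attribute is absent.
    return hasattr(pkg_resources.EntryPoint, "resolve")


if __name__ == "__main__":
    print("setuptools version: %s" % installed_setuptools_version())
    print("EntryPoint.resolve available: %s" % entry_point_supports_resolve())
```

If the second line prints False, the setuptools (or leftover distribute) on that interpreter is likely too old, and the upgrade described above is worth trying.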

vasisht commented 7 years ago

You may have an older version of distribute/setuptools installed. Can you try uninstalling those first?

pip uninstall distribute

apratim88 commented 7 years ago

I did exactly what you said. First I uninstalled the older version of distribute/setuptools. Then I installed the 26.1.1 version. Then I launched a new cluster. All errors were gone. The cluster launched successfully:

Setting up NFS took 0.055 mins
Configuring passwordless ssh for root
Configuring passwordless ssh for sgeadmin
Running plugin starcluster.plugins.sge.SGEPlugin
Configuring SGE...
Configuring NFS exports path(s): /opt/sge6
Mounting all NFS export path(s) on 1 worker node(s)
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
Setting up NFS took 0.111 mins
Installing Sun Grid Engine...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
Creating SGE parallel environment 'orte'
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
Adding parallel environment 'orte' to queue 'all.q'
Configuring cluster took 1.824 mins
Starting cluster took 3.079 mins

Next I ran into this problem. I tried logging in to the cluster using

starcluster sshmaster cluster2 -u sgeadmin

But it does not let me log in. I get the following:

WARNING: UNPROTECTED PRIVATE KEY FILE
Permissions 0444 for '/home/_//**.pem' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Load key "/home/_//**.pem": bad permissions
Permission denied (publickey).

How do I solve this?

cariaso commented 7 years ago

chmod 600 /path/to/your/key.pem

which seems to be

chmod 600 /home///.pem

although that's a weird looking filename and with the bolding it looks like you probably stripped it out intentionally.
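The same fix can be sketched in Python, in case you want to check a key file before handing it to ssh. This is only an illustration of what chmod 600 does; the helper names `key_is_too_open` and `restrict_key` are hypothetical, and the path you pass is whatever your own .pem file is.

```python
import os
import stat


def key_is_too_open(path):
    """Return True if group or others have any permission bits on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def restrict_key(path):
    """Equivalent of `chmod 600 path`: read/write for the owner only.

    Returns the resulting permission bits.
    """
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)
```

A key with mode 0444, as in the warning above, is readable by group and others, so `key_is_too_open` reports True for it until `restrict_key` has been applied.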

apratim88 commented 7 years ago

Thanks for the help. Everything is sorted out now. It works great!