Open cariaso opened 8 years ago
If I had to guess, I would say this is where it happens: https://github.com/datacratic/StarCluster/blob/000c041a9f71ed8099f461e6af9b145f1f654310/starcluster/plugins/boto.py#L43
Not sure how I missed that, but yes, it seems more than likely.
2 possible solutions come to mind.
Do you have a preference for either?
I would encapsulate the first `mssh.switch_user` in a `with` block that calls back `mssh.switch_user` with root whenever it leaves the scope. That way it would go back to root whether exceptions are encountered or not.
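A minimal sketch of that idea, not the actual plugin code. `FakeSSH`, `as_user` and `install_credentials` are hypothetical names invented for illustration; the only thing taken from this thread is that the connection object has a `switch_user` method and that root is the user to return to.

```python
from contextlib import contextmanager


class FakeSSH:
    """Hypothetical stand-in for the node's SSH client; only switch_user is modeled."""

    def __init__(self):
        self.current_user = "root"

    def switch_user(self, user):
        self.current_user = user


@contextmanager
def as_user(ssh, user, restore_to="root"):
    """Switch the connection to `user`, then switch back on exit, even on error."""
    ssh.switch_user(user)
    try:
        yield ssh
    finally:
        # Runs on normal exit and when the body raises.
        ssh.switch_user(restore_to)


def install_credentials(ssh, user, fail=False):
    """Hypothetical plugin step that has to run as `user`."""
    with as_user(ssh, user):
        if fail:
            raise RuntimeError("simulated failure while writing credentials")
        return ssh.current_user
```

With something like this, whatever plugin runs next would see the connection as root again, regardless of whether the credentials step succeeded.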
@cariaso Do you run into this problem if you switch the order of plugins? i.e. run efs before configuring boto for sgeadmin.
No. That is the basis of my current workaround.
On Jul 30, 2016 8:15 AM, "vasisht" wrote:
> @cariaso Do you run into this problem if you switch the order of plugins? i.e. run efs before configuring boto for sgeadmin.
PLUGINS = whoami,boto,whoami,efs,whoami
#51 and #52 relate to EFS, but I was having problems with it, which I believe I've isolated as being caused by the boto plugin.
This set of log messages seems to confirm it for me.
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:42,929 >>> Configuring passwordless ssh for sgeadmin
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,334 >>> Running plugin whoami
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,335 >>> Running whoami plugin
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,345 >>> whoami?: root
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,425 >>> Running plugin boto
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,649 >>> Installing AWS credentials for user: sgeadmin
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,650 >>> Installing current credentials to: /home/sgeadmin/.boto
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,728 >>> Running plugin whoami
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,728 >>> Running whoami plugin
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,739 >>> whoami?: sgeadmin
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,813 >>> Running plugin efs
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:43,904 >>> Configuring EFS for sg-f00b038b
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:44,063 >>> Authorizing EFS security group
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:44,483 >>> Authorizing EFS security group
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:44,832 >>> Mounting efs on all nodes
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:44,833 >>> Mounting efs on <Node: master (i-0c26c566b57535ead)>
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:45,018 !!! ERROR - Error occured while running plugin 'efs':
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:45,018 !!! ERROR - remote command 'source /etc/profile && mount -t nfs4
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:45,018 !!! ERROR - -ominorversion=1
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:45,018 !!! ERROR - us-east-1a.fs-1ca36455.efs.us-east-1.amazonaws.com:/
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:45,018 !!! ERROR - /tmp/efs' failed with status 1:
[ec2-54-198-147-85.compute-1.amazonaws.com] out: 2016-07-23 15:39:45,018 !!! ERROR - mount: only root can do that
Perhaps someone else can replicate and confirm?
I've looked at the source of starcluster/plugins/boto.py and it's not obvious to me what's causing the trouble.