nathanmarz / storm-deploy

One click deploy for Storm clusters on AWS

storm-deploy exception #36

Closed BenMacKenzie closed 11 years ago

BenMacKenzie commented 11 years ago

I've tried running storm-deploy a few times and it fails consistently, both from my laptop (OS X) and from an EC2 RHEL 6.4 instance. The exception occurs directly after storm-deploy appears to be starting Ganglia on all the nodes. The last few messages before the exception are as follows:

DEBUG core - p-f-s server environment null
DEBUG core - p-f-s server environment null
INFO core - parallel-apply-phase :pallet.phase/post-exec for :supervisor-stormClusterX
INFO core - parallel-apply-phase-to-target :node :pallet.phase/post-exec for :supervisor-stormClusterX with 2 nodes
DEBUG core - apply-phase-to-node: phase :pallet.phase/post-exec group :supervisor-stormClusterX target 54.226.164.194
DEBUG core - apply-phase-to-node: phase :pallet.phase/post-exec group :supervisor-stormClusterX target 54.224.119.37
INFO execute - execute-with-ssh on supervisor-stormClusterX "54.226.164.194"
INFO execute - execute-with-ssh on supervisor-stormClusterX "54.224.119.37"
INFO execute - Admin user storm /Users/benmackenzie/.ssh-aws/BenMac.pem /Users/benmackenzie/.ssh-aws/BenMac.pem.pub
INFO execute - Admin user storm /Users/benmackenzie/.ssh-aws/BenMac.pem /Users/benmackenzie/.ssh-aws/BenMac.pem.pub
DEBUG provision - Finished post-configure and exec phases
INFO provision - Attaching to Available Cluster...
DEBUG compute - >> listing node details matching(ALWAYS_TRUE)
DEBUG compute - << list(10)
DEBUG compute - >> listing node details matching(ALWAYS_TRUE)
DEBUG compute - << list(10)
DEBUG compute - >> listing node details matching(ALWAYS_TRUE)
DEBUG compute - << list(10)
DEBUG compute - >> listing node details matching(ALWAYS_TRUE)
DEBUG compute - << list(10)
ERROR logging - Exception in thread "main"
ERROR logging - java.lang.RuntimeException: request: POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1; cause: java.lang.NullPointerException (NO_SOURCE_FILE:1)

config.clj is as follows:

(defpallet
  :services
  {:default
   {:blobstore-provider "aws-s3"
    :provider "aws-ec2"
    :environment {:user {:username "storm" ; this must be "storm"
                         :private-key-path "~/.ssh-aws/BenMac.pem"
                         :public-key-path "~/.ssh-aws/BenMac.pem.pub"}
                  :aws-user-id "xxxxxx"}
    :identity "xxxxxxx"
    :credential "xxxxxxx"
    :jclouds.regions "us-east-1"}})

Any ideas?

BenMacKenzie commented 11 years ago

Just a quick follow up:

Even though storm-deploy threw an exception and did not complete, I am able to submit and execute a topology.

Also, I had to manually modify the AWS security group that Nimbus is assigned to in order to access the UI.

jsquirrelz commented 11 years ago

I'm having the same issue. As soon as the attaching phase begins, it throws a RuntimeException:

ERROR logging - Exception in thread "main"
ERROR logging - java.lang.RuntimeException: request: POST https://ec2.us-west-1.amazonaws.com/ HTTP/1.1; cause: java.lang.NullPointerException (NO_SOURCE_FILE:1)

I made the same edit to the Nimbus security group (added port 8080) and tried to visit the UI, but unfortunately it returned an internal server error. I also tried submitting a topology; it turns out Nimbus never started (or crashed on startup?).

rjtg commented 11 years ago

Same issue for me

gworley3 commented 11 years ago

I wrote a reply to this on the mailing list: something changed in Amazon's security group API responses a couple of weeks ago, and it breaks the version of jclouds used in storm-deploy. I'm not sure exactly what changed that is incompatible. The workaround we're using is to comment out everything that sets security groups and manage them manually via the EC2 web UI after we bring up the cluster.
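For anyone who would rather script the manual security group step than click through the web UI, here is a minimal sketch in Python. It assumes boto3 is installed and AWS credentials are already configured. It is not part of storm-deploy. The helper names, the placeholder CIDR range and the default region are all hypothetical; port 8080 is the UI port mentioned earlier in this thread.

```python
def ui_ingress_rule(port=8080, cidr="203.0.113.0/24"):
    """Build one inbound TCP rule in the shape boto3 expects for IpPermissions.

    The CIDR default is a documentation-only placeholder; replace it with
    the address range that should be allowed to reach the Storm UI.
    """
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr}],
    }


def open_ui_port(group_id, port=8080, cidr="203.0.113.0/24", region="us-east-1"):
    """Add the rule to an existing security group (hypothetical helper)."""
    # Imported here so the rule builder above can be used without boto3 installed.
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[ui_ingress_rule(port, cidr)],
    )
```

Calling `open_ui_port` with the ID of the Nimbus security group would add the rule after the cluster is up. Keeping the rule builder separate from the API call makes it easy to inspect what will be sent before touching the account.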

kerinin commented 11 years ago

Same here