Closed vasion closed 7 years ago
@vasion Could you show me the command you executed? Also, could you check that your terminal is using the same identity as the one you use to log in to the AWS console? By default, EMR will only show the cluster information to the person who created it.
Here is the command.
sbt 'sparkCreateCluster' sparkMonitor
The AWS account is the same as the one in my credentials. I can see other clusters my colleagues have created. Btw, the "Cluster terminated without error" message appears very quickly; I don't know if that's a clue.
Here is the relevant part of my build.sbt:
libraryDependencies ++= Seq(
  "com.typesafe.scala-logging" %% "scala-logging" % "3.7.2" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-mllib" % "2.2.0" % "provided",
  "org.apache.hadoop" % "hadoop-aws" % "2.8.1" % "provided",
  "com.amazonaws" % "aws-java-sdk" % "1.11.194" % "provided",
  "org.scalatest" %% "scalatest" % "3.0.3" % "provided",
  "org.scalamock" %% "scalamock-scalatest-support" % "3.6.0" % "provided"
)
//EMR setup for command line clusters
val username = System.getProperty("user.name")
sparkAwsRegion := "us-east-1"
sparkJobFlowInstancesConfig := sparkJobFlowInstancesConfig.value.withEc2KeyName("xxxxxx")
sparkSecurityGroupIds := Some(Seq("sg-xxxxxxx"))
sparkS3JarFolder := s"s3://xxxxxxxxx/$username/"
sparkInstanceCount := 3
sparkClusterName := s"SBTCluster$username"
sparkEmrRelease := "emr-5.8.0"
sparkInstanceType := "m3.xlarge"
// comment out for an on-demand cluster
// sparkInstanceBidPrice := Some("0.10")
import com.amazonaws.services.elasticmapreduce.model.Application
val applications = Seq("Spark", "Ganglia", "Hadoop").map(a => new Application().withName(a))
sparkRunJobFlowRequest := sparkRunJobFlowRequest.value.withApplications(applications: _*)
import com.amazonaws.services.elasticmapreduce.model.Tag
sparkRunJobFlowRequest := sparkRunJobFlowRequest.value.withTags(new Tag("Name", s"SBTCluster-$username"))
sparkRunJobFlowRequest := sparkRunJobFlowRequest.value.withLogUri(s"s3://xxxxxxxxxx/$username/")
@vasion Could you try adding this to your build.sbt:
sparkRunJobFlowRequest := sparkRunJobFlowRequest.value.withVisibleToAllUsers(true)
Yay. I can see it. And there are errors. I will debug tomorrow and write back.
@vasion If you can see it after adding withVisibleToAllUsers, it is most likely an identity problem. Maybe you can check whether your AWS_ACCESS_KEY_ID belongs to the IAM user you use to log in to the AWS web console.
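One way to check this (assuming the AWS CLI is installed and picks up the same credentials as your terminal) is to ask STS which identity the keys resolve to, and to see where the CLI is reading credentials from:

```shell
# Shows the account ID and IAM user/role ARN behind the active credentials;
# compare this with the identity you use in the AWS web console.
aws sts get-caller-identity

# Shows which access key is in effect and whether it came from the
# environment, a profile, or an instance role.
aws configure list
```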
withVisibleToAllUsers allowed me to see the cluster. The reason it was failing was that the security group I was specifying was in another region. Thanks for your help. I really appreciate it.
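For anyone hitting the same issue: a quick way to confirm a security group actually exists in the region you configured (the group ID and region below are the placeholders from this thread, not real values) is:

```shell
# Succeeds and prints the group if it exists in us-east-1;
# fails with an InvalidGroup.NotFound error if it lives in another region.
aws ec2 describe-security-groups --group-ids sg-xxxxxxx --region us-east-1
```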
I am trying to spin up a long running cluster.
I get this:
But the cluster is nowhere to be found in the AWS console. It seems like it is not spinning up at all.
Can you point me in the right direction for debugging this?
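Besides the console, the AWS CLI can show clusters that terminated too quickly to notice (region and cluster ID below are illustrative placeholders):

```shell
# List recently terminated clusters, including ones that failed at startup.
aws emr list-clusters --cluster-states TERMINATED TERMINATED_WITH_ERRORS --region us-east-1

# Once you have a cluster ID, the status block includes the termination reason.
aws emr describe-cluster --cluster-id j-XXXXXXXXXXXXX --region us-east-1
```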