nchammas / flintrock

A command-line tool for launching Apache Spark clusters.
Apache License 2.0

Fix for spark 1.6.x #165

Closed by dm-tran 7 years ago

dm-tran commented 7 years ago

This PR makes the following change: it sets SPARK_MASTER_IP so that the Spark 1.6.x startup scripts pick up the master's address.

I tested this PR by running flintrock launch with Spark 1.6.3.

Scripts "sbin/start-master.sh" and "sbin/start-slaves.sh" in 1.6.x use SPARK_MASTER_IP. This has been changed by the following PR in Spark 2.0.0 : https://github.com/apache/spark/pull/13543/files

The associated JIRA is https://issues.apache.org/jira/browse/SPARK-15806
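
To make that concrete, here is a minimal sketch of how both variables could be set in conf/spark-env.sh, which the sbin scripts source on startup (the address below is a placeholder for the master's private IP):

# Sketch only: set both variables so that both the Spark 1.6.x and the
# Spark 2.0.0+ startup scripts pick up the master's address.
MASTER_ADDRESS="172.31.0.10"                 # placeholder private IP
export SPARK_MASTER_IP="$MASTER_ADDRESS"     # read by Spark 1.6.x sbin scripts
export SPARK_MASTER_HOST="$MASTER_ADDRESS"   # read by Spark 2.0.0+ sbin scripts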

nchammas commented 7 years ago

I can launch and use Spark 1.6.3 clusters just fine with Flintrock. Are you having issues? If so, what are they, exactly?

Also, reading through the discussion on https://github.com/apache/spark/pull/13543, it looks like Flintrock is doing the correct thing by using SPARK_MASTER_HOST, which was supported and used before Spark 2.0. That PR just clarified that _HOST is preferred and _IP is deprecated.

dm-tran commented 7 years ago

Thanks for your answer @nchammas.

The issue I have with Spark 1.6.3 is that slaves are neither recognized nor displayed in the Spark UI:

./flintrock --config spark16.yaml launch cluster-test
Requesting 2 spot instances at a max price of $0.1...
0 of 2 instances granted. Waiting...
All 2 instances granted.
[172.4.95.149] SSH online.
[172.4.81.85] SSH online.
[172.4.95.149] Configuring ephemeral storage...
[172.4.81.85] Configuring ephemeral storage...
[172.4.95.149] Installing Java 1.8...
[172.4.81.85] Installing Java 1.8...
[172.4.95.149] Installing HDFS...
[172.4.81.85] Installing HDFS...
[172.4.95.149] Installing Spark...
[172.4.81.85] Installing Spark...
[172.4.81.85] Configuring HDFS master...
[172.4.81.85] Configuring Spark master...
HDFS online.
Spark Health Report:
  * Master: ALIVE
  * Workers: 0
  * Cores: 0
  * Memory: 0.0 GB
launch finished in 0:03:21.

It turns out that this is due to a configuration problem with our VPC. By default, machines in our VPC can only communicate with each other using IP addresses, not hostnames.

In start-slaves.sh in Spark 1.6, SPARK_MASTER_IP defaults to the output of hostname:

if [ "$SPARK_MASTER_IP" = "" ]; then
  SPARK_MASTER_IP="`hostname`"
fi

That's why the slaves were not recognized: the workers tried to reach the master by its hostname, which is not resolvable in our VPC.
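
For anyone hitting the same symptom, a quick way to check (a sketch; the hostname below is a placeholder for whatever hostname prints on the master):

# On the master: print the name that start-slaves.sh falls back to
# when SPARK_MASTER_IP is unset.
hostname

# On a worker: check whether that name resolves; if it does not,
# workers cannot register with the master.
getent hosts ip-172-31-0-10.ec2.internal || echo "master hostname does not resolve"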

Sorry for the disturbance. I am closing this PR.

nchammas commented 7 years ago

No worries. Glad you figured things out.

FWIW, SPARK_MASTER_HOST can also be set to an IP address, according to https://github.com/apache/spark/pull/13543.
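
For example, in conf/spark-env.sh (a sketch; substitute the master's actual private IP):

# An IP address works here as well as a hostname.
export SPARK_MASTER_HOST="172.31.0.10"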