amplab / docker-scripts

Dockerfiles and scripts for Spark and Shark Docker images

Cannot get a simple test Java app running in OSX/Vagrant environment #34

Open lbustelo opened 10 years ago

lbustelo commented 10 years ago

I have a Vagrant setup running the docker scripts using docker 0.9. I also have a simple maven project that tries to replicate your Shell example. I keep getting failures on the submission.

Java Main is:

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkMain {

    protected static String master = "spark://master:7077"; // change to your master URL
    protected static String sparkHome = "/opt/spark-0.9.0";

    public static void main(String[] args) {

        JavaSparkContext sc = new JavaSparkContext(master, "Test App",
                sparkHome, JavaSparkContext.jarOfClass(SparkMain.class));

        JavaRDD<String> file = sc.textFile("hdfs://master:9000/user/hdfs/test.txt");
        //JavaRDD<String> file = sc.textFile("README.md");
        System.out.println(file.count());
        sc.stop();
    }
}
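
(Side note: I'm not certain JavaSparkContext.jarOfClass picks anything up when the class is loaded from an unpacked target/classes directory, as mvn exec:java does, rather than from a jar. A variant that hands Spark the packaged jar explicitly would look like the sketch below; the path is only a placeholder for whatever mvn package produces.)

// Variant sketch: pass the built jar explicitly instead of relying on
// jarOfClass(). The path is a placeholder for the artifact built by mvn package.
String[] jars = { "/vagrant/target/spark-test-0.0.1-SNAPSHOT.jar" };
JavaSparkContext sc = new JavaSparkContext(master, "Test App", sparkHome, jars);
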

When running the test with "README.md", I see an error that it cannot find "/vagrant/README.md". In that case I don't understand why Spark thinks the file is relative to the Vagrant VM and not the Docker containers. When I use the hdfs URL, I just get a lot of these:

14/05/09 00:05:50 INFO scheduler.DAGScheduler: Missing parents: List()
14/05/09 00:05:50 INFO scheduler.DAGScheduler: Submitting Stage 0 (MappedRDD[1] at textFile at SparkMain.java:19), which has no missing parents
14/05/09 00:05:50 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[1] at textFile at SparkMain.java:19)
14/05/09 00:05:50 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
14/05/09 00:05:55 INFO client.AppClient$ClientActor: Executor updated: app-20140509000548-0012/0 is now FAILED (Command exited with code 1)
14/05/09 00:05:55 INFO cluster.SparkDeploySchedulerBackend: Executor app-20140509000548-0012/0 removed: Command exited with code 1
14/05/09 00:05:55 INFO client.AppClient$ClientActor: Executor added: app-20140509000548-0012/3 on worker-20140508215925-worker3-43556 (worker3:43556) with 1 cores
14/05/09 00:05:55 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20140509000548-0012/3 on hostPort worker3:43556 with 1 cores, 512.0 MB RAM

I've tried several things:

  1. Made sure the nameserver address is in the /etc/resolv.conf file
  2. Tried creating an hdfs user in the Vagrant VM and running mvn exec under that user (did not work; see the sanity-check sketch after this list)
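
To help isolate whether item 2 is really a permissions problem, here is a minimal standalone HDFS check that can be run under either user. HdfsCheck is just a name I made up; it reuses the namenode URI and path from the example above:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sanity-check sketch: can this JVM, as the current OS user, see the test file
// on HDFS at all? If this fails under my own user but works under the hdfs
// user, the problem is HDFS permissions rather than Spark itself.
public class HdfsCheck {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(URI.create("hdfs://master:9000"), new Configuration());
        System.out.println("exists: " + fs.exists(new Path("/user/hdfs/test.txt")));
    }
}
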
AndreSchumacher commented 10 years ago

@lbustelo how do you run your jar? On the master, after logging in via ssh?

lbustelo commented 10 years ago

I've since gotten past this issue, and to be honest… I've lost track of how I fixed certain things. The documentation is very specific about using the Spark shell, and there is not much about running driver programs directly on the host machine. I'm using these scripts as my dev setup, building and executing within the Vagrant host. The recipe that works right now is:

  1. Edit /etc/resolv.conf on the host to add the nameserver
  2. Add the host box to the /tmp/dnsdir* files used by the nameserver
  3. Set spark.driver.host to the IP of the host box (see the sketch below). Setting the hostname did not work, but I forget the exact issue.

Now, for actually running the HDFS example, I realized that it was the spark-shell container that was running the scripts to populate HDFS. I still think I had to run my Main under the hdfs user to get it to work.
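
For step 3, this is roughly the change in SparkMain (sketch only; the IP is a placeholder for whatever address the containers can reach the Vagrant host on). Spark 0.9 picks up "spark.*" system properties when the context is created, so the property has to be set before the constructor runs:

// Sketch of step 3: advertise the Vagrant host's IP to the executors.
System.setProperty("spark.driver.host", "10.0.3.1"); // placeholder: the Vagrant host's IP

JavaSparkContext sc = new JavaSparkContext(master, "Test App",
        sparkHome, JavaSparkContext.jarOfClass(SparkMain.class));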

In summary, I think my setup is not that uncommon, but the README is missing some details.