@jmabuin Please, I need help to solve this problem.
Hi @Asmaa-Ali
SparkBWA creates temporary .sam files on the computing nodes (executors) while the program is running. Can you connect to one of your nodes and check whether these temporary files have any content?
These files are stored in the Spark temporary directory or in /tmp/; you can find them with the find command:
find /tmp/ -name "*.sam"
Also, what is the content of your /Data/HumanBase/ dir? Is this content available on all nodes?
I ran into the same issue, and my temporary .sam files inside /tmp also have no content.
Can you please check if it works with SparkBWA 0.2?
Hi, I have tried it with SparkBWA 0.2 and I ran into the same issue. It appears to run fine, but the produced SAM files are empty! The content of the /Data/HumanBase/ dir is available on all my worker nodes. Any suggestions, please?
Do you have permissions to write in the /tmp folder? Also, which versions of Hadoop and Spark are you using?
Yes, I do have the rights to write in /tmp. I am using Spark 2.0 and HDFS 2.7.3. Any ideas welcome!
Have you tried using yarn-cluster instead of yarn-client?
Actually, in newer versions of Spark it should be --master yarn --deploy-mode cluster.
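For example, an invocation in cluster mode might look like the following (a sketch only: the jar name and the trailing arguments are placeholders, and the main class name is assumed from the package name that appears later in this thread):
spark-submit --class com.github.sparkbwa.SparkBWA --master yarn --deploy-mode cluster SparkBWA.jar <SparkBWA arguments>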
Same issue here. I also tried running with the local scheduler (no YARN). Could that be the reason for the missing/empty SAM files? Does all input data need to go through HDFS? It also complains that it can't find the index file, and I have set up proper permissions on all locations. Thank you.
Update: Obviously, one really needs a running Hadoop cluster so that the code can work on the data in HDFS; hence the empty SAM files in Spark standalone cluster mode. It would be nice if there were an option for running with a Spark standalone instance. Hadoop can be a real pain under Torque/PBS job schedulers.
I fixed it. Now SparkBWA 0.2 can run on YARN or standalone and outputs the SAM file on my local cluster.
@xubo245 Thanks a lot - the fix works. I can confirm that it also runs in Spark standalone mode (no Hadoop FS).
You are welcome.
@xubo245 It is still not working for me in standalone mode. Have you made any more specific changes?
Yes, I have a temporary change for standalone, but it is not the best solution...
In com.github.sparkbwa.BwaAlignmentBase#copyResults:
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://Master:9000/");
FileSystem fs = FileSystem.get(new URI("hdfs://Master:9000/"), conf);
Master should be replaced with your cluster's hostname.
@xubo245 We are running in a non-HDFS environment using GPFS. How can we make it work with a general filesystem? The filesystem is available on every node, similar to HDFS.
You would have to replace the HDFS code with the GPFS API, but I do not know GPFS...
SparkBWA has a lot of HDFS code...
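An alternative to rewriting against the GPFS API might be to lean on Hadoop's FileSystem abstraction, which resolves the concrete filesystem from the path or from the configured default (file://, hdfs://, or a GPFS Hadoop connector, if one is available). Below is a minimal sketch of that idea; the class, the method copyLocalSam, and its parameters are hypothetical illustrations, not SparkBWA's actual code:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyResultsSketch {
    // Hypothetical replacement for the hardcoded HDFS logic in
    // BwaAlignmentBase#copyResults: derive the filesystem from the output
    // path instead of from a fixed hdfs://Master:9000/ URI. With
    // fs.defaultFS pointing at file:/// (or at a GPFS connector), the same
    // code runs without an HDFS cluster.
    public static void copyLocalSam(String localSamPath, String outputDir)
            throws IOException {
        Configuration conf = new Configuration(); // reads core-site.xml, including fs.defaultFS
        Path dest = new Path(outputDir);
        FileSystem fs = dest.getFileSystem(conf); // resolves hdfs://, file://, etc. from the path
        fs.copyFromLocalFile(new Path(localSamPath), dest);
    }
}

Whether this alone would be enough depends on how many other places in SparkBWA assume HDFS, as noted above.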