statgenetics / seqspark

SEQSpark documentation
https://statgenetics.github.io/seqspark/
Apache License 2.0

java.util.NoSuchElementException when running SingleStudy #2

Open 1tilly opened 6 years ago

1tilly commented 6 years ago

Hi all, I just tested your project but sadly got an exception [0]. For testing purposes I used chromosome 9 of the 1000 Genomes Project (phase 3) [1] and the corresponding panel [2]. I renamed the VCF and panel files to fit my own file organization. I have standalone installations of Spark (v2.1.1) and HDFS (v2.8.0) on my workstation. Both work fine (I'm currently using them via pyspark). Below you'll find my config [3] and the command [4] I'm running on the CLI.

Best, 1Tilly

[0]

ERROR SingleStudy$: Something went wrong, exit
java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:347)
    at scala.None$.get(Option.scala:345)
    at org.dizhang.seqspark.worker.Samples$$anonfun$9.apply(Samples.scala:148)
    at org.dizhang.seqspark.worker.Samples$$anonfun$9.apply(Samples.scala:148)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at org.dizhang.seqspark.worker.Samples$.titv(Samples.scala:148)
    at org.dizhang.seqspark.worker.QualityControl$.cleanVCF(QualityControl.scala:110)
    at org.dizhang.seqspark.SingleStudy$.run(SingleStudy.scala:171)
    at org.dizhang.seqspark.SingleStudy$.main(SingleStudy.scala:57)
    at org.dizhang.seqspark.SingleStudy.main(SingleStudy.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/07/12 16:09:26 INFO SparkContext: Invoking stop() from shutdown hook

[1] ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.chr9.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz

[2] ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/integrated_call_samples_v3.20130502.ALL.panel

[3]

seqspark {
  project = demo2
  pipeline = [ "qualityControl" ]
  input {
    genotype.path = "chr9ph31k.vcf.gz"
    phenotype.path = "ph31k.panel"
  }
  qualityControl {
    genotypes = ["DP >= 1 and GQ >= 1"]
    summaries = ["pca", "titv"]
  }
}

[4]

./bin/seqspark SingleStudy conf/demo2.conf

zhangdi-devel commented 6 years ago

Hi Tobias,

I’ll update the documentation on how to prepare the phenotype file. Currently, the sample ID column should be named ‘iid’. I’ll also test the 1000 Genomes data you used and get back to you later.

Di
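
For anyone hitting the same error, here is a minimal sketch of the header change Di describes: renaming the sample ID column of the panel file to `iid`. It assumes the panel is tab-delimited and that its ID column is currently headed `sample`; both are assumptions about the input file, not something SEQSpark documents here, so check the first line of your own file. The function names are hypothetical helpers, and whether this change alone resolves the exception has not been confirmed in this thread.

```python
def rename_id_column(text, old_name="sample", new_name="iid", delimiter="\t"):
    """Return `text` with the header column `old_name` renamed to `new_name`.

    Only the first line (the header) is modified; data rows are kept as-is.
    The defaults are assumptions about the panel layout, not SEQSpark settings.
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("phenotype file is empty")

    header = lines[0].split(delimiter)
    if new_name in header:
        # Already has the expected column name; nothing to do.
        return text
    if old_name not in header:
        raise ValueError(
            "column %r not found in header: %r" % (old_name, header)
        )

    header = [new_name if col == old_name else col for col in header]
    return "\n".join([delimiter.join(header)] + lines[1:]) + "\n"


def rewrite_panel(src_path, dst_path):
    """Read a panel file, rename its ID column, and write the result."""
    with open(src_path, "r") as src:
        fixed = rename_id_column(src.read())
    with open(dst_path, "w") as dst:
        dst.write(fixed)
```

Calling `rewrite_panel("ph31k.panel", "ph31k.iid.panel")` and pointing `phenotype.path` at the new file would apply this to the config above; the output file name is only an example.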
