NationalGenomicsInfrastructure / piper

A genomics pipeline built on top of the GATK Queue framework

sthlm2UUSNP problem with dual-indexing #20

Closed vezzi closed 10 years ago

vezzi commented 10 years ago

Hi Johan, it looks like sthlm2UUSNP does not like dual indexing in the file name.

This is what one of the M.Kaller_14_06 samples looks like:

tree P1171_102/
P1171_102/
`-- A
    `-- 140702_AC41A2ANXX
        |-- P1171_102_ATTCAGAA-CCTATCCT_L001_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L001_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L002_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L002_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L003_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L003_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L004_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L004_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L005_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L005_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L006_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L006_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L007_R1_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L007_R2_001.fastq.gz
        |-- P1171_102_ATTCAGAA-CCTATCCT_L008_R1_001.fastq.gz
        `-- P1171_102_ATTCAGAA-CCTATCCT_L008_R2_001.fastq.gz

When I try to convert this project/sample to UUSNPSEQ format, I get the following:

sthlm2UUSNP -i /proj/a2010002/nobackup/NGI/analysis_ready/DATA/M.Kaller_14_06/ -o /proj/a2010002/nobackup/NGI/analysis_ready/ANALYSIS/M.Kaller_14_06_UUSNP
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Just one sample hit should be possible for regexp, found: 0
        at scala.Predef$.require(Predef.scala:233)
        at molmed.apps.Sthlm2UUSNP$.parseSampleInfoFromFileHierarchy(Sthlm2UUSNP.scala:164)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1$$anonfun$apply$2$$anonfun$9.apply(Sthlm2UUSNP.scala:241)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1$$anonfun$apply$2$$anonfun$9.apply(Sthlm2UUSNP.scala:240)
        at scala.collection.immutable.Stream.map(Stream.scala:376)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1$$anonfun$apply$2.apply(Sthlm2UUSNP.scala:240)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1$$anonfun$apply$2.apply(Sthlm2UUSNP.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1.apply(Sthlm2UUSNP.scala:234)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1$$anonfun$apply$1.apply(Sthlm2UUSNP.scala:233)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1.apply(Sthlm2UUSNP.scala:233)
        at molmed.apps.Sthlm2UUSNP$$anonfun$runApp$1.apply(Sthlm2UUSNP.scala:232)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
        at molmed.apps.Sthlm2UUSNP$.runApp(Sthlm2UUSNP.scala:232)
        at molmed.apps.Sthlm2UUSNP$$anonfun$4.apply(Sthlm2UUSNP.scala:39)
        at molmed.apps.Sthlm2UUSNP$$anonfun$4.apply(Sthlm2UUSNP.scala:37)
        at scala.Option.map(Option.scala:145)
        at molmed.apps.Sthlm2UUSNP$delayedInit$body.apply(Sthlm2UUSNP.scala:37)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$$anonfun$main$1.apply(App.scala:71)
        at scala.App$$anonfun$main$1.apply(App.scala:71)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
        at scala.App$class.main(App.scala:71)
        at molmed.apps.Sthlm2UUSNP$.main(Sthlm2UUSNP.scala:16)
        at molmed.apps.Sthlm2UUSNP.main(Sthlm2UUSNP.scala)

If I remove the dual index from the file names, leaving only its first part, the tool works fine.

It is probably only a matter of changing a regular expression.
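
As a rough sketch of what such a change might look like: the index part of the pattern can be made to accept an optional second barcode after a hyphen. This is not the code in Sthlm2UUSNP.scala. The object name `FastqNameSketch`, the case class `FastqName`, and the pattern itself are assumptions made up for illustration, based only on the file names listed above.

```scala
// Hypothetical sketch, not the actual parsing code from Sthlm2UUSNP.scala.
object FastqNameSketch {

  // Fields taken from a file name such as
  // P1171_102_ATTCAGAA-CCTATCCT_L001_R1_001.fastq.gz
  case class FastqName(sample: String, index: String, lane: Int, read: Int)

  // The index group accepts one barcode (ATTCAGAA) or two barcodes joined
  // by a hyphen (ATTCAGAA-CCTATCCT); the "-SECOND" part is optional.
  private val Pattern =
    """^(.+)_([ACGTN]+(?:-[ACGTN]+)?)_L(\d{3})_R(\d)_\d{3}\.fastq\.gz$""".r

  // Returns None when the file name does not follow the expected layout.
  def parse(fileName: String): Option[FastqName] = fileName match {
    case Pattern(sample, index, lane, read) =>
      Some(FastqName(sample, index, lane.toInt, read.toInt))
    case _ => None
  }
}
```

With a pattern like this, both `P1171_102_ATTCAGAA_L001_R1_001.fastq.gz` and `P1171_102_ATTCAGAA-CCTATCCT_L001_R1_001.fastq.gz` would be matched as sample `P1171_102`, so the single-index names that already work would keep working.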

johandahlberg commented 10 years ago

Yes. This is a problematic regexp. I'll fix it.

johandahlberg commented 10 years ago

This is fixed in 094cafd.