Closed olgabot closed 9 years ago
For the sailfish errors, the error log shows that those 65 samples ran out of walltime. Here's the tail of one of those files:
=>> PBS: job killed: walltime 7224 exceeded limit 7200
Nodes: tscc-2-10
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
Here's proof that it happens in all 65 of those files:
$ tail /home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/*_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sailfish.out | grep walltime | wc -l
65
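To see how far over the limit each job ran, rather than just counting them, something like the sketch below could work. It assumes every killed job logs a line in exactly the format shown above ("walltime 7224 exceeded limit 7200"); the function name walltime_overruns is made up for this example.

```shell
# Print "<file> <used_seconds> <limit_seconds>" for every log in which
# PBS reports that the job was killed for exceeding its walltime.
# Assumes the message format seen in the tail above:
#   =>> PBS: job killed: walltime 7224 exceeded limit 7200
walltime_overruns() {
    awk '/PBS: job killed: walltime/ {
        for (i = 1; i <= NF; i++) {
            if ($i == "walltime") used = $(i + 1)   # seconds actually used
            if ($i == "limit")    limit = $(i + 1)  # seconds requested
        }
        print FILENAME, used, limit
    }' "$@"
}
```

Calling it on the same glob as above (walltime_overruns .../*.sailfish.out) would give one line per killed job, which makes it easier to judge how much extra walltime to request.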
Looking at the STAR logs, it's always a segmentation fault:
$ head /home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/*_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out | less -S
Example output:
==> /home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/M1_01_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out <==
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/.queue/tmp/.exec8316565503078234537: line 2: 10692 Segmentation fault STAR '--runMode' 'alignReads
Nodes: tscc-2-52
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
==> /home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/M1_02_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out <==
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/.queue/tmp/.exec5690637824564258731: line 2: 9985 Segmentation fault STAR '--runMode' 'alignReads
Nodes: tscc-2-52
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
==> /home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/M1_03_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out <==
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/.queue/tmp/.exec4681241327761770130: line 2: 7400 Segmentation fault STAR '--runMode' 'alignReads
Nodes: tscc-2-12
discarding /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/bin from PATH
prepending /projects/ps-yeolab/software/anaconda-2.1.0_2015-01-20/envs/olga/bin to PATH
Hmm, but a segfault doesn't explain all of the errors:
$ grep Segmentation /home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/*_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out | wc -l
246
262-246 = 16 samples unexplained
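The subtraction above only holds if every segfaulting log contains exactly one "Segmentation" line, since grep piped to wc -l counts matching lines, not files. A way to get the unexplained logs by name is grep -L, which lists the files that contain no match. A minimal sketch (the function name logs_without_segfault is just for illustration):

```shell
# List the logs that contain no "Segmentation" line at all.
# grep -L prints the names of files without a match, one per line.
logs_without_segfault() {
    grep -L Segmentation "$@"
}

# Example (same glob as above); the count should come out as 16
# if the line-based arithmetic is right:
#   logs_without_segfault /home/obotvinnik/.../*.sam.out | wc -l
```

The exit status of grep -L differs between grep versions, so it's safer to look only at the printed file names.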
By searching for everything that's NOT a segmentation fault, via:
$ grep -v Segmentation /home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/*_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out | grep -v anaconda | grep -v Nodes > singlecell_pnms_pe_v4_error_star_not_segfault.txt
it looks like there are a few queue errors, and the rest are R1 and R2 not matching up properly. Here are some of the non-R1/R2 errors:
/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/M1_06_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out:/usr/bin/ipcrm: invalid id (196608)
/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/MSA_16_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out:Can't remove /etc/security/access.conf: No such file or directory, skipping file.
/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/MSA_18_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out:Can't remove /etc/security/access.conf: No such file or directory, skipping file.
/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/N4_11_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out:/usr/bin/ipcrm: invalid id (0)
/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/P2_03_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out:/usr/bin/ipcrm: invalid id (163840)
/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/P3_02_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam.out:/usr/bin/ipcrm: invalid id (262144)
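To get a quick tally of these leftover messages by type, the lines can be stripped of their file prefix and grouped. This is only a sketch: it assumes the "file:message" layout that grep produces above, and it replaces every run of digits with N so that, for example, the different ipcrm ids collapse into one message. The name tally_messages is hypothetical.

```shell
# Read "file:message" lines on stdin and print each distinct message
# with its count, most frequent first.
tally_messages() {
    cut -d: -f2- |               # drop the "file:" prefix added by grep
        sed 's/[0-9][0-9]*/N/g' | # normalize numbers such as ipcrm ids
        sort | uniq -c | sort -rn
}

# Example:
#   tally_messages < singlecell_pnms_pe_v4_error_star_not_segfault.txt
```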
Here's the full STAR command for one of the files:
STAR '--runMode' 'alignReads' '--runThreadN' '16' '--genomeDir' '/projects/ps-yeolab/genomes/hg19/star_sjdb' '--genomeLoad' 'LoadAndRemove' '--readFilesIn' '/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/P9_04_R1.fastq.gz.polyATrim.adapterTrim.rmRep.fastq' '/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/P9_04_R2.fastq.gz.polyATrim.adapterTrim.rmRep.fastq' '--outSAMunmapped' 'Within' '--outFilterMultimapNmax' '10' '--outFilterMultimapScoreRange' '1' '--outFileNamePrefix' '/home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/P9_04_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam' '--outSAMattributes' 'All' '--outSAMstrandField intronMotif' '--outStd' 'BAM_SortedByCoordinate' '--outSAMtype' 'BAM' 'SortedByCoordinate' '--outFilterType' 'BySJout' '--outReadsUnmapped' 'Fastx' '--outFilterScoreMin' '10' > /home/obotvinnik/projects/singlecell_pnms/analysis/singlecell_pnms_pe_v4/P9_04_R1.fastq.gz.polyATrim.adapterTrim.rmRep.sam
Aha! Turns out STAR.scala requests 8 cores from TSCC, but the STAR command says to use 16 cores. I've changed this now:
https://github.com/gpratt/gatk/pull/9/files#diff-89a3c229db8cb54aefacdeffca10598cL45
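A check like the following could catch this kind of mismatch in the future. It's a rough sketch, not what the pipeline does: it reads a logged STAR command line on stdin (with the single-quoted arguments shown above), pulls out the value after --runThreadN, and compares it with a core count passed as an argument. The function names and the idea of passing the requested cores by hand are assumptions for the example.

```shell
# Print the value that follows --runThreadN in a STAR command line
# read from stdin. Single quotes around arguments are stripped first.
star_threads() {
    tr -d "'" | awk '{
        for (i = 1; i < NF; i++)
            if ($i == "--runThreadN") print $(i + 1)
    }'
}

# Compare STAR's thread count with the cores requested from the
# scheduler (first argument). Returns non-zero on a mismatch.
check_threads() {
    requested=$1
    used=$(star_threads)
    if [ -z "$used" ]; then
        echo "no --runThreadN found"
        return 2
    fi
    if [ "$used" -gt "$requested" ]; then
        echo "mismatch: STAR uses $used threads but only $requested cores were requested"
        return 1
    fi
    echo "ok: STAR uses $used threads, $requested cores requested"
}
```

For the command above, piping it into check_threads 8 would report the mismatch, since the command asks for 16 threads.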
Here's a gist with all the documents: https://gist.github.com/olgabot/160c649786d45920ed09
In the file singlecell_pnms_pe_v4_error_star_sailfish_counts.txt I've counted the number of samples that failed at either the sailfish quant or STAR stages: all 262 samples failed at STAR, but only 65 failed at sailfish.
I'm investigating further, playing @gpratt's favorite game of "one of these is not like the other".