Closed vogelwk closed 9 years ago
Can you tell me the output of this:
bwa 2>&1 | grep Version
scratch the previous. can you update to the latest revision and retry? thanks for reporting.
Here you go (now with v3.4 of setup tools) also tried tried v 2.01 with a clean install of python 2.7.6 earlier today:
bwameth.py --reference ref.fa t_R1.fastq.gz t_R2.fastq.gz -t 12
running: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:t_R SM:t_R' -t 12 ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz'
writing to:
samtools view -bS - | samtools sort - bwa-meth
/home/pi/vogelw/bin/python: can't find 'main' module in '/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py'
[gzclose] buffer error
cmd was:bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:t_R SM:t_R' -t 12 ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz'
Traceback (most recent call last):
File "/home/pi/vogelw/bin/bwameth.py", line 5, in
File "/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 579, in main
File "/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 258, in bwa_mem
File "/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 286, in as_bam
File "/home/pi/vogelw/lib/python2.7/site-packages/toolshed-0.3.6-py2.7.egg/toolshed/files.py", line 252, in reader for toks in line_gen: File "/home/pi/vogelw/lib/python2.7/site-packages/toolshed-0.3.6-py2.7.egg/toolshed/files.py", line 53, in process_iter raise ProcessException(cmd) toolshed.files.ProcessException: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:t_R SM:t_R' -t 12 ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz'
can you try this once more. i hadn't actually pushed.
bwameth.py --reference ref.fa t_R1.fastq.gz t_R2.fastq.gz -t 12
running: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:t_R SM:t_R' -t 12 ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz'
writing to:
samtools view -bS - | samtools sort - bwa-meth
converting reads in t_R1.fastq.gz,t_R2.fastq.gz
[M::main_mem] read 92726 sequences (9365326 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 44942, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (109, 137, 175)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 307)
[M::mem_pestat] mean and std.dev: (145.03, 48.64)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 373)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 92726 reads in 10.016 CPU sec, 0.863 real sec
[main] Version: 0.7.8-r455
[main] CMD: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R @RG ID:t_R SM:t_R -t 12 ref.fa.bwameth.c2t </home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz
[main] Real time: 8.957 sec; CPU: 10.384 sec
Traceback (most recent call last):
File "/home/pi/vogelw/bin/bwameth.py", line 5, in
samtools flagstat bwa-meth.bam 92221 + 579 in total (QC-passed reads + QC-failed reads) 0 + 0 duplicates 92181 + 576 mapped (99.96%:99.48%) 92221 + 579 paired in sequencing 46118 + 289 read1 46103 + 290 read2 91789 + 0 properly paired (99.53%:0.00%) 92142 + 570 with itself and mate mapped 39 + 6 singletons (0.04%:1.04%) 0 + 0 with mate mapped to a different chr 0 + 0 with mate mapped to a different chr (mapQ>=5)
Hmm, I'm not sure how it's getting this output:
self.flag = int(self.flag)
ValueError: invalid literal for int() with base 10: '0+0k 10760+0io 5pf+0w'
18.259u 0.530s 0:09.73 193.0% 0+0k 10864+0io 5pf+0w
It must be the last line from what bwa mem
is putting out somehow because as your flagstat shows, you have all of the data. I don't know where that would be coming from.
I will look into this further tomorrow, but I am not able to recreate it. I even tried that exact revision of bwa.
what does
type bwa
show?
$ type bwa bwa is hashed (/data1/opt/bin/bwa)
Can you tell me the output of this:
bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz' | tail
bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz' | tail converting reads in t_R1.fastq.gz,t_R2.fastq.gz [M::main_mem] read 92726 sequences (9365326 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 44942, 0, 0) [M::mem_pestat] skip orientation FF as there are not enough pairs [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat](25, 50, 75) percentile: (109, 137, 175) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 307) [M::mem_pestat] mean and std.dev: (145.03, 48.64) [M::mem_pestat] low and high boundaries for proper pairs: (1, 373) [M::mem_pestat] skip orientation RF as there are not enough pairs [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_process_seqs] Processed 92726 reads in 8.495 CPU sec, 8.502 real sec [main] Version: 0.7.8-r455 HISEQ:105:C2UE1ACXX:3:2316:20506:100449 99 fchrREF 61007 60 101M = 61122 216 AATTTTTTTTTTAAGGGTGAGGTTTTTAGTTAGTTGTTGATAGTTAAGAATGGTAATTAATTTTTTGAAAGATATTGTTTGATTTTTGTTTTTATGTGGTT BBBFFFFFFFFFFF<<7BFFF7FBFFIF7BFF7BF0BB7<BF0<FFF<BFFB7BFFFFFFFFFFFF0<BBBBBBBB0<BB'<BBFFF'<BFFFBF0B007< NM:i:0 MD:Z:101 AS:i:101 XS:i:0 YS:Z:AATTTTTTTTTTAAGGGTGAGGTTTTTAGTTAGTTGTTGATAGTTAAGAATGGTAATTAATTTTTTGAAAGATATTGTTTGATTTTTGTTTTTACGTGGTT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:20506:100449 147 fchrREF 61122 60 101M = 61007 -216 TTGAGTTTTTATATTTATGATGTATTATTGTAGTTATTATGTTGGTAAAGGGTTTTTTGAGTTTTTTGTTATGTTGATTTGAGTTTTAGAGTATATATTTT BBBBFFFFFFFFBFFFFFFFFFFFFFFFBBFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFIFFFFFFFFFFBBB NM:i:0 MD:Z:101 AS:i:101 XS:i:0 YS:Z:AAAATATATACTCTAAAACTCAAATCAACATAACAAAAAACTCAAAAAACCCTTTACCAACATAATAACTACAATAATACATCATAAATATAAAAACTCAA YC:Z:GA HISEQ:105:C2UE1ACXX:3:2316:11659:100715 99 fchrREF 27015 60 101M = 27028 114 TATGGTATTGTGTTTTGTTTTTTTGGTATTTGTGAGGGTAGAATTGTTTTTGGGTTTTAATTTTTTTAAGTATGGGAGGTTGTGGGTATAAGGTGGATTTT <BB<<BFFFBFFFFFIBFFIIIIIB<BFFFI<B<FFFFIFFFFFF<FFFII777BBFFFFFFFFFFFBB<BBB7777<<B70<<<<BB<BB<0<BBBBBFF NM:i:0 MD:Z:101 AS:i:101 XS:i:0 YS:Z:TATGGTATTGTGTTTTGTTTTTTTGGTATTTGTGAGGGTAGAATTGTTTTTGGGTTTTAATTTTTTTAAGTATGGGAGGTTGTGGGTATAAGGTGGATTTT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:11659:100715 147 fchrREF 27028 60 101M = 27015 -114 TTTTTTTTTTTGGTATTTGTGAGGGTAGAATTGTTTTTGGGTTTTAATTTTTTTAAGTATGGGAGGTTGTGGGTATAAGGTGGATTTTTTTTTTTTTTATT B<<0B<BBBBB<<<'BB<0000BB<0''<070'<BBBBBBBBBBBBB<BBBBB<<B<<<<BB<0<B<<7<BB<777BB'<'B<BIIIIFFFFFFFFFFBBB NM:i:1 MD:Z:3G97 AS:i:98 XS:i:19 YS:Z:AATAAAAAAAAAAAAAATCCACCTTATACCCACAACCTCCCATACTTAAAAAAATTAAAACCCAAAAACAATTCTACCCTCACAAATACCAAAAAAAAAAA YC:Z:GA HISEQ:105:C2UE1ACXX:3:2316:18269:100874 83 rchrREF 93187 60 101M = 93117 -171 AACATAAACCAAAACCTTATAAATAAACAACATCTCCTACACTTCTACAAAAAAAAACAAAAACCCCAACCAAACCATTCCAAAACAAAATTCCTAAAAAC #B<FBBB7''B<00''BBB0FBBBBB<0BB07BBBBBBB<<<B7777'FFFFFIIIFBFFFBIFFFBFB0BFFFBFFFB<<IIFF0IIFFFFBBFFFFBB< NM:i:1 MD:Z:0C100 AS:i:100 XS:i:0 YS:Z:GTTTTTAGGAATTTTGTTTTGGAACGGTTTGGTCGGGGTTTTCGTTTTTTTTTGTAGAAGTGTAGGAGATGTTGTTTATTTATAAGGTTTTGGTTTATGCT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:18269:100874 163 rchrREF 93117 60 101M = 93187 171 ACCCTACAAAAACAACACAAATACAACCATAACTACAAACAACCTTACACACACCCTACTCACACTTCCACACATAAACCAAAACCTTATAAATAAACAAC BBBFFFFFFFFFFIIIIIIIIIIIIIIIIFIIIIFIIIIIFFIIIIIIFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:101 AS:i:101 XS:i:0 YS:Z:ACCCTACAAAAACAACACGAATACAACCGTAACTACGAACGACCTTACGCACGCCCTACTCGCACTTCCACGCATAAACCAAAACCTTATAAATAAACAAC YC:Z:GA HISEQ:105:C2UE1ACXX:3:2316:7508:101058 99 fchrREF 29997 60 101M = 30034 138 AGGATTTTTTGAGGTGGTTGGTTATAGGTGAGGGTATTTGGGGTTTTGTTTAGAATTTGGATTGTTTGATTTGTGAGTTTTTGTGGTTTTTTTGTTATTTT BBBFFFFFFFB<BBFFFFF00BFFIF<0BBFFBFFF<BF007<BBFF7BBFB0<BBFF70<BF<BBB7<BBF7<7B<<BBFF7B00<7BBFB'0<<BFFFF NM:i:1 MD:Z:92G8 AS:i:98 XS:i:23 YS:Z:AGGATTTTTTGAGGTGGTTGGTTATAGGCGAGGGTATTTGGGGTTTTGTTTAGAATTTGGATCGTTTGATTTGTGAGTTTTTGTGGTTTTTCTGTTATTTT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:7508:101058 147 fchrREF 30034 60 101M = 29997 -138 TTGGGGTTTTGTTTAGAATTTGGATTGTTTGATTTGTGAGTTTTTGTGGTTTTTTGGTTATTTTGATGTGAATATGTTAGGGGTTGAGTAGAAAGTAGTAT FB7000FBBBBBFFBBBBFFFFBBBFFFFFFBFFFFFFFFFFFIFBB7FBFFFFFFFFF<FFFFFIIFFFFFFFFFFF<IIIFFFIFFFFFFFFFFFFBBB NM:i:0 MD:Z:101 AS:i:101 XS:i:20 YS:Z:ATACTACTTTCTACTCAACCCCTAACATATTCACATCAAAATAACCGAAAAACCACAAAAACTCACAAATCAAACGATCCAAATTCTAAACAAAACCCCAA YC:Z:GA HISEQ:105:C2UE1ACXX:3:2316:18777:101069 83 rchrREF 62392 60 30S71M = 62393 -70 ACTAAAATTCAAACATATACTCTTCCAATCTAACTTCCTAATTAATCCCATTCCATAACAACTTCTACTACCTCTATCCCATTTCTCCCATAAAATTAATA <0BBBBBB<0BB70BB<<7'<0B70'FFB7FBB0FB7<FFFFBFBB7'BFB0'0FBFB'FFBFFBFBBFFB<B<FB<00<FBFFBB000FFFFFFFFFBBB NM:i:0 MD:Z:71 AS:i:71 XS:i:0 YS:Z:TATTAATTTTATGGGAGAAATGGGATAGAGGTAGTAGAAGCTGTTATGGAATGGGATTAATTAGGAAGTTAGATCGGAAGAGCACACGTCTGAACTCCAGT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:18777:101069 163 rchrREF 62393 60 70M31S = 62392 70 AACTTCCTAATTAATCCCATTCCATAACAACTTCTACTACCTCTATCCCATTTCTCCCATAAAATTAATAAAATCAAAAAAACATCATATAAAAAAAAAAT BBBFFFFF<FFFFIBFFIFIFIFBFFFFFFIFIFFBFFFBBFBBBFFIIIFI[main] CMD: bwa mem -T 40 -B 2 -L 10 -CM -U 100IFFFFFFFFFIIFFBB0<F<FBFBFIFBFBF<00777<B<77BBB#### NM:i:0 MD:Z:70 AS:i:70 XS:i:0 YS:Z:AACTTCCTAATTAATCCCATTCCATAACAGCTTCTACTACCTCTATCCCATTTCTCCCATAAAATTAATAAGATCGGAAGAGCGTCGTGTAGGGAAAGAGT YC:Z:GA -p ref.fa.bwameth.c2t </home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz [main] Real time: 10.564 sec; CPU: 8.744 sec 10.527u 0.212s 0:10.57 101.5% 0+0k 128+0io 1pf+0w
I wonder were that last line is coming from:
10.527u 0.212s 0:10.57 101.5% 0+0k 128+0io 1pf+0w
It seems to be coming from bwa mem and going to stdout.
Morning Brent,
That line is being generated by the csh environment 'set time=10' , in this instance it is being set by the system 'std.cshrc' . 'unset' ing it or running in Bash shell generates the same errors--with out this cpu timing line.
Morning. So you still see this:
self.flag = int(self.flag)
ValueError: invalid literal for int() with base 10: '0+0k 10760+0io 5pf+0w'
? Because that is trying to call int() on the timing data. Thanks for your persistence.
Yep, both in a Bash shell test and in C shell tests (w/ 'unset time').
ok, can you try this:
bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz' > t.sam
tail t.sam
and
/home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz > t.fastq
tail t.fastq
and show the full output from both? after this, I don't have much left for ideas. You could also try
rm ref.fa.bwa*
and reindexing...
[vogelw@mus1 example]% bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz' > t.sam converting reads in t_R1.fastq.gz,t_R2.fastq.gz [M::main_mem] read 92726 sequences (9365326 bp)... [M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 44942, 0, 0) [M::mem_pestat] skip orientation FF as there are not enough pairs [M::mem_pestat] analyzing insert size distribution for orientation FR... [M::mem_pestat](25, 50, 75) percentile: (109, 137, 175) [M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 307) [M::mem_pestat] mean and std.dev: (145.03, 48.64) [M::mem_pestat] low and high boundaries for proper pairs: (1, 373) [M::mem_pestat] skip orientation RF as there are not enough pairs [M::mem_pestat] skip orientation RR as there are not enough pairs [M::mem_process_seqs] Processed 92726 reads in 8.478 CPU sec, 8.522 real sec [main] Version: 0.7.8-r455 [main] CMD: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p ref.fa.bwameth.c2t </home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz [main] Real time: 11.256 sec; CPU: 8.737 sec [vogelw@mus1 example]% tail t.sam HISEQ:105:C2UE1ACXX:3:2316:20506:100449 99 fchrREF 61007 60 101M 61122 216 AATTTTTTTTTTAAGGGTGAGGTTTTTAGTTAGTTGTTGATAGTTAAGAATGGTAATTAATTTTTTGAAAGATATTGTTTGATTTTTGTTTTTATGTGGTT BBBFFFFFFFFFFF<<7BFFF7FBFFIF7BFF7BF0BB7<BF0<FFF<BFFB7BFFFFFFFFFFFF0<BBBBBBBB0<BB'<BBFFF'<BFFFBF0B007< NM:i:0 MD:Z:10AS:i:101 XS:i:0 YS:Z:AATTTTTTTTTTAAGGGTGAGGTTTTTAGTTAGTTGTTGATAGTTAAGAATGGTAATTAATTTTTTGAAAGATATTGTTTGATTTTTGTTTTTACGTGGTT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:20506:100449 147 fchrREF 61122 60 101M 61007 -216 TTGAGTTTTTATATTTATGATGTATTATTGTAGTTATTATGTTGGTAAAGGGTTTTTTGAGTTTTTTGTTATGTTGATTTGAGTTTTAGAGTATATATTTT BBBBFFFFFFFFBFFFFFFFFFFFFFFFBBFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFIFFFFFFFFFFBBB NM:i:0 MD:Z:10AS:i:101 XS:i:0 YS:Z:AAAATATATACTCTAAAACTCAAATCAACATAACAAAAAACTCAAAAAACCCTTTACCAACATAATAACTACAATAATACATCATAAATATAAAAACTCAA YC:Z:GA HISEQ:105:C2UE1ACXX:3:2316:11659:100715 99 fchrREF 27015 60 101M 27028 114 TATGGTATTGTGTTTTGTTTTTTTGGTATTTGTGAGGGTAGAATTGTTTTTGGGTTTTAATTTTTTTAAGTATGGGAGGTTGTGGGTATAAGGTGGATTTT <BB<<BFFFBFFFFFIBFFIIIIIB<BFFFI<B<FFFFIFFFFFF<FFFII777BBFFFFFFFFFFFBB<BBB7777<<B70<<<<BB<BB<0<BBBBBFF NM:i:0 MD:Z:10AS:i:101 XS:i:0 YS:Z:TATGGTATTGTGTTTTGTTTTTTTGGTATTTGTGAGGGTAGAATTGTTTTTGGGTTTTAATTTTTTTAAGTATGGGAGGTTGTGGGTATAAGGTGGATTTT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:11659:100715 147 fchrREF 27028 60 101M 27015 -114 TTTTTTTTTTTGGTATTTGTGAGGGTAGAATTGTTTTTGGGTTTTAATTTTTTTAAGTATGGGAGGTTGTGGGTATAAGGTGGATTTTTTTTTTTTTTATT B<<0B<BBBBB<<<'BB<0000BB<0''<070'<BBBBBBBBBBBBB<BBBBB<<B<<<<BB<0<B<<7<BB<777BB'<'B<BIIIIFFFFFFFFFFBBB NM:i:1 MD:Z:3G97 AS:i:98 XS:i:19 YS:Z:AATAAAAAAAAAAAAAATCCACCTTATACCCACAACCTCCCATACTTAAAAAAATTAAAACCCAAAAACAATTCTACCCTCACAAATACCAAAAAAAAAAA YC:Z:GA HISEQ:105:C2UE1ACXX:3:2316:18269:100874 83 rchrREF 93187 60 101M 93117 -171 AACATAAACCAAAACCTTATAAATAAACAACATCTCCTACACTTCTACAAAAAAAAACAAAAACCCCAACCAAACCATTCCAAAACAAAATTCCTAAAAAC #B<FBBB7''B<00''BBB0FBBBBB<0BB07BBBBBBB<<<B7777'FFFFFIIIFBFFFBIFFFBFB0BFFFBFFFB<<IIFF0IIFFFFBBFFFFBB< NM:i:1 MD:Z:0C100 AS:i:100 XS:i:0 YS:Z:GTTTTTAGGAATTTTGTTTTGGAACGGTTTGGTCGGGGTTTTCGTTTTTTTTTGTAGAAGTGTAGGAGATGTTGTTTATTTATAAGGTTTTGGTTTATGCT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:18269:100874 163 rchrREF 93117 60 101M 93187 171 ACCCTACAAAAACAACACAAATACAACCATAACTACAAACAACCTTACACACACCCTACTCACACTTCCACACATAAACCAAAACCTTATAAATAAACAAC BBBFFFFFFFFFFIIIIIIIIIIIIIIIIFIIIIFIIIIIFFIIIIIIFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:10AS:i:101 XS:i:0 YS:Z:ACCCTACAAAAACAACACGAATACAACCGTAACTACGAACGACCTTACGCACGCCCTACTCGCACTTCCACGCATAAACCAAAACCTTATAAATAAACAAC YC:Z:GA HISEQ:105:C2UE1ACXX:3:2316:7508:101058 99 fchrREF 29997 60 101M 30034 138 AGGATTTTTTGAGGTGGTTGGTTATAGGTGAGGGTATTTGGGGTTTTGTTTAGAATTTGGATTGTTTGATTTGTGAGTTTTTGTGGTTTTTTTGTTATTTT BBBFFFFFFFB<BBFFFFF00BFFIF<0BBFFBFFF<BF007<BBFF7BBFB0<BBFF70<BF<BBB7<BBF7<7B<<BBFF7B00<7BBFB'0<<BFFFF NM:i:1 MD:Z:92G8 AS:i:98 XS:i:23 YS:Z:AGGATTTTTTGAGGTGGTTGGTTATAGGCGAGGGTATTTGGGGTTTTGTTTAGAATTTGGATCGTTTGATTTGTGAGTTTTTGTGGTTTTTCTGTTATTTT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:7508:101058 147 fchrREF 30034 60 101M 29997 -138 TTGGGGTTTTGTTTAGAATTTGGATTGTTTGATTTGTGAGTTTTTGTGGTTTTTTGGTTATTTTGATGTGAATATGTTAGGGGTTGAGTAGAAAGTAGTAT FB7000FBBBBBFFBBBBFFFFBBBFFFFFFBFFFFFFFFFFFIFBB7FBFFFFFFFFF<FFFFFIIFFFFFFFFFFF<IIIFFFIFFFFFFFFFFFFBBB NM:i:0 MD:Z:10AS:i:101 XS:i:20 YS:Z:ATACTACTTTCTACTCAACCCCTAACATATTCACATCAAAATAACCGAAAAACCACAAAAACTCACAAATCAAACGATCCAAATTCTAAACAAAACCCCAA YC:Z:GA HISEQ:105:C2UE1ACXX:3:2316:18777:101069 83 rchrREF 62392 60 30S71M 62393 -70 ACTAAAATTCAAACATATACTCTTCCAATCTAACTTCCTAATTAATCCCATTCCATAACAACTTCTACTACCTCTATCCCATTTCTCCCATAAAATTAATA <0BBBBBB<0BB70BB<<7'<0B70'FFB7FBB0FB7<FFFFBFBB7'BFB0'0FBFB'FFBFFBFBBFFB<B<FB<00<FBFFBB000FFFFFFFFFBBB NM:i:0 MD:Z:71AS:i:71 XS:i:0 YS:Z:TATTAATTTTATGGGAGAAATGGGATAGAGGTAGTAGAAGCTGTTATGGAATGGGATTAATTAGGAAGTTAGATCGGAAGAGCACACGTCTGAACTCCAGT YC:Z:CT HISEQ:105:C2UE1ACXX:3:2316:18777:101069 163 rchrREF 62393 60 70M31S 62392 70 AACTTCCTAATTAATCCCATTCCATAACAACTTCTACTACCTCTATCCCATTTCTCCCATAAAATTAATAAAATCAAAAAAACATCATATAAAAAAAAAAT BBBFFFFF<FFFFIBFFIFIFIFBFFFFFFIFIFFBFFFBBFBBBFFIIIFIIFFFFFFFFFIIFFBB0<F<FBFBFIFBFBF<00777<B<77BBB#### NM:i:0 MD:Z:70AS:i:70 XS:i:0 YS:Z:AACTTCCTAATTAATCCCATTCCATAACAGCTTCTACTACCTCTATCCCATTTCTCCCATAAAATTAATAAGATCGGAAGAGCGTCGTGTAGGGAAAGAGT YC:Z:GA [vogelw@mus1 example]% /home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz > t.fastq converting reads in t_R1.fastq.gz,t_R2.fastq.gz [vogelw@mus1 example]% tail t.fastq + BBBFFFFFFFFFFFFIFFFIII<FFFFFFFFFFFIIFFFFF<FFFFFFFFFBF7BBFIFFFFFFFFFFFBFFFFFFBBBFFFFBBBBFFBBBBBF0007BF @HISEQ:105:C2UE1ACXX:3:2316:18777:101069 YS:Z:TATTAATTTTATGGGAGAAATGGGATAGAGGTAGTAGAAGCTGTTATGGAATGGGATTAATTAGGAAGTTAGATCGGAAGAGCACACGTCTGAACTCCAGT YC:Z:CT TATTAATTTTATGGGAGAAATGGGATAGAGGTAGTAGAAGTTGTTATGGAATGGGATTAATTAGGAAGTTAGATTGGAAGAGTATATGTTTGAATTTTAGT + BBBFFFFFFFFF000BBFFBF<00<BF<B<BFFBBFBFFBFF'BFBF0'0BFB'7BBFBFFFF<7BF0BBF7BFF'07B0<'7<<BB07BB0<BBBBBB0< @HISEQ:105:C2UE1ACXX:3:2316:18777:101069 YS:Z:AACTTCCTAATTAATCCCATTCCATAACAGCTTCTACTACCTCTATCCCATTTCTCCCATAAAATTAATAAGATCGGAAGAGCGTCGTGTAGGGAAAGAGT YC:Z:GA AACTTCCTAATTAATCCCATTCCATAACAACTTCTACTACCTCTATCCCATTTCTCCCATAAAATTAATAAAATCAAAAAAACATCATATAAAAAAAAAAT + BBBFFFFF<FFFFIBFFIFIFIFBFFFFFFIFIFFBFFFBBFBBBFFIIIFIIFFFFFFFFFIIFFBB0<F<FBFBFIFBFBF<00777<B<77BBB####
[vogelw@mus1 example]% bwameth.py index ref.fa
converting c2t in ref.fa to ref.fa.bwameth.c2t
indexing: ref.fa.bwameth.c2t
[bwa_index] Pack FASTA... 0.01 sec
[bwa_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=3600000, availableWord=4573648
[bwa_index] 0.71 seconds elapse.
[bwa_index] Update BWT... 0.01 sec
[bwa_index] Pack forward-only FASTA... 0.01 sec
[bwa_index] Construct SA from BWT and Occ... 0.19 sec
[main] Version: 0.7.8-r455
[main] CMD: bwa index -a bwtsw ref.fa.bwameth.c2t
[main] Real time: 1.046 sec; CPU: 0.939 sec
[vogelw@mus1 example]% bwameth.py --reference ref.fa t_R1.fastq.gz t_R2.fastq.gz -t 12
running: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:t_R SM:t_R' -t 12 ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz'
writing to:
samtools view -bS - | samtools sort - bwa-meth
converting reads in t_R1.fastq.gz,t_R2.fastq.gz
[M::main_mem] read 92726 sequences (9365326 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 44942, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (109, 137, 175)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 307)
[M::mem_pestat] mean and std.dev: (145.03, 48.64)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 373)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 92726 reads in 8.769 CPU sec, 0.748 real sec
[main] Version: 0.7.8-r455
[main] CMD: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R @RG ID:t_R SM:t_R -t 12 ref.fa.bwameth.c2t </home/pi/vogelw/bin/python /raid1/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz
[main] Real time: 8.309 sec; CPU: 8.933 sec
Traceback (most recent call last):
File "/home/pi/vogelw/bin/bwameth.py", line 5, in
will you try again with latest commit. I had another guess. (d0c660c)
Brent, To make sure there wasn't anything too badly miss-configured, I created a test environment as follows: w/ gcc 4.8.2 install location: $HOME/opt
Install: python 2.7.6; then install: setuptools-3.4.3, nose-1.3.1, toolshed-0.3.6, git clone: bwa-meth
Here are the results:
% bwameth.py --reference ref.fa t_R1.fastq.gz t_R2.fastq.gz -t 12
running: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:t_R SM:t_R' -t 12 ref.fa.bwameth.c2t '<//home/pi/vogelw/bin/python /raid1/home/pi/vogelw/opt/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz'
writing to:
samtools view -bS - | samtools sort - bwa-meth
converting reads in t_R1.fastq.gz,t_R2.fastq.gz
[M::main_mem] read 92726 sequences (9365326 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 44942, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat](25, 50, 75) percentile: (109, 137, 175)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 307)
[M::mem_pestat] mean and std.dev: (145.03, 48.64)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 373)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 92726 reads in 9.712 CPU sec, 0.850 real sec
[main] Version: 0.7.8-r455
[main] CMD: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R @RG ID:t_R SM:t_R -t 12 ref.fa.bwameth.c2t <//home/pi/vogelw/bin/python /raid1/home/pi/vogelw/opt/lib/python2.7/site-packages/bwameth-0.10-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz
[main] Real time: 8.727 sec; CPU: 10.080 sec
Traceback (most recent call last):
File "//home/pi/vogelw/bin/bwameth.py", line 5, in
@vogelwk maybe the recent changes to bwa-meth and toolshed (be sure to update to latest from pypi) will address this.
index worked, align failed see below. python 2.7.6, Centos 6.2, setuptools 3.3,
note: lib/python2.7/site-packages/ contains only a single file: bwameth-0.09-py2.7.egg
It does not contain a directory so named, etc: bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py
ERROR Report:
bwameth.py --reference ref.fa t_R1.fastq.gz t_R2.fastq.gz -t 12 running: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:t_R SM:t_R' -t 12 ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz' writing to: samtools view -bS - | samtools sort - bwa-meth /home/pi/vogelw/bin/python: can't find 'main' module in '/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py' [gzclose] buffer error cmd was:bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:t_R SM:t_R' -t 12 ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz' Traceback (most recent call last): File "/home/pi/vogelw/bin/bwameth.py", line 5, in
pkg_resources.run_script('bwameth==0.09', 'bwameth.py')
File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 528, in run_script
File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 1401, in run_script
File "/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 582, in
File "/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 579, in main
File "/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 258, in bwa_mem
File "/home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 286, in as_bam
File "/home/pi/vogelw/lib/python2.7/site-packages/toolshed-0.3.6-py2.7.egg/toolshed/files.py", line 252, in reader for toks in line_gen: File "/home/pi/vogelw/lib/python2.7/site-packages/toolshed-0.3.6-py2.7.egg/toolshed/files.py", line 53, in process_iter raise ProcessException(cmd) toolshed.files.ProcessException: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:t_R SM:t_R' -t 12 ref.fa.bwameth.c2t '</home/pi/vogelw/bin/python /home/pi/vogelw/lib/python2.7/site-packages/bwameth-0.09-py2.7.egg/EGG-INFO/scripts/bwameth.py c2t t_R1.fastq.gz t_R2.fastq.gz'