Closed superbobry closed 10 years ago
Actually, this anomaly is related to the behaviour of nopen
, discused in #7.
can you update to the latest version of toolshed and let me know if you still see it?
I've update to the latest toolshed 0.3.7. Now I have a different trace, even more like #7:
$ bwameth.py --reference /mnt/stripe/b
wa-indexes/hg18.fa --threads 15 --prefix srx_tmp_hg18 ./SRR019600.fastq
running: bwa mem -T 40 -B 2 -L 10 -CM -R '@RG ID:bm SM:bm' -t 15 /mnt/stripe/bwa-indexes/hg18.fa
.bwameth.c2t '</home/user/.virtualenvs/env/bin/python /home/user/.virtualenvs/env/src/bwameth-head/bw
ameth.py c2t ./SRR019600.fastq NA'
writing to:
samtools view -bS - | samtools sort - srx_tmp_hg18
converting reads in ./SRR019600.fastq,NA
WARNING: running bwameth in single-end mode
[M::main_mem] read 1973686 sequences (150000136 bp)...
[M::mem_process_seqs] Processed 1973686 reads in 1042.692 CPU sec, 69.759 real sec
Traceback (most recent call last):
File "/home/user/.virtualenvs/env/bin/bwameth.py", line 10, in <module>
execfile(__file__)
File "/home/user/.virtualenvs/env/src/bwameth-head/bwameth.py", line 596, in <module>
main(sys.argv[1:])
File "/home/user/.virtualenvs/env/src/bwameth-head/bwameth.py", line 581, in main
set_as_failed=args.set_as_failed)
File "/home/user/.virtualenvs/env/src/bwameth-head/bwameth.py", line 259, in bwa_mem
as_bam(cmd, fa, prefix, calmd, set_as_failed)
File "/home/user/.virtualenvs/env/src/bwameth-head/bwameth.py", line 296, in as_bam
out.write(str(aln) + '\n')
IOError: [Errno 32] Broken pipe
[fputs] Broken pipe
Traceback (most recent call last):
File "/home/user/.virtualenvs/env/src/bwameth-head/bwameth.py", line 596, in <module>
main(sys.argv[1:])
File "/home/user/.virtualenvs/env/src/bwameth-head/bwameth.py", line 546, in main
sys.exit(convert_reads(args[1], args[2]))
File "/home/user/.virtualenvs/env/src/bwameth-head/bwameth.py", line 102, in convert_reads
out.write("".join((name, seq, "\n+\n", qual)))
IOError: [Errno 32] Broken pipe
Is there a way to debug this?
please see the latest commit.
Thank, but surprisingly the output remained the same. Probably because the error actually happens in the out.write
part. Any hint on how should I check my reads?
maybe if you check the partial bam output, you can see about where it errored and then send a subset of reads to bwa-meth.
What does
tail ./SRR019600.fastq
show?
I downloaded that fastq and am able to reproduce what you are seeing. Will post fix shortly.
Bellow is the tail
output. You probably have it already:
$ tail SRR019600.fastq
+SRR019600.7351165 EVANSGA_9:1:100:1792:1500 length=76
:0IIIIIII3E4I6IIIIIIII=III46IIIIII*IIIIIIIIII5II6II:IIIIIIF4A3DI>I%.;I$G=#0@
@SRR019600.7351166 EVANSGA_9:1:100:1792:1526 length=76
AGTTAATGGTAGAATTGAGATTAGAATTAATGATTAAGTTAGTGTAATAAATTGGTAATAGAGGATAAGTAAAATT
+SRR019600.7351166 EVANSGA_9:1:100:1792:1526 length=76
(I(IIIIIIBDIEIIIIIIIIIIIIIBBIHII=<?III+IIF1IIIIBIDGIIIH&?GII9III:@50<,G162>B
@SRR019600.7351167 EVANSGA_9:1:100:1792:1537 length=76
TGGGGAATAAAAGATTTTCATATTACGTTTTTTTGTATGTGATGTTTGTATGTAGAATTTATGGATTATTTTTTAT
+SRR019600.7351167 EVANSGA_9:1:100:1792:1537 length=76
IIIIIIIII;6GIH>IIII>IIIII=@7?IIIDFIDIIIDH0II.EII9I>I%BI61IIIGIII-3C6I3D7I6+;
The resulting BAM file seem to have only the header and no alignments.
this was occuring because I was using the csv module in reader which iterates over the SAM output. If the quality field started with ", it would mess things up. It requires an update to toolshed as well.
It did help, thank you!
Hmm, there're no exceptions but the whole process seems to be hanging after a certain point. Here's the log:
$ bwameth.py --reference /mnt/stripe/bwa-indexes/hg18.fa --threads 15 --prefix srx_tmp_hg18 ./SRR019600.fastq
running: bwa mem -T 40 -B 2 -L 10 -CM -R '@RG ID:bm SM:bm' -t 15 /mnt/stripe/bwa-indexes/hg18.fa.bwameth.c2t '</home/user/.virtualenvs/env/bin/python /home/user/.virtualenvs/env/src/bwameth-head/bwam
eth.py c2t ./SRR019600.fastq NA'
writing to:
samtools view -bS - | samtools sort - srx_tmp_hg18
converting reads in ./SRR019600.fastq,NA
WARNING: running bwameth in single-end mode
converting reads in ./SRR019600.fastq,NA
WARNING: running bwameth in single-end mode
[M::main_mem] read 1973686 sequences (150000136 bp)...
[M::main_mem] read 1973686 sequences (150000136 bp)...
[M::mem_process_seqs] Processed 1973686 reads in 1255.657 CPU sec, 117.232 real sec
[M::mem_process_seqs] Processed 1973686 reads in 1251.457 CPU sec, 119.854 real sec
[M::main_mem] read 1973686 sequences (150000136 bp)...
[M::mem_process_seqs] Processed 1973686 reads in 1099.928 CPU sec, 73.542 real sec
[M::main_mem] read 1973686 sequences (150000136 bp)...
[M::mem_process_seqs] Processed 1973686 reads in 1027.303 CPU sec, 68.685 real sec
WARNING: 7351167 reads with length < 80
: this program is designed for long reads
[M::main_mem] read 1430109 sequences (108688284 bp)...
[M::mem_process_seqs] Processed 1430109 reads in 763.489 CPU sec, 51.065 real sec
[main] Version: 0.7.9a-r786
[main] CMD: bwa mem -T 40 -B 2 -L 10 -CM -R @RG ID:bm SM:bm -t 15 /mnt/stripe/bwa-indexes/hg18.fa.bwameth.c2t </home/user/.virtualenvs/env/bin/python /home/user/.virtualenvs/env/src/bwameth-head/bwameth
.py c2t ./SRR019600.fastq NA
[main] Real time: 894.556 sec; CPU: 4167.274 sec
Also, duplicated messages look suspicious.
Update: it seems like bwa-meth
is starting two bwa processes:
$ ps aux | grep bwa
user 17692 0.0 0.0 202924 10992 pts/10 S+ 11:07 0:00 /home/user/.virtualenvs/env/bin/python /home/user/.virtualenvs/env/bin/bwameth.py --reference /mnt/stripe/bwa-indexes/hg18.fa --threads 15 --prefix srx_tmp_hg18 ./SRR019600.fastq
user 17694 900 7.4 15387576 14726120 pts/10 Sl+ 11:07 19:03 bwa mem -T 40 -B 2 -L 10 -CM -R @RG?ID:bm?SM:bm -t 15 /mnt/stripe/bwa-indexes/hg18.fa.bwameth.c2t </home/user/.virtualenvs/env/bin/python /home/user/.virtualenvs/env/src/bwameth-head/bwameth.py c2t ./SRR019600.fastq NA
user 17697 909 7.4 15387576 14767156 pts/10 Sl+ 11:07 19:15 bwa mem -T 40 -B 2 -L 10 -CM -R @RG?ID:bm?SM:bm -t 15 /mnt/stripe/bwa-indexes/hg18.fa.bwameth.c2t </home/user/.virtualenvs/env/bin/python /home/user/.virtualenvs/env/src/bwameth-head/bwameth.py c2t ./SRR019600.fastq NA
user 17698 9.6 0.0 200508 10692 pts/10 S+ 11:07 0:11 /home/user/.virtualenvs/env/bin/python /home/user/.virtualenvs/env/src/bwameth-head/bwameth.py c2t ./SRR019600.fastq NA
user 17699 9.9 0.0 200504 10692 pts/10 S+ 11:07 0:12 /home/user/.virtualenvs/env/bin/python /home/user/.virtualenvs/env/src/bwameth-head/bwameth.py c2t ./SRR019600.fastq NA
and after some time one of the bwas become a zombie:
user 17697 288 0.0 0 0 pts/10 Z+ 11:07 69:29 [bwa] <defunct>
Hello, can you reproduce the zombie issue?
I think the problem is in this line, which silently eats the exception from the csv.reader
and performs a second nopen
on the fname
.
Update: and here's the exception:
TypeError('quotechar must be set if quoting enabled',)
So, you should use dialect.quoting = csv.QUOTE_NONE
instead of dialect.quotechar = None
.
ah, you might be right. i am able to reproduce. but the .bam and .bai are still created. i will look into it.
On Mon, Jun 16, 2014 at 3:05 AM, Sergei Lebedev notifications@github.com wrote:
I think the problem is in this https://github.com/brentp/toolshed/blob/master/toolshed/files.py#L218 line, which silently eats the exception from the csv.reader and performs a second nopen on the fname.
— Reply to this email directly or view it on GitHub https://github.com/brentp/bwa-meth/issues/6#issuecomment-46155858.
I think I am still running into this problem (I updated both bwa-meth and toolshed to the latest github versions).
For some FASTQ pairs the process hangs without an error after some time. I noticed that it typically occurs after a new BAM temp file (i.e 0001) is created.
I have not been able to identify the what the problem with the FASTQ files is (although commas are present in quality scores of files that succeed and not in files that fail. The error is fully reproducible.
for aln in handle_reads(pair_list, set_as_failed):
out.write(str(aln) + '\n')
<<seems to hang here>>>
@mcieslik-mctp can you post files to reproduce?
Hello, the
master
version seems to be broken after 1eaa967de500f12b617847490f248c439f561093.Can you please take a look at
bwameth.py:291
?