lindenb / jvarkit

Java utilities for Bioinformatics
https://jvarkit.readthedocs.io/

Blast2Sam error #53

Closed xx152 closed 8 years ago

xx152 commented 8 years ago

Hi,

I am using blast2sam from jvarkit. It worked fine at first, but then I started to get an error message.

This is my command:

java -jar ../jvarkit/dist/blast2sam.jar -r reference.fa -o blast_output.sam blast_output.xml

[INFO/BlastToSam] 2016-05-23 17:57:36 "Starting JOB at Mon May 23 17:57:36 CEST 2016 com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam version=265b0d11a280ad1458038fbd838a7a866952facf built=2016-04-22:11-04-17"
[INFO/BlastToSam] 2016-05-23 17:57:36 "Command Line args : -r reference.fa -o blast_output.sam blast_output.xml"
[INFO/BlastToSam] 2016-05-23 17:57:36 "Executing as nkamal@zorin.CeBiTec.Uni-Bielefeld.DE on Linux 3.19.8-100.fc20.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_77-b03"
[INFO/BlastToSam] 2016-05-23 17:57:36 "opening reference.fa"
[INFO/BlastToSam] 2016-05-23 17:57:36 "Reading from blast_output.xml"
[INFO/BlastToSam] 2016-05-23 17:57:36 "resolveEntity:-//NCBI//NCBI BlastOutput/EN/http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd/null"
[SEVERE/BlastToSam] 2016-05-23 17:57:38 "null"
java.lang.IllegalStateException
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.convertIterationToSequenceIteration(BlastToSam.java:288)
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.run_single(BlastToSam.java:239)
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.doWork(BlastToSam.java:739)
    at com.github.lindenb.jvarkit.util.AbstractCommandLineProgram.instanceMain(AbstractCommandLineProgram.java:501)
    at com.github.lindenb.jvarkit.util.AbstractCommandLineProgram.instanceMainWithExit(AbstractCommandLineProgram.java:515)
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.main(BlastToSam.java:761)
[INFO/BlastToSam] 2016-05-23 17:57:38 "End JOB status=-1 [Mon May 23 17:57:38 CEST 2016] com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam done. Elapsed time: 0.04 minutes."
[SEVERE/BlastToSam] 2016-05-23 17:57:38 "##### ERROR: return status = -1################"

Then I reinstalled it.

java -jar jvarkit/dist/blast2sam.jar -r reference.fa -o blast_output.sam blast_output.xml

[main] INFO jvarkit - Starting JOB at Mon May 23 17:54:20 CEST 2016 com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam version=ea11a24eac02ecb6ad28cadeefb035ae076e5a9d built=2016-05-23:16-05-50
[main] INFO jvarkit - Command Line args : -r reference.fa -o blast_output.sam blast_output.xml
[main] INFO jvarkit - Executing as nkamal@zorin.CeBiTec.Uni-Bielefeld.DE on Linux 3.19.8-100.fc20.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_77-b03
java.lang.RuntimeException: Option r was not defined ot dictionary missing
    at com.github.lindenb.jvarkit.util.command.Command.wrapException(Command.java:262)
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.call(BlastToSam.java:644)
    at com.github.lindenb.jvarkit.tools.blast2sam.AbstractBlastToSam.call(AbstractBlastToSam.java:494)
    at com.github.lindenb.jvarkit.tools.blast2sam.AbstractBlastToSam.call(AbstractBlastToSam.java:32)
    at com.github.lindenb.jvarkit.util.command.Command.instanceMainWithExceptions(Command.java:549)
    at com.github.lindenb.jvarkit.util.command.Command.instanceMain(Command.java:586)
    at com.github.lindenb.jvarkit.util.command.Command.instanceMainWithExit(Command.java:592)
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.main(BlastToSam.java:739)
[main] ERROR jvarkit - Option r was not defined ot dictionary missing
[main] ERROR jvarkit - Command failed

It says option r is missing, but it is not. I also have a sequence dictionary and an index, created using Picard tools and samtools faidx, and they are fine. My XML output is fine too. I am using Java 1.8. The blast2sam tool did work before, but now it doesn't anymore. I'm not a Java programmer and can't work out what is wrong. I would very much appreciate some help. Many thanks!!

Nadia

lindenb commented 8 years ago

From your working directory, what is the output of the following commands, please:

$ file reference.fa
$ file reference.fa.faidx
$ file reference.dict

xx152 commented 8 years ago

reference.fa: ASCII text, with very long lines
reference.fa.faidx: ASCII text
reference.dict: ASCII text

lindenb commented 8 years ago

Got it: a negation was missing in my new version.

https://github.com/lindenb/jvarkit/commit/0948060adc5d72d53f29e1adc3fd77dc7979f661

Can you test it, please?

xx152 commented 8 years ago

Thanks a lot.

It now looks like this:

java -jar jvarkit/dist/blast2sam.jar -r reference.fa -o out.sam blast_output.xml

[main] INFO jvarkit - Starting JOB at Mon May 23 19:45:57 CEST 2016 com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam version=0948060adc5d72d53f29e1adc3fd77dc7979f661 built=2016-05-23:19-05-16
[main] INFO jvarkit - Command Line args : -r reference.fa -o out.sam blast_output.xml
[main] INFO jvarkit - Executing as nkamal@zorin.CeBiTec.Uni-Bielefeld.DE on Linux 3.19.8-100.fc20.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_77-b03
[main] INFO jvarkit - opening reference.fa
[main] INFO jvarkit - Reading from blast_output.xml
[main] INFO jvarkit - resolveEntity:-//NCBI//NCBI BlastOutput/EN/http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd/null
[main] INFO jvarkit - Saving sam/bam to out.sam
java.lang.IllegalStateException
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.convertIterationToSequenceIteration(BlastToSam.java:307)
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.run_single(BlastToSam.java:260)
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.call(BlastToSam.java:711)
    at com.github.lindenb.jvarkit.tools.blast2sam.AbstractBlastToSam.call(AbstractBlastToSam.java:494)
    at com.github.lindenb.jvarkit.tools.blast2sam.AbstractBlastToSam.call(AbstractBlastToSam.java:32)
    at com.github.lindenb.jvarkit.util.command.Command.instanceMainWithExceptions(Command.java:549)
    at com.github.lindenb.jvarkit.util.command.Command.instanceMain(Command.java:586)
    at com.github.lindenb.jvarkit.util.command.Command.instanceMainWithExit(Command.java:592)
    at com.github.lindenb.jvarkit.tools.blast2sam.BlastToSam.main(BlastToSam.java:731)
[main] ERROR jvarkit - null
[main] ERROR jvarkit - Command failed

lindenb commented 8 years ago

How was blast_output.xml generated, please? Which version of blast? I have seen some programs fail because an old version of blast was used.

lindenb commented 8 years ago

BTW did you have a look at this tool https://github.com/guyduche/Blast2Bam ?

lindenb commented 8 years ago

For reference, people using the 'old' blastall : https://github.com/lindenb/jvarkit/issues/51#issuecomment-216925919

xx152 commented 8 years ago

I am using blast 2.3.0. This is my command:

blastn -query all_haplotypes.filtered.fasta -db reference.fa -outfmt 5 -num_alignments 1 -dust 'yes' -max_hsps 1 -parse_deflines -out blast_output.xml

I did try the other tool too, actually. This was the command:

Blast2Bam-master/bin/blast2bam -o out.bam blast_output.xml reference.dict all_haplotypes.filtered.fasta

It told me:

[blastSam.c:371]:Error while printing the Sam header

So I gave it a -R option:

Blast2Bam-master/bin/blast2bam -R '@RG\tID:SM' -o out.bam blast_output.xml reference.dict all_haplotypes.filtered.fasta

And I got:

[blastSam.c:371]:Error while printing the Sam header

*** Error in `/homes/nkamal/Tools/Blast2Bam-master/bin/blast2bam': munmap_chunk(): invalid pointer: 0x00007fff7746a45b ***

======= Backtrace: =========
/lib64/libc.so.6[0x393f275a4f]
/lib64/libc.so.6[0x393f27b8a7]
/homes/nkamal/Tools/Blast2Bam-master/bin/blast2bam[0x401533]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x393f221d65]
/homes/nkamal/Tools/Blast2Bam-master/bin/blast2bam[0x40160d]
======= Memory map: ========
00400000-00408000 r-xp 00000000 00:47 148010 /homes/nkamal/Tools/Blast2Bam-master/bin/blast2bam
00607000-00608000 r--p 00007000 00:47 148010 /homes/nkamal/Tools/Blast2Bam-master/bin/blast2bam
00608000-00609000 rw-p 00008000 00:47 148010 /homes/nkamal/Tools/Blast2Bam-master/bin/blast2bam
00beb000-00c0c000 rw-p 00000000 00:00 0 [heap]
393ea00000-393ea20000 r-xp 00000000 fd:01 526170 /usr/lib64/ld-2.18.so
393ec1f000-393ec20000 r--p 0001f000 fd:01 526170 /usr/lib64/ld-2.18.so
393ec20000-393ec21000 rw-p 00020000 fd:01 526170 /usr/lib64/ld-2.18.so
393ec21000-393ec22000 rw-p 00000000 00:00 0
393ee00000-393ee18000 r-xp 00000000 fd:01 526269 /usr/lib64/libpthread-2.18.so
. . . . . . .
393f5b7000-393f5b9000 rw-p 001b7000 fd:01 526171 /usr/lib64/libc-2.18.so
393f5b9000-393f5be000 rw-p 00000000 00:00 0
393f600000-393f603000 r-xp 00000000 fd:01 526268 /usr/lib64/libdl-2.18.so
393f603000-393f802000 ---p 00003000 fd:01 526268 /usr/lib64/libdl-2.18.so
393f802000-393f803000 r--p 00002000 fd:01 526268 /usr/lib64/libdl-2.18.so
393f803000-393f804000 rw-p 00003000 fd:01 526268 /usr/lib64/libdl-2.18.so
393fa00000-393fb05000 r-xp 00000000 fd:01 527804 /usr/lib64/libm-2.18.so
393fb394a35e000-394a55d000 ---p 0015e000 fd:01 527842 /usr/lib64/libxml2.so.2.9.1
394a55d000-394a565000 r--p 0015d000 fd:01 527842 /usr/lib64/libxml2.so.2.9.1
394a565000-394a567000 rw-p 00165000 fd:01 527842 /usr/lib64/libxml2.so.2.9.1
394a567000-394a569000 rw-p 00000000 00:00 0
7f81300b4000-7f81300ba000 rw-p 00000000 00:00 0
7f81300c1000-7f81300c9000 rw-p 00000000 00:00 0
7f81300d0000-7f81300d2000 rw-p 00000000 00:00 0
7fff7744a000-7fff7746b000 rw-p 00000000 00:00 0 [stack]
7fff77533000-7fff77535000 r--p 00000000 00:00 0 [vvar]
7fff77535000-7fff77537000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted

lindenb commented 8 years ago

FYI: I won't help you with Blast2Bam-master/bin/blast2bam; it's not my personal project.

lindenb commented 8 years ago

I've added a few 'logs' to my program; it may help to see those messages. https://github.com/lindenb/jvarkit/commit/f57c7cf2fa058a0be67081c03acce24c2a5cf00f

xx152 commented 8 years ago

Sure, sorry.

OK, there seems to be a problem with the input here. I'll try to figure it out. Thanks a lot for your help!

lindenb commented 8 years ago

You can always test your blast_output.xml using:

xmllint --noout --valid blast_output.xml

xx152 commented 8 years ago

Great, thanks. It seems to be valid, though.

lindenb commented 8 years ago

I haven't used my blast2sam much. It could always be an error on my side, but without the XML it will be hard to debug.

xx152 commented 8 years ago

you are welcome to use them if you like.

(files deleted)

lindenb commented 8 years ago

Here is the problem: all your reads are named the same way:

  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>
  <Iteration_query-def>No definition line</Iteration_query-def>

The program assumes that queries with the same def are the very same read. It crashed because the read content was not the same from one '<Iteration>' to another. The solution would be to rename the reads before aligning with blast.
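
A quick way to spot this situation before running blast2sam is to count how often each query definition occurs in the BLAST XML. The following is only a minimal sketch, assuming Python 3 and its standard library; the function name `duplicated_query_defs` and the inline sample are hypothetical and are not part of jvarkit. It only counts repeated `<Iteration_query-def>` values and does not compare the read sequences themselves.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def duplicated_query_defs(xml_source):
    """Return {query_def: count} for each <Iteration_query-def> seen more than once.

    xml_source may be a file name or a binary file object (hypothetical helper).
    """
    counts = Counter()
    # Stream through the document so that large BLAST outputs do not need
    # to be held in memory all at once.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "Iteration_query-def":
            counts[(elem.text or "").strip()] += 1
        elif elem.tag == "Iteration":
            # The iteration has been fully read; drop its children.
            elem.clear()
    return {name: n for name, n in counts.items() if n > 1}


# Tiny made-up sample that mimics the structure quoted above.
SAMPLE = b"""<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>No definition line</Iteration_query-def>
    </Iteration>
    <Iteration>
      <Iteration_query-def>No definition line</Iteration_query-def>
    </Iteration>
    <Iteration>
      <Iteration_query-def>read3</Iteration_query-def>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""

for name, n in duplicated_query_defs(io.BytesIO(SAMPLE)).items():
    print(f"{n} iterations share the query def {name!r}")
```

On a real file one would pass the path instead, for example `duplicated_query_defs("blast_output.xml")`. An empty result means every query def is unique.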

xx152 commented 8 years ago

right! Thanks so much for your help! The problem was that I used the blast option "-parse_deflines". The original reads do have unique names. Now it all works perfectly. Thanks a lot!
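
For cases where the query FASTA really does contain repeated names, the renaming suggested above could be sketched as follows. This is only an illustration, assuming Python 3; the helper `make_headers_unique` and its `_2`, `_3`, ... suffix scheme are hypothetical choices, not something jvarkit or blast provides.

```python
def make_headers_unique(lines):
    """Yield FASTA lines in which the first word of every header is unique.

    A repeated name gets a numeric suffix (_2, _3, ...); sequence lines
    and any description after the name are passed through unchanged.
    """
    used = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith(">"):
            yield line
            continue
        fields = line[1:].split(None, 1)
        base = fields[0] if fields else "read"  # fallback for an empty header
        name, n = base, 1
        while name in used:
            n += 1
            name = f"{base}_{n}"
        used.add(name)
        rest = f" {fields[1]}" if len(fields) > 1 else ""
        yield f">{name}{rest}"


# Made-up example input with one repeated name.
example = [">hap1 sample A\n", "ACGT\n", ">hap1\n", "TTTT\n", ">hap2\n", "GG\n"]
for out_line in make_headers_unique(example):
    print(out_line)
```

The renamed records can then be written to a new FASTA file and used as the blastn query in place of the original.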