bioinformatics-centre / bayesembler

A Bayesian method for doing transcriptome assembly from RNA-seq data
MIT License

bayesembler running errors #3

Open lmdu opened 9 years ago

lmdu commented 9 years ago

We recently used Bayesembler to assemble a transcriptome and ran into two errors. First, the samtools binary in the dependencies folder does not work, although a rebuilt samtools does. Second, Bayesembler aborts with the following assertion failure:

bayesembler: /seqdata/krogh/jola/projects/transcriptome_assembly/code/release/bayesembler_1_1_1_pubrelease/src/assembler.cpp:894: void Assembler::graphConstructorCallback(std::string, std::list<GraphInfo>*, std::string, boost::mutex*, int*, int*): Assertion `current_ref_strand == "."' failed.

jonassibbesen commented 9 years ago

Dear Mencent

Thanks for posting.

With regard to your first issue, it sounds like the pre-compiled version of Samtools that is bundled with the binary is not compatible with the version of Linux you are running. Which version are you running?

As for the second issue, I can't see off the top of my head how this assertion could be triggered. It is checked when the program parses the output of CEM, the program we currently use to build the splice-graphs. Could you provide the command line you used to run the Bayesembler together with the output from CEM? You can find the latter by looking for files with the extension .instance in the directory where you ran the Bayesembler.

Moreover, from the assertion it looks like you are running on unstranded data. If so, you should know that we are working on a new version that will handle unstranded data in a more sensitive manner. We plan to release it in the coming weeks.

Best wishes,

Jonas

nat2bee commented 9 years ago

Hi, I had a similar problem. This is the command I gave:

bayesembler -p 30 -b accepted_hits.bam

This is the complete output but I could not find any file with the ".instance" extension:

You are using the Bayesembler v1.1.1. For more information go to bayesembler.binf.ku.dk

[06/05/2015 11:21:06] Removing duplicate reads
[bam_index_core] truncated file? Continue anyway. (-3)
[06/05/2015 12:16:14] Removed duplicates from 81316787 mapped read pairs
[06/05/2015 12:16:14] Wrote 65025620 read pairs used for splice-graph construction

[06/05/2015 12:16:14] Spawning graph construction thread
[06/05/2015 12:16:14] Generating splice-graphs from accepted_hits_nd_unstranded.bam using cem
[main_samview] truncated file.
bayesembler: /seqdata/krogh/jola/projects/transcriptome_assembly/code/release/bayesembler_1_1_1_pubrelease/src/assembler.cpp:672: void Assembler::graphConstructorCallback(std::string, std::list<GraphInfo>*, std::string, boost::mutex*, int*, int*): Assertion `system(sam_system_stream_string.c_str()) == 0' failed.

lassemaretty commented 9 years ago

Hi,

Thank you for posting. It sounds like an issue with samtools. Ensure that you have a working installation of samtools and try

export SAMTOOLS_PATH=/your/path/to/samtools

Please let me know if this does not fix your problem.

/Lasse

nat2bee commented 9 years ago

Hello Lasse,

This time I got a different error:

You are using the Bayesembler v1.1.1. For more information go to bayesembler.binf.ku.dk

[07/05/2015 12:17:12] Removing duplicate reads
nice: /home/nsa/scratch/local/programs/samtools: Permission denied
bayesembler: /seqdata/krogh/jola/projects/transcriptome_assembly/code/release/bayesembler_1_1_1_pubrelease/src/assembler.cpp:266: void Assembler::generateBamIndex(std::string): Assertion `system(idx_system_string.c_str()) == 0' failed.

But I am not sure why I got a permission problem if the samtools folder has been created and installed by my user (with no root privileges).

lassemaretty commented 9 years ago

Hi nat2bee,

I think that you lack executable permission to samtools. Since indexing the deduplicated BAM file generated by the Bayesembler is the first call to samtools, any issue with samtools availability should surface at this stage, as it does in your last post. This also means that my last post was a bit rushed: setting the samtools environment variable likely won't fix your original problem, which occurs in the second samtools call, the one that converts the deduplicated BAM to SAM format (as required by the splice-graph assembler program CEM). So maybe just do

unset SAMTOOLS_PATH

to get us back to the previous problem.

I'm not exactly sure what is then causing the problem, but it may be related to the truncated file warnings. Try converting the temporary deduplicated BAM file generated by the Bayesembler to SAM format (please use the samtools 0.1.19 binary shipped with the Bayesembler binary)

samtools_0.1.19 view your_prefix_nd_unstranded.bam > your_prefix_nd_unstranded.sam

and your original bamfile

samtools_0.1.19 view your_prefix.bam > your_prefix.sam

and let us know how it goes. If that call crashes on any of your files, it may also be worth a long shot: download the most recent version of samtools and give that a go.
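The two conversions above can be scripted so that a failing file is reported explicitly. This is only a sketch: `try_view` is a hypothetical helper, the `SAMTOOLS` variable is an assumption for choosing the binary, and it assumes `samtools view` exits with a non-zero status when it hits a truncated file, as the failed assertion in the earlier log suggests.

```shell
# Sketch only: try the BAM-to-SAM conversion on each file and report failures.
# SAMTOOLS is a hypothetical variable; it defaults to the bundled binary name
# used in this thread.
SAMTOOLS=${SAMTOOLS:-samtools_0.1.19}

# Convert one BAM file to SAM next to it and print whether the call succeeded.
try_view() {
    bam=$1
    sam=${bam%.bam}.sam
    if "$SAMTOOLS" view "$bam" > "$sam"; then
        echo "ok: $bam"
    else
        echo "FAILED: $bam"
        return 1
    fi
}

# Check every file given on the command line.
for bam in "$@"; do
    try_view "$bam"
done
```

Saved under a name of your choice, for example check_bams.sh, it could be run as `sh check_bams.sh your_prefix_nd_unstranded.bam your_prefix.bam`.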

Alternatively, you can make the bam file (or a sample of it) available to us and we can try to reproduce/resolve the issue on our system.

/Lasse

TomSmithCGAT commented 9 years ago

Hi lassemaretty,

I'm getting a similar error.

I downloaded the static binaries and when I run bayesembler I get the following error message:

bam_nd_pe_plus_file_nametestCLL006D-S3-1-L001tophat2_nd_plus.bam
[28/07/2015 15:45:42] Removing duplicate reads
/ifs/apps/bio/bayesembler-1.2.0/dependencies/samtools_0.1.19/samtools: /lib64/libz.so.1: version `ZLIB_1.2.3.3' not found (required by /ifs/apps/bio/bayesembler-1.2.0/dependencies/samtools_0.1.19/samtools)
bayesembler: /seqdata/krogh/jola/projects/transcriptome_assembly/code/release/bayesembler_1_2_0/src/assembler.cpp:266: void Assembler::generateBamIndex(std::string): Assertion `system(idx_system_string.c_str()) == 0' failed.
Aborted

So I tried the export command you suggested:

export SAMTOOLS_PATH=/ifs/apps/bio/samtools-0.1.19/bin/

and got the following error message:

[28/07/2015 15:46:31] Removing duplicate reads
nice: /ifs/apps/bio/samtools-0.1.19/bin/: Permission denied
bayesembler: /seqdata/krogh/jola/projects/transcriptome_assembly/code/release/bayesembler_1_2_0/src/assembler.cpp:266: void Assembler::generateBamIndex(std::string): Assertion `system(idx_system_string.c_str()) == 0' failed.
Aborted

I also tried directly replacing the samtools executable in the dependencies folder with a soft link to the executable and I get the same error message about permission.

I have permission to execute, so I'm a little confused about what the error message means.

lt /ifs/apps/bio/samtools-1.2/bin/

-rwxr-xr-x 1 andreas usersfgu 2.8M Apr 29 10:27 samtools

Tom

lassemaretty commented 9 years ago

Hi Tom,

Thank you for posting. I'm sorry about your troubles. Try appending samtools to the path, so that SAMTOOLS_PATH points to the executable itself rather than to its directory:

export SAMTOOLS_PATH=/ifs/apps/bio/samtools-0.1.19/bin/samtools

Let me know if this doesn't fix the problem.
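The permission error is consistent with SAMTOOLS_PATH naming a directory: the log shows `nice` being asked to execute `/ifs/apps/bio/samtools-0.1.19/bin/`, and trying to execute a directory fails with "Permission denied" even when the files inside it are executable. A minimal sketch of a check to run before starting the Bayesembler follows; `check_executable` is a hypothetical helper, not part of Bayesembler.

```shell
# Hypothetical helper (not part of Bayesembler): classify a path as a
# directory, an executable regular file, a non-executable file, or missing.
check_executable() {
    path=$1
    if [ -d "$path" ]; then
        echo "directory"
    elif [ -f "$path" ] && [ -x "$path" ]; then
        echo "ok"
    elif [ -e "$path" ]; then
        echo "not executable"
    else
        echo "missing"
    fi
}

# Report on whatever SAMTOOLS_PATH currently holds (empty if unset).
check_executable "${SAMTOOLS_PATH:-}"
```

Anything other than "ok" suggests the variable does not point at a runnable samtools binary.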

/Lasse

TomSmithCGAT commented 9 years ago

Hi Lasse,

Thanks for the quick reply.

I tried appending samtools to my path as you suggest and got the following error message:

bayesembler: /seqdata/krogh/jola/projects/transcriptome_assembly/code/release/bayesembler_1_2_0/src/assembler.cpp:688: void Assembler::graphConstructorCallback(std::string, std::list<GraphInfo>*, std::string, boost::mutex*, int*, int*): Assertion `system(instance_system_string.c_str()) == 0' failed.
Aborted

So I appended cem to my path too, and now everything seems to be working, which is great.

Is there any way to get Bayesembler to use the correct samtools and cem without creating these global variables? I'd like to run jobs on our cluster without having to create them beforehand. How are samtools and cem called by Bayesembler?

Tom