transrate "stuck"?

sjannielefevre commented 8 years ago

I am running a read-mapping-based transrate command on a Trinity assembly with paired-end reads. The mapping has finished, as confirmed by the presence of a 210 GB .bam file (I have done mapping before, so I know this is the right size). There have been no error messages. The output is currently "stuck" at the following line:

[ INFO] 2016-02-23 13:07:04 : Calculating read diagnostics...

It has been there for several days. snap-aligner is using a huge amount of memory (500 GB), but at some point it went from running on 20 CPUs to 2 CPUs, probably on Feb 26, which is when the .bam file was finished (the latest timestamp on that file). No files are being written, and no files have changed since Feb 23 except the .bam.

What is going on? Does the calculation really take this long, or has an error occurred that is simply not reported? I am working on a cluster with 1000 GB of RAM, so I want to make sure I am not using up most of the memory without actually getting results.

I hope someone can help.

Cheers Sjannie
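One way to back up the "nothing has been written for days" observation is to test a file's modification time directly. The following is only a minimal sketch: the function name `unchanged_for_minutes` is hypothetical, and it assumes a `find` that supports `-mmin` (GNU and BSD versions do; it is not required by POSIX).

```shell
#!/bin/sh
# Sketch: succeed when a file has not been modified for more than N minutes.
# Assumes find supports -mmin (GNU/BSD extension).

unchanged_for_minutes() {
    # $1 = file to inspect, $2 = age threshold in minutes.
    # find prints the path only when its mtime is older than the threshold;
    # -prune stops find from descending if a directory is passed.
    [ -n "$(find "$1" -prune -mmin "+$2" 2>/dev/null)" ]
}
```

For example, `unchanged_for_minutes path/to/reads.bam 60 && echo "no writes for over an hour"` reports a file that has been idle for more than an hour (the path is a placeholder). An idle output file does not by itself prove the job is hung, so it is best read alongside CPU and memory usage.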

blahah commented 8 years ago

It's quite possible that SNAP has got stuck and isn't doing anything. You could try killing transrate and then restarting - if the bam file has been created and is valid transrate will continue with the next steps.
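Before restarting, a quick sanity check of the BAM file can show whether it was cut off mid-write. BAM files are BGZF-compressed and normally end with a fixed 28-byte end-of-file block, so the sketch below compares the last 28 bytes of the file against that marker. The function name `bam_has_eof_marker` is hypothetical, and it is an assumption that the BAM in question was written with the standard marker. A missing marker suggests truncation; a present marker does not prove the rest of the file is intact.

```shell
#!/bin/sh
# Sketch: check for the 28-byte BGZF end-of-file block at the end of a BAM.
# This detects truncation only; it does not validate the file's contents.

BGZF_EOF_HEX="1f8b08040000000000ff0600424302001b0003000000000000000000"

bam_has_eof_marker() {
    # Hex-dump the final 28 bytes and compare with the expected marker.
    tail_hex=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \n')
    [ "$tail_hex" = "$BGZF_EOF_HEX" ]
}
```

Usage would look like `bam_has_eof_marker path/to/reads.bam || echo "BAM looks truncated"`, with the path as a placeholder. If samtools is installed (it is not mentioned in this thread), `samtools quickcheck` performs a similar but more thorough check.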

sjannielefevre commented 8 years ago

Thanks for the quick reply

I restarted it, and now it finished, but not successfully.

This is the command I ran:

transrate-1.0.2-linux-x86_64/transrate --assembly Trinity_fixed.fa --left all_R1.fq --right all_R2.fq --threads 20 --output transrate/

I still only got to

[ INFO] 2016-02-29 09:54:00 : Calculating read diagnostics...

[ERROR] 2016-02-29 11:21:19 : Version Info: A newer version of Salmon with important new features is availaible.
Please check https://github.com/COMBINE-lab/salmon/releases, for the latest release.
# salmon (alignment-based) v0.4.2
# [ program ] => salmon
# [ command ] => quant
# [ libType ] => { IU }
# [ alignments ] => { /node/work1/no_backup/sjannie/transrate/Trinity_fixed/all_R1.fq.all_R2.fq.Trinity_fixed.bam }
# [ targets ] => { /node/work1/no_backup/sjannie/Trinity_fixed.fa }
# [ threads ] => { 20 }
# [ sampleOut ] => { }
# [ sampleUnaligned ] => { }
# [ output ] => { . }
# [ useVBOpt ] => { }
# [ useErrorModel ] => { }
Library format { type:paired end, relative orientation:inward, strandedness:unstranded }
Logs will be written to ./logs
numQuantThreads = 14
parseThreads = 6
Checking that provided alignment files have consistent headers . . . done
processed 0 reads in current round
processed 1000000 reads in current round
processed 2000000 reads in current round
processed 3000000 reads in current round
[2016-02-29 09:56:30.897] [jointLog] [info]

The alignment group queue pool has been exhausted.  6148930 extra fragments were allocated on the heap to saturate the pool.  No new fragments will be allocated

processed 4000000 reads in current round
processed 5000000 reads in current round

. done
killing thread 13 . . . done

# observed = 947720141 / # required = 50000000
[2016-02-29 11:19:22.856] [jointLog] [info]

Completed first pass through the alignment file.
Total # of mapped reads : 947720141
# of uniquely mapped reads : 251443438
# ambiguously mapped reads : 696276703

[2016-02-29 11:19:58.253] [jointLog] [info] Computed 31649855 rich equivalence classes for further processing
[2016-02-29 11:19:58.253] [jointLog] [info] starting optimizer
[2016-02-29 11:20:04.671] [jointLog] [info] Marked 0 weighted equivalence classes as degenerate
[2016-02-29 11:20:05.780] [jointLog] [info] iteration = 0 | max rel diff. = 1294.94

Freeing memory used by read queue . . .
Joined parsing thread . . . "/node/work1/no_backup/sjannie/transrate/Trinity_fixed/all_R1.fq.all_R2.fq.Trinity_fixed.bam"
Closed all files . . .
Emptied frag queue. . .
Emptied Alignemnt Group Pool. .
Emptied Alignment Group Queue. . . done
============
Exception : [Error in function boost::math::digamma<double>(double): numeric overflow]
============
/usit/abel/u1/sjannies/nobackup/transrate-1.0.2-linux-x86_64/bin/salmon alignment-quant was invoked improperly.
For usage information, try /usit/abel/u1/sjannies/nobackup/transrate-1.0.2-linux-x86_64/bin/salmon quant --help-alignments
Exiting.

[ERROR] 2016-02-29 11:21:19 : Salmon failed

I can see from my .err file (output above) that transrate treated the reads as not strand-specific. I had missed the option to specify strand-specific reads (because it does not exist, I understand now), which mine are. That might explain the low mapping rate? I guess there is no need for me to keep trying to resolve the problems above if it does not work with strand-specific reads anyway.

:-/

sjannielefevre commented 8 years ago

Sorry for the messy post, did not know it would look like that.
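Much of the mess in a pasted log like this comes from terminal colour codes and carriage returns that progress meters write. A minimal sketch for cleaning a log before posting is shown below. The function name `strip_terminal_codes` is hypothetical, and it only handles the simple `ESC[...` sequences of the kind visible in this log, not every possible terminal control code.

```shell
#!/bin/sh
# Sketch: remove simple ANSI "ESC[...X" sequences and carriage returns
# from text read on standard input.

strip_terminal_codes() {
    esc=$(printf '\033')
    # Drop sequences such as ESC[0;32m or ESC[A, then delete carriage returns.
    sed "s/${esc}\[[0-9;]*[A-Za-z]//g" | tr -d '\r'
}
```

It reads from standard input, so something like `strip_terminal_codes < job.err > job.clean.err` would produce a readable copy (both file names are placeholders).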

blahah commented 8 years ago

@sjannielefevre the issue is unrelated to the strand specificity of the reads. What happened in the later run is that the BAM file was corrupt and had not been written successfully by SNAP.

This error in SNAP has been observed before but we haven't yet worked out what conditions cause it. Many users have found that deleting the transrate output and re-running from scratch will work.

Thanks for your patience - we will move away from SNAP soon but in the meantime we're working to resolve this bug.

It would be easy for us to support strand-specific reads explicitly - we will do this in an upcoming release.

blahah commented 8 years ago

@sjannielefevre on a closer read of the error log it looks like Salmon completed OK but then crashed when it had finished. Running the transrate command one more time without deleting the directory first might work.

We'll keep working on the bug. Thanks for your report.
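Since the suggested workaround is simply to run the same command again, it could be wrapped in a small retry helper. This is only a sketch: the function name `retry` is hypothetical, and blind retries can burn a lot of cluster time on a job of this size, so a low attempt limit and a look at the log between attempts seem sensible.

```shell
#!/bin/sh
# Sketch: run a command up to MAX times, stopping at the first success.

retry() {
    # $1 = maximum number of attempts; remaining arguments = the command.
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        "$@" && return 0
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
```

With the command from earlier in the thread, this would be called as `retry 2 transrate-1.0.2-linux-x86_64/transrate --assembly Trinity_fixed.fa --left all_R1.fq --right all_R2.fq --threads 20 --output transrate/`, leaving the output directory in place between attempts.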

sjannielefevre commented 8 years ago

Ok - I will try again, thanks!

sjannielefevre commented 8 years ago

Just wanted to let you know that transrate has now completed without errors. I simply cancelled the job and ran the command again without changing anything else.

It would be neat to have the strand-specific option, so I will look out for upcoming releases.

Thanks again.

Cheers Sjannie

blahah commented 8 years ago

Great! Glad to hear it worked and thanks for your patience.

blahah / transrate

transrate "stuck"? #184