pachterlab / kallisto

Near-optimal RNA-Seq quantification
https://pachterlab.github.io/kallisto
BSD 2-Clause "Simplified" License

kallisto index gets stuck on CentOS 7 #107

Closed sariels1 closed 5 years ago

sariels1 commented 8 years ago

I tried to index a large metagenomic reference set using the binary distribution kallisto_linux-v0.42.5 on a machine running CentOS 7.1.1503. The process never got past the "counting k-mers" step. Worse, the process now cannot be killed without restarting the whole server (kill -9 does nothing).

The same command completed successfully on a less powerful server running CentOS 5.11.

I am going to try compiling the code from source to see whether that resolves the problem.

lorian commented 8 years ago

I've never encountered a kallisto process that won't die, but the indexing itself is memory-limited: if the CentOS 5.11 server had more RAM than the server you're running on now, that would explain why it succeeded there and fails here.

(Replying by email to Ariel Schwartz's message of Tue, Apr 26, 2016, 8:57 AM.)

sariels1 commented 8 years ago

RAM is not an issue. We have 1 TB of RAM on both servers. The process is stuck but still technically running, using only 154 GB of RAM. The top output for it:

44982 aschwar+ 20 0 0.154t 0.154t 1116 R 0.0 15.7 1182:22 kallisto

Trying to kill it does absolutely nothing.
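On Linux, a process that does not respond to SIGKILL is usually either in uninterruptible sleep (state D, blocked inside the kernel) or a zombie (state Z) that its parent has not yet reaped. The top line above reports state R, so neither may apply here, but the state is cheap to check. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names parse_proc_stat and describe_process are illustrative and not part of kallisto.

```python
from pathlib import Path
from typing import Tuple

# State codes documented in proc(5); only the common ones are listed here.
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep (blocked in the kernel; SIGKILL is deferred)",
    "Z": "zombie (already dead; waiting for the parent to reap it)",
    "T": "stopped",
}


def parse_proc_stat(stat_line: str) -> Tuple[int, str, str]:
    """Return (pid, command name, state code) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split around the last ')'.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_text), comm, state


def describe_process(pid: int) -> str:
    """Read /proc/<pid>/stat (Linux only) and describe the process state."""
    _, comm, state = parse_proc_stat(Path(f"/proc/{pid}/stat").read_text())
    return f"{pid} ({comm}): {state} = {STATE_NAMES.get(state, 'other')}"
```

Calling describe_process(44982) with the PID from the top output would report the current state. If it turns out to be D, the contents of /proc/44982/wchan may name the kernel function the process is waiting in.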

sariels1 commented 8 years ago

Compiling from source did not solve the problem either (now I have two stuck processes). Correction: I was eventually able to kill the newly compiled version.

sariels1 commented 8 years ago

I have also noticed that kallisto quant takes hours (17+) to get from

[index] number of k-mers: 8,539,291,014

to

[index] number of equivalence classes: 66,363,546

whereas on the older machine the whole process completed in less than 2 hours.

kaji331 commented 8 years ago

I also find that kallisto 0.42.5 takes far too many hours to quantify a single pair of files (2~3 GB per file). It used only one thread, even though I set the thread count to 4. My server runs RHEL 6 and has 12 cores and 32 GB of memory. After 24 hours, I checked the nohup output and saw that kallisto was still in the pseudoalignment step. I quit it and tried to compile from source. Unfortunately, that needs C++11 support. Sad~
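One way to confirm how many threads a running process actually has is to read the Threads: field of /proc/<pid>/status. This is a minimal sketch, assuming Linux; the function names thread_count and threads_of are illustrative and not part of kallisto.

```python
from pathlib import Path


def thread_count(status_text: str) -> int:
    """Extract the 'Threads:' field from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            # The value follows the colon, padded with a tab.
            return int(line.split(":", 1)[1])
    raise ValueError("no 'Threads:' field found")


def threads_of(pid: int) -> int:
    """Number of threads the kernel reports for a live process (Linux only)."""
    return thread_count(Path(f"/proc/{pid}/status").read_text())
```

A result of 1 would mean the extra worker threads were never created. A larger number with only one busy core would instead suggest that the threads exist but are waiting on something.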

pmelsted commented 8 years ago

Hi @kaji331, which version of glibc are you running? You can find out with ldd --version, or see http://unix.stackexchange.com/questions/120380/what-c-library-version-does-my-system-use

I suspect this has to do with memory allocation and multicore.
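The glibc check can also be scripted, which helps when comparing several servers. This is a minimal sketch, assuming a glibc-based system where ldd --version prints the version as the last token of its first line; the function names are illustrative.

```python
import subprocess


def parse_ldd_version(output: str) -> str:
    """Return the version string from the first line of `ldd --version` output."""
    first_line = output.splitlines()[0]
    # On glibc systems the version is the last whitespace-separated token,
    # e.g. "ldd (GNU libc) 2.19" -> "2.19".
    return first_line.split()[-1]


def glibc_version() -> str:
    """Run `ldd --version` and return the reported version (glibc systems only)."""
    result = subprocess.run(
        ["ldd", "--version"], capture_output=True, text=True, check=True
    )
    return parse_ldd_version(result.stdout)
```

Running glibc_version() on each machine gives a value that can be compared directly across the affected and unaffected servers.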

pmelsted commented 8 years ago

I've tried to reproduce this with a CentOS 7 Docker image, but I don't see any difference in running time compared to running it on an Ubuntu machine.

kaji331 commented 8 years ago

@pmelsted I ran ldd --version; it reports 2.19.