pachterlab / kallisto

Near-optimal RNA-Seq quantification
https://pachterlab.github.io/kallisto
BSD 2-Clause "Simplified" License

Kallisto takes a lot of time for some small size libraries #293

Open smoretti opened 3 years ago

smoretti commented 3 years ago

Hi

We use Kallisto v0.46.0 on Linux and it works very well. But for some small libraries it sometimes takes much longer than expected.

Those libraries are small, mostly single-end, and have short reads (about 30-36 bases). So we use an index built with a short k-mer length for them: 15.
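To make the setup above concrete, here is a minimal Python sketch of why a short k is needed for such reads and of how the corresponding kallisto commands might be assembled. The file names and the fragment length mean and standard deviation (200 and 20) are hypothetical placeholders, not values from this report. The helper functions are illustrative only and are not part of kallisto. The sketch reproduces the configuration; it does not explain the slowdown.

```python
import shlex


def kmers_per_read(read_length: int, k: int) -> int:
    """Number of k-mers contained in a read (0 if the read is shorter than k)."""
    return max(read_length - k + 1, 0)


def index_command(fasta: str, index: str, k: int) -> list[str]:
    """Build a `kallisto index` call; kallisto documents k as odd, at most 31."""
    if k % 2 == 0 or not 1 <= k <= 31:
        raise ValueError("k must be odd and no larger than 31")
    return ["kallisto", "index", "-i", index, "-k", str(k), fasta]


def quant_single_command(index: str, out_dir: str, fastq: str,
                         frag_len: float, frag_sd: float) -> list[str]:
    """Build a single-end `kallisto quant` call (-l and -s are required with --single)."""
    return ["kallisto", "quant", "-i", index, "-o", out_dir,
            "--single", "-l", str(frag_len), "-s", str(frag_sd), fastq]


if __name__ == "__main__":
    # A 30-base read contains no 31-mer at all, hence the shorter k.
    for length in (30, 36):
        for k in (15, 31):
            print(f"read length {length}, k={k}: {kmers_per_read(length, k)} k-mers")

    # Hypothetical file names; fragment length values are placeholders.
    print(shlex.join(index_command("transcripts.fa", "transcripts_k15.idx", 15)))
    print(shlex.join(quant_single_command(
        "transcripts_k15.idx", "out_SRX147579", "SRX147579.fastq.gz", 200, 20)))
```

Printing the commands rather than running them keeps the sketch usable on a machine without kallisto installed; the printed lines can be pasted into a shell and wrapped in `time` to compare run times across libraries.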

Here is the list of those libraries

SRX1065256  SRP063008   9823    Sus scrofa
ERX167126   ERP001954   9823    Sus scrofa
ERX167070   ERP001954   9823    Sus scrofa
ERX167059   ERP001954   9823    Sus scrofa
SRX084382   SRP007512   10181   Heterocephalus glaber
SRX084381   SRP007512   10090   Mus musculus
SRX147579   GSE36026    10090   Mus musculus
SRX147580   GSE36026    10090   Mus musculus
SRX147581   GSE36026    10090   Mus musculus
SRX147582   GSE36026    10090   Mus musculus
SRX147583   GSE36026    10090   Mus musculus
SRX147584   GSE36026    10090   Mus musculus
SRX147590   GSE36026    10090   Mus musculus
SRX147591   GSE36026    10090   Mus musculus
SRX147592   GSE36026    10090   Mus musculus
SRX147593   GSE36026    10090   Mus musculus

Do you know what could cause this much longer running time?

We usually don't see this with other short-read libraries.

amdreamer commented 1 year ago

I have the same question. My paired-end RNA-seq reads are 50×2, and with a 31-mer index the quant step takes hours; for 150×2 reads, however, it takes only minutes. Have you solved this?