pachterlab / kallisto

Near-optimal RNA-Seq quantification
https://pachterlab.github.io/kallisto
BSD 2-Clause "Simplified" License

ERROR: while making index file #279

Open sekhwal opened 4 years ago

sekhwal commented 4 years ago

Hi, I am using the following command to make the index file. However, I am getting an ERROR.

./kallisto index -i /DataAnalysis/kallisto_test/transcripts.idx /DataAnalysis/kallisto_test/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz

ERROR:

[build] loading fasta file /DataAnalysis/kallisto_test/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
[build] k-mer length: 31
[build] warning: replaced 153901651 non-ACGUT characters in the input sequence with pseudorandom nucleotides
[build] counting k-mers ... terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)

mschilli87 commented 4 years ago

@sekhwal

[build] counting k-mers ... terminate called after throwing an instance of 'std::bad_alloc'

This looks like you ran out of memory. Can you share some details about the machine you are running this on and about the reference? To me it looks like you are using the human genome as a reference. However, kallisto works on the transcriptome. Basically, your command treats each chromosome/contig as one enormous transcript to quantify.

I highly recommend re-reading the kallisto paper to make sure you understand what it does and that you actually want to use it. If so, you might want to consider using a pre-built index for the human transcriptome, unless you have a more specific use case.
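If it helps, here is a rough way to check whether a FASTA file looks like a genome rather than a transcriptome before handing it to `kallisto index`. This is only a sketch and not part of kallisto: the function names and the 1,000,000 nt cutoff are my own assumptions, chosen as an arbitrary threshold for "far too long to be a single transcript".

```python
import gzip
import sys


def open_fasta(path):
    """Open a plain or gzip-compressed FASTA file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def fasta_record_lengths(handle):
    """Yield (name, sequence_length) for each record in a FASTA text stream."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            fields = line[1:].split()
            name, length = (fields[0] if fields else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def looks_like_genome(records, max_transcript_len=1_000_000):
    """Heuristic: any record longer than the (assumed) cutoff suggests
    chromosomes/contigs rather than transcripts."""
    return any(length > max_transcript_len for _, length in records)


if __name__ == "__main__":
    with open_fasta(sys.argv[1]) as handle:
        records = list(fasta_record_lengths(handle))
    longest = max((length for _, length in records), default=0)
    print(f"{len(records)} records, longest = {longest} nt")
    if looks_like_genome(records):
        print("Some records are very long; this may be a genome, not a transcriptome.")
```

Run it with the FASTA path as the only argument. A genome assembly will typically show a small number of very long records, while a transcriptome (cDNA) FASTA has many much shorter ones.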