Open aungthurhahein opened 8 years ago
What is the error reported? We fixed an issue in v0.42.5 which could be the cause of this. Can you run this on the latest version, 0.42.5?
Kallisto version is 0.42.4 and this is the error message:
finding pseudoalignments for the reads ...Segmentation fault (core dumped)
I will download v0.42.5 and try again. I will get back to you with the outcome.
I tried with kallisto ver. 0.42.5 with the following command and the error still persists.
Command:
kallisto quant -l 605 -s 136 -i Trinity.fasta.kallisto_idx -o aln_out --pseudobam --single lib.fasta
Program halted with the following error:
@SQ SN:c10356_g10356_i1 LN:1195
@PG ID:kallisto PN:kallisto VN:0.42.5
Segmentation fault (core dumped)
In your case you seem to have only a single sequence in your index, can you confirm that this is what you expected?
Can you run kallisto quant without --pseudobam? I'm just trying to isolate whether the problem is in the pseudobam output or in other parts.
Can you also report what is written to stderr when you run your command as
kallisto quant -l 605 -s 136 -i Trinity.fasta.kallisto_idx -o aln_out --pseudobam --single lib.fasta > lib.sam
The index file has more than one sequence. I just reported the end of the stdout.
I can run kallisto quant without generating pseudobam successfully.
This is the output of both stdout and stderr:
[quant] fragment length distribution is truncated gaussian with mean = 605, sd = 136
[index] k-mer length: 31
[index] number of targets: 10,357
[index] number of k-mers: 5,947,007
[index] number of equivalence classes: 24,998
[quant] running in single-end mode
[quant] will process file 1: /colossus/home/anuphap/EST/EST_lib_IDs/pm/slect_bytissues_pm_chula/PM82_wTempLibID_04092014.txt.PmTwI.seqID.fasta
[quant] finding pseudoalignments for the reads ...@HD VN:1.0
@SQ SN:c0_g0_i1 LN:216
@SQ SN:c1_g1_i1 LN:374
@SQ SN:c2_g2_i1 LN:197
...
@SQ SN:c10354_g10354_i1 LN:594
@SQ SN:c10355_g10355_i1 LN:682
@SQ SN:c10356_g10356_i1 LN:1195
@PG ID:kallisto PN:kallisto VN:0.42.5
Segmentation fault (core dumped)
Also, a "core.xxxx" file is written inside the working directory.
The sequences you are aligning have the ending .fasta. Are they truly FASTA entries and not FASTQ? Pseudoalignment outputs SAM records, which are required to have a quality string, so kallisto (probably) fails because a FASTA read has no quality string.
I'll have to check for this a bit more carefully when doing pseudoalignment.
kallisto never uses the quality values, so you can supply a dummy value, essentially converting the FASTA file to a FASTQ file.
You can try this by just converting the first few sequences of the input file to FASTQ.
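For anyone who wants to try the dummy-quality conversion suggested above, here is a minimal Python sketch. The function names, the `max_records` parameter, and the choice of "I" as the constant quality character are my own assumptions for illustration, not anything kallisto provides or requires.

```python
def _record(name, chunks, dummy_quality):
    # Build the four FASTQ lines for one record; the quality string is
    # just the dummy character repeated to match the sequence length.
    seq = "".join(chunks)
    return [f"@{name}", seq, "+", dummy_quality * len(seq)]


def fasta_to_fastq(fasta_lines, dummy_quality="I", max_records=None):
    """Yield FASTQ lines (without newlines) for the records in fasta_lines.

    fasta_lines: any iterable of text lines, e.g. an open file handle.
    max_records: hypothetical convenience option to convert only the
                 first few sequences, as suggested in the thread.
    """
    if max_records is not None and max_records <= 0:
        return
    name, chunks, written = None, [], 0
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield from _record(name, chunks, dummy_quality)
                written += 1
                if max_records is not None and written >= max_records:
                    return
            name, chunks = line[1:], []
        elif name is not None:
            # Multi-line FASTA sequences are joined into one line.
            chunks.append(line)
    if name is not None:
        yield from _record(name, chunks, dummy_quality)


# Small in-memory example; with a real file you would pass an open handle
# and write the yielded lines to a .fastq file.
example = [">read1", "ACGT", "AC", ">read2", "GG"]
converted = list(fasta_to_fastq(example, max_records=1))
```

To convert a real file, open it in text mode, pass the handle to `fasta_to_fastq`, and write each yielded line followed by a newline to the output file, then point kallisto at that output instead of the .fasta file.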
Yes, I confirmed that the .fasta file has no quality values.
I didn't mention it before because I didn't expect it to be the cause of the issue.
I will test with the .fastq file format and report the outcome soon.
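Since the extension alone does not say whether a file is really FASTA or FASTQ, a quick check of the first record marker can help before requesting the pseudobam. This is only a rough sketch: the function name is made up for illustration, and it looks at nothing beyond the first non-empty line.

```python
def sniff_sequence_format(lines):
    """Guess the format from the first non-empty line.

    Returns "fasta" if it starts with '>', "fastq" if it starts with '@',
    and "unknown" otherwise (including empty input).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            return "fasta"
        if line.startswith("@"):
            return "fastq"
        return "unknown"
    return "unknown"


# In-memory examples; with a real file, pass an open text-mode handle.
fasta_guess = sniff_sequence_format([">c0_g0_i1", "ACGT"])
fastq_guess = sniff_sequence_format(["@read1", "ACGT", "+", "IIII"])
```

For gzipped input, opening the file with `gzip.open(path, "rt")` from the standard library gives a text handle that can be passed in the same way.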
I also ran into a segfault when generating the pseudobam, but due to a slightly different problem. I was running kallisto using process substitution to deal with an interleaved paired-end file, e.g.
kallisto quant -t 8 -i kallisto.idx -o my_sample --pseudobam <(seqtk seq -1 interleaved.fq) <(seqtk seq -2 interleaved.fq)
Kallisto runs perfectly fine without the --pseudobam flag, but it crashes if I request the pseudobam.
I figured the pseudobam needs re-reading the fastq files, so I tried doing the split beforehand and then the seg fault does not happen (runs fine).
Would be nice to add this to the docs at least :). Another nice-to-have would be support for interleaved paired-end files :)
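The workaround described here (splitting the interleaved file into two real files first, instead of using process substitution) can also be done without seqtk. Below is a minimal Python sketch; the function name is hypothetical, and it assumes strict 4-line FASTQ records with read 1 immediately followed by its mate and no trailing blank lines.

```python
import io
from itertools import islice


def deinterleave(interleaved, out1, out2):
    """Split interleaved 4-line FASTQ records into two outputs.

    interleaved: iterable of lines (with newlines), e.g. an open file.
    out1, out2:  writable text handles for read 1 and read 2.
    Returns the number of read pairs written.
    """
    interleaved = iter(interleaved)
    pairs = 0
    while True:
        rec1 = list(islice(interleaved, 4))
        if not rec1:
            break
        rec2 = list(islice(interleaved, 4))
        if len(rec1) != 4 or len(rec2) != 4:
            raise ValueError("truncated record or unpaired read in interleaved FASTQ")
        out1.writelines(rec1)
        out2.writelines(rec2)
        pairs += 1
    return pairs


# In-memory demonstration with one read pair.
demo_in = io.StringIO("@a/1\nAC\n+\nII\n@a/2\nGT\n+\nII\n")
demo_r1, demo_r2 = io.StringIO(), io.StringIO()
demo_pairs = deinterleave(demo_in, demo_r1, demo_r2)
```

With real data you would open the interleaved file and two output files, call `deinterleave`, and then pass the two output paths to kallisto quant as ordinary files, which is the setup reported above as running fine with --pseudobam.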
Hello, I am facing this problem: "[ bam] writing pseudoalignments to BAM format .. Segmentation fault", and I have no idea how to fix it. I have Smart-seq2 single reads, dual indexed. This is what the FASTQ reads look like:
@NB551291:160:H55CJBGXF:1:11101:12947:14932 1:N:0:TAAGGCGA+GCGATCTA
GGCGTGTCCCGCGCGTGTGGGGGGAACCTCCGCGTCGGTGTTCCCCCGCCGGGTCCGCCCCCCGGGCCGCGGTTTT
+
AAAA/EAAAEEEA/EEAEEEAEE/E/EEEEEEAEA/EEEEEEEEEEEEEAEEEE/E/EAEEAEEE6AE/</EA///
I ran this command:
[user@vm-129-49 mouse1.fastq_gz]$ kallisto quant -i /ad/vlachou/scRNAseq.2/kallisto_analysis/gencode.vM24.transcripts.idx --output-dir /ad/vlachou/scRNAseq.2/kallisto_analysis/kallisto_quant/gencode_indexed/mouse1 --pseudobam --genomebam --gtf /vlachou/scRNAseq.2/kallisto_analysis/gencode.vM24.annotation.gtf.gz --single -l 530 -s 150 -t 16 *fastq.gz
This is the output:
[quant] fragment length distribution is truncated gaussian with mean = 530, sd = 150
[index] k-mer length: 31
[index] number of targets: 142,552
[index] number of k-mers: 120,672,054
[quant] finding pseudoalignments for the reads ... done
[quant] processed 482,819,438 reads, 208,880,499 reads pseudoaligned
[ em] quantifying the abundances ... done
[ em] the Expectation-Maximization algorithm ran for 1,273 rounds
[ bam] writing pseudoalignments to BAM format .. Segmentation fault
I tried the same with Ensembl as the reference but I get the same problem.
If anyone could help me out, it would be great! Thanks
Any idea if this issue has been resolved yet? I am also getting something very similar:
[ bam] writing pseudoalignments to BAM format .. /spin1/swarm/kopardevn/M0tDGHNewa/cmd.10: line 1: 12564 Segmentation fault ( kallisto quant -i mm10_M21 -o TreatmentB_S72 --bias --plaintext --fusion --rf-stranded -t 56 --pseudobam --genomebam --gtf genes.gtf -c mm10.genome trim/TreatmentB_S72.R1.trim.fastq.gz trim/TreatmentB_S72.R2.trim.fastq.gz )
Personally, I went with STAR since I was not in a hurry. Someone in another post suggested going back to an older version that works, but frankly, I didn't try it. Also, if I remember correctly, when I removed "--pseudobam --genomebam --gtf genes.gtf" and ran, for example, "kallisto quant -i index -o output --single -l 200 -s 20 file1.fastq.gz file2.fastq.gz file3.fastq.gz", it worked.
Very good luck!
This keeps happening to me too, in kallisto 0.46.2:
[quant] finding pseudoalignments for the reads ...
[quant] done
[quant] processed 250,960,675 reads, 156,761,018 reads pseudoaligned
[ em] quantifying the abundances ... done
[ em] the Expectation-Maximization algorithm ran for 1,513 rounds
[ bam] writing pseudoalignments to BAM format .. [1] 2673 segmentation fault
It works when I remove the --genomebam flag, but I'd really like to get the BAM file out of this.
When I try to get the pseudobam file, it gives me the core-dump error. The machine is an Ubuntu server with x86-64 architecture.