Closed cypranowska closed 4 years ago
Can you show the contents of the batch file? How soon does this happen in the process, right after starting the program or after some time? Does this still happen if you use only a part of the batch file, say the first two lines?
I tried again with just the first two lines, and I get the same error. The error usually happens ~5 seconds after starting the program. I've attached my batch file here. I'm doing this on a computing cluster, and not on a Linux box, if that info helps at all.
Dear @pmelsted,
I have 6 samples, each with a different number of cells. Can I add all of the R1 and R2 files in a batch for the kallisto bus command, or will each be counted as a separate input?
For the next command, bustools correct, I will prepare a whitelist with all possible combinations of cell barcodes.
Thanks, HM
@cypranowska Can you check whether any of the reads are of zero length? This can happen with trimming, and it trips kallisto up.
You can use the following awk command to count the number of blank lines in your fastq files (x+0 makes it print 0 rather than an empty line when there are none):
zcat file.fastq.gz | awk '/^$/ {x+=1} END {print x+0}'
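A blank-line count also picks up the empty quality line of a zero-length read, so each such read is counted twice. As a rough alternative, here is a minimal Python sketch that counts only empty sequence lines. It assumes the standard 4-line FASTQ record layout (header, sequence, `+`, quality) with no wrapped sequences; the function name `count_empty_reads` is made up for this example.

```python
import gzip


def count_empty_reads(fastq_gz_path):
    """Count FASTQ records whose sequence line is empty.

    Assumption: every record is exactly 4 lines
    (header, sequence, '+', quality), with no line wrapping.
    """
    empty = 0
    with gzip.open(fastq_gz_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # With 0-based numbering, the sequence is line 1 of each 4-line record.
            if line_number % 4 == 1 and line.strip() == "":
                empty += 1
    return empty


# Example call (the file name is a placeholder):
# print(count_empty_reads("file.fastq.gz"))
```

If this returns 0 for every file in the batch, zero-length reads are probably not the cause.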
@hmassalha pseudo is not compatible with bus. Please use the mailing list for information on how to process the data.
@pmelsted I looked at my .fastq.gz files and there aren't zero length reads in any of them.
I have just run into what seems to be the same problem.
% ~/kallisto-0.46.0/build/src/kallisto pseudo -i ~/kallisto-0.46.0/mm10/transcripts.idx -o output -b batch1.txt
[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 41,604
[index] number of k-mers: 66,826,372
[index] number of equivalence classes: 99,573
[quant] running in paired-end mode
[quant] will process pair 1: ./R1/E10_2_1_R1.gz ./R2/E10_2_1_R2.gz
[quant] will process pair 1: ./R1/E10_2_10_R1.gz ./R2/E10_2_10_R2.gz
[quant] will process pair 1: ./R1/E10_2_11_R1.gz ./R2/E10_2_11_R2.gz
[quant] finding pseudoalignments for all files ...Segmentation fault (core dumped)
% ~/kallisto-0.46.0/build/src/kallisto pseudo -i ~/kallisto-0.46.0/mm10/transcripts.idx -o output ./R1/E10_2_1_R1.gz ./R2/E10_2_1_R2.gz
[quant] fragment length distribution will be estimated from the data
[index] k-mer length: 31
[index] number of targets: 41,604
[index] number of k-mers: 66,826,372
[index] number of equivalence classes: 99,573
[quant] running in paired-end mode
[quant] will process pair 1: ./R1/E10_2_1_R1.gz ./R2/E10_2_1_R2.gz
[quant] finding pseudoalignments for the reads ... done
[quant] processed 1,521,470 reads, 1,221,252 reads pseudoaligned
% more batch1.txt
E10_2_1 ./R1/E10_2_1_R1.gz ./R2/E10_2_1_R2.gz
E10_2_10 ./R1/E10_2_10_R1.gz ./R2/E10_2_10_R2.gz
E10_2_11 ./R1/E10_2_11_R1.gz ./R2/E10_2_11_R2.gz
Segfaults occur only when batch mode is used. I ran the above on CentOS Linux release 7.2.1511 (Core). I have tried the pre-compiled 0.45.0 and 0.46.0 binaries and a source-compiled 0.46.0, but could not avoid the segfault. With single-end fastq files, 0.46.0 is the only version that does not segfault.
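Before blaming batch mode itself, it can help to rule out a malformed batch file. The sketch below checks a batch file laid out like batch1.txt above (an id followed by one or two file paths per line). Some of it is assumption rather than documented kallisto behavior: that fields are whitespace-separated, that blank lines and lines starting with `#` carry no sample, and that relative paths are resolved against the current working directory. The name `check_batch_file` is made up for this example.

```python
import os


def check_batch_file(batch_path):
    """Return a list of human-readable problems found in a batch file."""
    problems = []
    with open(batch_path) as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            # Assumption: blank lines and '#' comment lines carry no sample.
            if not fields or fields[0].startswith("#"):
                continue
            # Expected layout: id file1 [file2]
            if len(fields) not in (2, 3):
                problems.append(
                    f"line {line_number}: expected 'id file1 [file2]', "
                    f"got {len(fields)} field(s)"
                )
                continue
            for fastq in fields[1:]:
                if not os.path.isfile(fastq):
                    problems.append(f"line {line_number}: missing file {fastq}")
                elif os.path.getsize(fastq) == 0:
                    problems.append(f"line {line_number}: empty file {fastq}")
    return problems


# Example call (run from the directory where kallisto is invoked):
# for problem in check_batch_file("batch1.txt"):
#     print(problem)
```

An empty result only means the file looks well-formed. It does not rule out the batch-mode bug discussed in this thread.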
This issue has been fixed in the development branch.
I'm getting a segmentation fault when I run pseudo with the -b switch, and am not sure why. For example, when I run
kallisto pseudo -i ../ref/drosophila_melanogaster/transcriptome.idx -o ../test -b ../ref/test_batch.txt
I get the following output:
But if I provide the first pair of .fastq files at the command line, instead of in the batch file, I get the expected output. I'd like to avoid looping through all of my samples in my batch script if I can. I'm using version 0.46.0.
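If the batch switch cannot be used with a given build, the per-sample loop is the obvious fallback, since running pseudo on a single pair works in this thread. Here is a minimal sketch that turns a batch file into one `kallisto pseudo -i ... -o ... R1 R2` command per sample, using only the flags that appear in this thread. The per-sample output directory layout, the function name `per_sample_commands`, and the handling of `#` lines are assumptions. It only covers paired-end entries.

```python
import shlex


def per_sample_commands(batch_path, index_path, output_root, kallisto="kallisto"):
    """Build one paired-end `kallisto pseudo` command per batch-file line."""
    commands = []
    with open(batch_path) as handle:
        for line in handle:
            fields = line.split()
            # Assumption: blank lines and '#' comment lines carry no sample.
            if not fields or fields[0].startswith("#"):
                continue
            if len(fields) != 3:
                raise ValueError(f"expected 'id R1 R2', got: {line.strip()!r}")
            sample_id, read1, read2 = fields
            # Assumption: each sample gets its own output directory.
            commands.append(
                [kallisto, "pseudo", "-i", index_path,
                 "-o", f"{output_root}/{sample_id}", read1, read2]
            )
    return commands


# Example: print the commands for inspection before running anything.
# for cmd in per_sample_commands("batch1.txt", "transcripts.idx", "output"):
#     print(shlex.join(cmd))
#     # To execute: subprocess.run(cmd, check=True)
```

Printing the commands first makes it easy to check the paths. Note that this produces one output directory per sample rather than the single combined output that batch mode writes.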