Open sbresnahan opened 2 months ago
Hi, Sean, since you are processing bulk, it should only print out `processing sample/cell 0`; is this the case? Can you please post the full output?
If I run with `--threads=1`, it is indeed only processing sample/cell 0 before the seg fault:
[index] k-mer length: 63
[index] number of targets: 252,723
[index] number of k-mers: 157,178,936
[index] number of equivalence classes loaded from file: 327,292
[tcc] Parsing transcript-compatibility counts (TCC) file as a matrix file
[tcc] Matrix dimensions: 72 x 327,292
[quant] Running EM algorithm...
[ em] reading priors from file ONT
[quant] Processing sample/cell 0
/home/stbresnahan/.lsbatch/1727389319.16590285.shell: line 39: 55903 Segmentation fault (core dumped) kallisto quant-tcc -t 1 --long -p ONT -f ${DIR_OUT}/flens.txt -i kallisto_index/gencode_v45 -e ${DIR_OUT}/count.ec.txt -o ${DIR_OUT}/quant-tcc ${DIR_OUT}/count.mtx
However, if I set `--threads` to anything other than 1 (in this case, 12), the output is:
[index] k-mer length: 63
[index] number of targets: 252,723
[index] number of k-mers: 157,178,936
[index] number of equivalence classes loaded from file: 327,292
[tcc] Parsing transcript-compatibility counts (TCC) file as a matrix file
[tcc] Matrix dimensions: 72 x 327,292
[quant] Running EM algorithm...
[ em] reading priors from file ONT
[quant] Processing sample/cell 0quant] Processing sample/cell [quant] Processing sample/cell 2[quant] Processing sample/cell [quant] Processing sample/cell quant] Processing sample/cell 5
[quant] Processing sample/cell 3[quant] Processing sample/cell [quant] Processing sample/cell 6
[quant] Processing sample/cell 4
[quant] Processing sample/cell 77
[quant] Processing sample/cell 88
[[[
quant] Processing sample/cell 11
[quant] Processing sample/cell 9quant] Processing sample/cell [quant] Processing sample/cell 11uant] Processing sample/cell [quant] Processing sample/cell [quant] Processing sample/cell 1
0
0
/home/stbresnahan/.lsbatch/1727384386.16588742.shell: line 38: 3476442 Segmentation fault (core dumped) kallisto quant-tcc -t 12 --long -p ONT -f ${DIR_OUT}/flens.txt -i kallisto_index/gencode_v45 -e ${DIR_OUT}/count.ec.txt -o ${DIR_OUT}/quant-tcc ${DIR_OUT}/count.mtx
This occurs regardless of whether I start the process with a single .fastq or multiple .fastq files.
@sbresnahan can you post the exact commands you’re running?
And can you try the official binaries on the Releases page to make sure it’s not a compilation error?
Building transcriptome index:
gffread -F -w GCA_000001405.15_GRCh38_no_alt_analysis_set_gencode_v45.fasta \
-g GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
gencode.v45.annotation.gtf
kallisto index -k 63 -t 10 -i gencode_v45 GCA_000001405.15_GRCh38_no_alt_analysis_set_gencode_v45.fasta
Running lr-kallisto:
kallisto bus -t 8 --long --threshold 0.8 -x bulk -i gencode_v45 \
-o kallisto_out fullLength.and.rescued.fastq
bustools sort -t 8 kallisto_out/output.bus \
-o kallisto_out/sorted.bus
bustools count kallisto_out/sorted.bus \
-t kallisto_out/transcripts.txt \
-e kallisto_out/matrix.ec \
-g kallisto_out/gencode_v45_tx2g.tsv \
-o kallisto_out/count --cm -m
kallisto quant-tcc -t 8 \
--long -p ONT -f kallisto_out/flens.txt \
-i kallisto_index/gencode_v45 \
-e kallisto_out/count.ec.txt \
-o kallisto_out/quant-tcc \
--matrix-to-files \
kallisto_out/count.mtx
I will try the linked binary and get back to you.
I get a similar error (`line 76: 5394 Segmentation fault`). I have tried both compiling it myself and using the binaries suggested by @Yenaled.
[index] k-mer length: 63
[index] number of targets: 385,659
[index] number of k-mers: 186,649,435
[index] number of equivalence classes loaded from file: 193,836
[tcc] Parsing transcript-compatibility counts (TCC) file as a matrix file
[tcc] Matrix dimensions: 1 x 193,836
[quant] Running EM algorithm...
[ em] reading priors from file ONT
[quant] Processing sample/cell 0
/var/spool/slurm/job23490963/slurm_script: line 76: 5394 Segmentation fault (core dumped) $SCRATCH/bioinformatic_tools/kallisto/kallisto/kallisto_linux-v0.51.1_kmer64 quant-tcc --long -p ONT -t $SLURM_CPUS_PER_TASK -i "$INDEX_PATH" -o "$OUTPUT_DIR/$SAMPLE_NAME" --matrix-to-files -f "$OUTPUT_DIR/$SAMPLE_NAME/flens.txt" -e "$OUTPUT_DIR/$SAMPLE_NAME/count.ec.txt" "$OUTPUT_DIR/$SAMPLE_NAME/count.mtx"
Is this mainly an issue with `v0.51.1`?
Very strange — quant-tcc seems to have issues with the input files supplied. If you are able to upload the files somewhere (the files supplied to quant-tcc) and email them to me, I can help debug.
Sorted it. It was the `-p`. I was reading https://pachterlab.github.io/kallisto/manual, where `-p` is given as the platform option, while it is actually `-P` for platform.
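For anyone hitting the same crash, here is a minimal sketch of the corrected `quant-tcc` call, using the paths posted earlier in this thread with `-P ONT` in place of `-p ONT`. The `check_platform_flag` helper is hypothetical (it is not part of kallisto); it only guards against the mix-up seen here, where the log line `reading priors from file ONT` shows that `-p` was being read as a priors file. The script prints the command as a dry run rather than executing it.

```shell
# Hypothetical pre-flight check, not part of kallisto: fail if the
# platform name ONT is passed to -p, which quant-tcc treats as a priors file.
check_platform_flag() {
    prev=""
    for arg in "$@"; do
        if [ "$prev" = "-p" ] && [ "$arg" = "ONT" ]; then
            echo "warning: '-p ONT' is read as a priors file; use '-P ONT' for the platform" >&2
            return 1
        fi
        prev="$arg"
    done
    return 0
}

# Corrected arguments from this thread: -P (platform) instead of -p.
set -- quant-tcc -t 8 --long -P ONT \
    -f kallisto_out/flens.txt \
    -i kallisto_index/gencode_v45 \
    -e kallisto_out/count.ec.txt \
    -o kallisto_out/quant-tcc \
    --matrix-to-files \
    kallisto_out/count.mtx

if check_platform_flag "$@"; then
    # Dry run: print the command instead of executing it.
    echo "kallisto $*"
fi
```

To actually run the quantification, replace the `echo` line with `kallisto "$@"`.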
Oh good catch! And yay!
Version:
kallisto 0.51.1
I'm following a workflow outlined in issue 456 for using lr-kallisto with bulk ONT.
The `kallisto bus`, `bustools sort`, and `bustools count` steps complete without errors. However, the `kallisto quant-tcc` step is being dumped by LSF with `554689 Segmentation fault` shortly after `processing sample/cell N`.
I'm using a kallisto index with kmer-length=63 built from transcripts pulled from the GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta and gencode v45 gtf using gffread. An index built from these transcripts with kmer-length=31 has no issues with `kallisto quant` using short reads.