STOmics / SAW

GNU General Public License v3.0

Number of MMP reads exceeds the threshold #125

Closed: reyvnth closed this issue 1 month ago

reyvnth commented 3 months ago

Hi,

While running SAW_v7.1 on my data, I encountered an error saying that the number of MMP reads exceeds the threshold. An example script and the error output are below. Let me know if you need anything else to help troubleshoot.

Best, Revanth

Scripts I ran:

ulimit -n 10240
ulimit -v 33170449147
NUMBA_CACHE_DIR= x/ x/

dataDir=~/data
outDir=~/outs

export APPTAINER_BIND=$dataDir,$outDir

Choose from the following scenarios

bash stereoPipeline_v7.1.sh \
    -sif ~/SAW_v7.1.sif \
    -splitCount 1 \
    -maskFile ~/x.h5 \
    -fq1 $dataDir/file1_1.fq.gz,file2_1.fq.gz,........,file8_1.fq.gz \
    -fq2 $dataDir/file1_2.fq.gz,file2_2.fq.gz,......,file8_2.fq.gz \
    -speciesName x \
    -tissueType x \
    -refIndex ..../STAR_SJ100 \
    -annotationFile .../genes.gtf \
    -rRNAremove : N \
    -threads 16 \
    -outDir $outDir \
    -imageRecordFile ~/x.ipr \
    -imageCompressedFile ~/x.tar.gz \
    -doCellBin Y
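A side note on the -fq1/-fq2 lists: as written, only the first file in each list carries the $dataDir prefix. That may simply be shorthand in this post. If each entry does need its own full path (an assumption, not something confirmed in this thread), the lists could be built along these lines. The helper name join_fastqs and the _1.fq.gz/_2.fq.gz suffixes are illustrative only.

```shell
# Hypothetical helper (not part of SAW): join every file in a directory that
# ends with the given suffix into one comma-separated list of full paths.
join_fastqs() {
  dir=$1
  suffix=$2
  list=""
  for f in "$dir"/*"$suffix"; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$f" ] || continue
    list="${list:+$list,}$f"
  done
  printf '%s\n' "$list"
}

# Assumed layout: read 1 files end in _1.fq.gz, read 2 files in _2.fq.gz.
dataDir=${dataDir:-$HOME/data}
fq1=$(join_fastqs "$dataDir" _1.fq.gz)
fq2=$(join_fastqs "$dataDir" _2.fq.gz)
```

Because the glob expands in sorted order, the two lists pair up entry by entry as long as the file names differ only in the _1/_2 suffix. The results could then be passed as -fq1 "$fq1" and -fq2 "$fq2".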

Error:

--- Error: The number of MMP reads exceeds the threshold.
bcSTAR: ReadsParse.cpp:2328: BGI::FMindex::findMMPs(uint64_t, char ()[128], meta, mmp, status, task, stage, BGI::SAindex, bool)::<lambda(uint64_t, status, task, meta, mmp)>: Assertion `false' failed.
/usr/local/bin/mapping: line 1: 260333 Aborted (core dumped) /opt/saw_st_software/pipeline/mapping/bcSTAR $
Command exited with non-zero status 134
Command being timed: "apptainer exec /pl/active/CIQILab_RC/STomics/SAW_v7.1.sif mapping --outSAMattributes spatial --outSAMtype BAM SortedByCoordinate --genomeDir /pl/active/CIQILab_RC/STomics/genome_ref/STAR_SJ100 --runThreadN 16 --outFileNamePrefix /pl/active/CIQILab_RC/STomics/outs/D6/00.mapping/E150020157_L01_57_1. --sysShell /bin/bash --stParaFile /pl/active/CIQILab_RC/STomics/outs/D6/00.mapping/E150020157_L01_57_1.bcPara --readNameSeparator " " --limitBAMsortRAM 1099511627776 --limitOutSJcollapsed 10000000 --limitIObufferSize=280000000 --outBAMsortingBinsN 50 --outSAMmultNmax 1"
User time (seconds): 116.87
System time (seconds): 6.82
Percent of CPU this job got: 144%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:25.71
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 34832328
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 94
Minor (reclaiming a frame) page faults: 382502
Voluntary context switches: 1013966
Involuntary context switches: 252
Swaps: 0
File system inputs: 6196288
File system outputs: 72
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 134
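For anyone reading the log: exit status 134 is consistent with the failed assertion. By shell convention, a status above 128 means the process was terminated by a signal numbered (status - 128). Here that is signal 6, SIGABRT, which matches the "Aborted (core dumped)" line. A minimal sketch for decoding such a status follows. The helper name describe_exit_status is hypothetical and not part of SAW.

```shell
# Hypothetical helper: interpret a shell exit status.
# Statuses above 128 conventionally mean "terminated by signal (status - 128)".
describe_exit_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    sig=$((status - 128))
    # kill -l <number> prints the signal's name without the SIG prefix.
    echo "terminated by signal $sig (SIG$(kill -l "$sig"))"
  else
    echo "exited normally with status $status"
  fi
}

describe_exit_status 134
```

So the status only confirms that bcSTAR aborted itself on the assertion. It says nothing about why the MMP threshold was exceeded.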

Clouate commented 3 months ago

Hi, this error usually occurs when too many seeds are found while mapping reads, which is related to how the STAR mapping algorithm works. You could try using a different genome reference.

reyvnth commented 1 month ago

Thanks for the clarification!