Open NaotoKubota opened 8 months ago
Hi Naoto,
matrix.mtx counts the number of UMIs uniquely mapping to this junction. So another reason for discrepancy is that there are a lot of multimapping reads that you see in the browser which are not counted in matrix.mtx.
Hi Alex,
I checked the CIGAR of reads mapped to the region and it seems all reads were uniquely mapped (the NH tags were 1). Still unclear why those reads are not counted...
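One way to sanity-check a single junction is to count, side by side, the uniquely mapped reads that span it and the distinct cell barcode/UMI pairs among those reads. The sketch below is not STARsolo's implementation. It assumes the BAM was written with `NH`, `CB` and `UB` tags and dumped to SAM text (for example with `samtools view`), and that a missing barcode or UMI is written as `-`. The function names are made up for illustration.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def junctions(pos, cigar):
    """Yield (intron_start, intron_end), 1-based inclusive, for each N op."""
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            yield (ref, ref + length - 1)
        if op in "MDN=X":  # operations that consume reference bases
            ref += length


def junction_support(sam_lines, chrom, start, end):
    """Return (unique_reads, distinct_cb_ub_pairs) spanning one junction."""
    reads = 0
    umis = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) < 11 or f[2] != chrom:
            continue
        # Optional fields look like TAG:TYPE:VALUE, e.g. NH:i:1 or CB:Z:ACGT
        tags = {t[:2]: t[5:] for t in f[11:]}
        if tags.get("NH") != "1":  # keep uniquely mapped reads only
            continue
        if (start, end) not in set(junctions(int(f[3]), f[5])):
            continue
        reads += 1
        cb, ub = tags.get("CB", "-"), tags.get("UB", "-")
        if cb != "-" and ub != "-":  # assumption: "-" marks a missing value
            umis.add((cb, ub))
    return reads, len(umis)

# Hypothetical usage, with reads.sam produced by `samtools view`:
# with open("reads.sam") as fh:
#     print(junction_support(fh, "chr10", 127296634, 127296872))
```

If the second number is far smaller than the first, UMI collapsing or missing barcodes would explain at least part of the gap. If both are large, the cause lies elsewhere.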
I am also experiencing the same inconsistencies. I am testing STARsolo on simulated RNA-seq data in SmartSeq format with no read deduplication, so identical reads should not be removed. The total number of uniquely mapped SJ reads in the SJ.out.tab file does not match the total number of SJ reads in the Solo.out/SJ/raw/matrix.mtx file. The number of reads in the mtx file is typically 2/3 lower than in the SJ.out.tab file.
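The comparison described above can be reproduced with a small script. This is a sketch: it reads the uniquely mapped read count from column 7 of SJ.out.tab and sums the Matrix Market entries per row of matrix.mtx. It assumes that row i of the matrix corresponds to line i of the junction list used as the features file, which is worth verifying for your STAR version. The function names are illustrative.

```python
from collections import defaultdict


def sj_unique_counts(sj_lines):
    """Return {(chrom, start, end): uniquely_mapped_reads} from SJ.out.tab."""
    counts = {}
    for line in sj_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue
        # Column 7 is the number of uniquely mapping reads crossing the junction
        counts[(fields[0], int(fields[1]), int(fields[2]))] = int(fields[6])
    return counts


def mtx_row_sums(mtx_lines):
    """Sum Matrix Market coordinate entries per (1-based) row index."""
    sums = defaultdict(int)
    seen_dims = False
    for line in mtx_lines:
        line = line.strip()
        if not line or line.startswith("%"):  # comments and banner
            continue
        if not seen_dims:
            seen_dims = True  # first data line is "rows cols nnz"
            continue
        row, _col, value = line.split()
        sums[int(row)] += int(float(value))
    return dict(sums)

# Hypothetical usage with the files named in this thread:
# with open("SJ.out.tab") as sj, open("Solo.out/SJ/raw/matrix.mtx") as mtx:
#     total_sj = sum(sj_unique_counts(sj).values())
#     total_mtx = sum(mtx_row_sums(mtx).values())
#     print(total_sj, total_mtx, total_mtx / total_sj)
```

Comparing per-junction values rather than only the totals shows whether the deficit is spread evenly or concentrated in particular junctions.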
It would be nice to address this issue, as STARsolo has been an amazing tool to use with single-cell RNA-seq datasets.
Hi Alex,
I am trying to analyze our customized 10x snRNA-seq data to look at alternative splicing, but I found that the junction read counts stored in matrix.mtx do not match what I observe in the BAM files in IGV. The execution command is as follows:
For example, the junction chr10 127296634 127296872 (an exclusive junction of the skipped exon shown in the IGV image) seems to have more than 1,000 reads based on the IGV sashimi plot, but the count in matrix.mtx is actually 1. The BAM file shown in IGV is the STARsolo output BAM after UMI deduplication with UMI-tools. I found many regions showing this kind of junction count inconsistency.
I found a similar issue, #2049, and understand that the counts in matrix.mtx are after UMI deduplication, but it is still unclear how the junction count matrix is created and why my data show this inconsistency.
Do you have any ideas on possible causes?
Best,