Open ColeWunderlich opened 1 year ago
Hi Cole,
Cell filtering (a.k.a. cell calling) is not done for the multi-gene outputs. You can do it with a separate STAR command, as explained here: https://github.com/alexdobin/STAR/blob/master/docs/STARsolo.md#cell-filtering-of-previously-generated-raw-matrix
Another option is to simply use the cell barcodes that were called for unique mappers; this should not make a big difference.
Hey Alex,
Thanks for the reply! I am not necessarily worried about cell calling but rather (final) quantification. The fact you are pointing me toward cell calling, however, may have revealed my misunderstanding. Would you say the following is true (when full EM has been run)?
Solo.out/<method>/raw
- matrix.mtx is the count matrix of only unique counts
- UniqueAndMult-EM.mtx is the count matrix of both unique and multi counts after being put through EM
- barcodes.tsv contains the entire contents of the 10x whitelist (many of these will have no counts)

Solo.out/<method>/filtered
- matrix.mtx is the count matrix of only unique counts after cell filtering/calling
- barcodes.tsv contains the results of the cell filtration/calling; all of these have been corrected (passed whitelist)

If the above is true, is there a way to get STARsolo to do the filtering on UniqueAndMult-EM.mtx? The command you linked to just has the option to supply the raw directory, but not to specify which matrix to use.
I tried making a new folder and using symlinks so that UniqueAndMult-EM.mtx was renamed to matrix.mtx, but the re-calling didn't work. For some reason it returned only 140 cells, despite the normal matrix.mtx returning ~5k cells.
Hi Cole,
These statements are correct.
> CBs for each matrix have been corrected? (not sure on this one)

Yes by default; this is controlled by the --soloCBmatchWLtype option.

> Uncorrected CB (or ones that failed to correct) are not reported? (also not sure)

They can be reported in the BAM output, but not in the count matrix.
The filtering works in my examples - however, I realized that it outputs the matrix rounded to integers, unlike the original EM matrix that contains non-integer values. I will need to fix this, but at the moment the simplest way is to use the filtered cells based on unique counts, and extract them from the EM matrix.
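A minimal sketch of that subsetting step, in case it helps. It assumes UniqueAndMult-EM.mtx is a Matrix Market coordinate file with one value per entry (row = feature, column = cell, 1-based), and that its columns follow the order of raw/barcodes.tsv. The function name `subset_mtx_by_barcodes` is hypothetical, not part of STAR. Values are copied as text, so the fractional EM counts are not rounded.

```python
def subset_mtx_by_barcodes(mtx_lines, raw_barcodes, keep_barcodes):
    """Keep only the columns (cells) listed in keep_barcodes.

    mtx_lines     -- lines of a Matrix Market coordinate file (assumed layout)
    raw_barcodes  -- barcodes in raw column order (raw/barcodes.tsv)
    keep_barcodes -- barcodes to keep (filtered/barcodes.tsv)
    Returns the subset matrix as a list of lines.
    """
    # 1-based column index of every barcode in the raw matrix.
    raw_index = {bc: i for i, bc in enumerate(raw_barcodes, start=1)}
    # Map each kept raw column to its new 1-based column, in keep_barcodes order.
    # A barcode missing from raw_barcodes raises KeyError here.
    new_col = {raw_index[bc]: j for j, bc in enumerate(keep_barcodes, start=1)}

    header, entries, n_rows = [], [], None
    for line in mtx_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("%"):
            header.append(line)            # banner and comment lines
        elif n_rows is None:
            n_rows = int(line.split()[0])  # size line: rows cols nnz
        else:
            row, col, value = line.split()
            col = int(col)
            if col in new_col:
                # The value stays a string, so fractional counts are untouched.
                entries.append(f"{row} {new_col[col]} {value}")

    size_line = f"{n_rows} {len(keep_barcodes)} {len(entries)}"
    return header + [size_line] + entries
```

To use it, read raw/barcodes.tsv and filtered/barcodes.tsv into lists of stripped lines, pass the opened UniqueAndMult-EM.mtx as `mtx_lines`, and write the returned lines to a new matrix file next to a copy of filtered/barcodes.tsv and the features file.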
Hey Alex,
Thanks for getting back to me. I will have to double check how I was trying to filter the EM matrix, but will go with subsetting for now.
Also, just to make sure, the <raw|filtered>/matrix.mtx contains only unique counts (i.e. derived from only unique reads), right?
Hi Cole,
> Also, just to make sure, the <raw|filtered>/matrix.mtx contains only unique counts (i.e. derived from only unique reads), right?

That's correct.
Hello,
I was wondering which matrix should be used for final quantification results when running STARsolo with the EM mode enabled.

I noticed in the raw directory there is a UniqueAndMult-EM.mtx file that has fractional counts, which one would expect from EM. In the filtered directory, however, there is only one matrix and all of the counts appear to be integers.

Does the final filtered matrix reflect the incorporation of the EM results with some sort of rounding applied? Or is the EM output only reflected in raw/UniqueAndMult-EM.mtx, while the filtered results are unique counts only?
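
One quick way to check which kind of matrix a given file is: scan it for non-integer values. The sketch below assumes a Matrix Market coordinate file with one value per entry; `has_fractional_counts` is a hypothetical helper name, not something STAR provides.

```python
def has_fractional_counts(mtx_lines):
    """Return True if any entry value in the matrix is not a whole number."""
    size_seen = False
    for line in mtx_lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue              # skip blank, banner and comment lines
        if not size_seen:
            size_seen = True      # first data line is the size line: rows cols nnz
            continue
        value = float(line.split()[2])
        if not value.is_integer():
            return True
    return False
```

Passing an opened file object works, since the function only iterates over lines.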