hyjforesight closed this 2 years ago
SoupX does not normalise the count matrix, but does result in non-integer counts which you can think of as representing probabilities that a count occurred (similar to the output from pseudo-aligners).
The file size is larger as more space is required to store a floating point number than an integer.
I suspect that scvelo guesses the data have been normalised when it sees non-integers. I would suggest setting `roundToInt=TRUE` when running `adjustCounts`, which will produce an output count matrix that consists only of integers and should not confuse downstream tools.
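To see why non-integer values might trip up a downstream tool, here is a minimal Python sketch of the kind of "does this look like raw counts?" heuristic described above. It is not scVelo's actual check: the function name `looks_like_raw_counts` and the toy matrices are hypothetical, and `np.rint` is plain rounding used only for illustration, not necessarily how `roundToInt=TRUE` rounds inside SoupX.

```python
import numpy as np


def looks_like_raw_counts(matrix):
    """Return True if every entry is a non-negative whole number."""
    values = np.asarray(matrix, dtype=float)
    return bool(np.all(values >= 0) and np.all(values == np.floor(values)))


# Hypothetical toy matrices (cells x genes), not real CellRanger or SoupX output.
raw_counts = np.array([[3, 0, 1], [0, 2, 5]])
corrected = np.array([[2.7, 0.0, 0.4], [0.0, 1.9, 4.6]])

# Plain rounding, for illustration only (an assumption, not SoupX's method).
rounded = np.rint(corrected)

print(looks_like_raw_counts(raw_counts))  # True
print(looks_like_raw_counts(corrected))   # False
print(looks_like_raw_counts(rounded))     # True
```

A matrix that fails this kind of check has not necessarily been normalised. It may simply hold fractional corrected counts, which is the situation described here.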
Hello @yihui and @gtca,

Thanks for developing this amazing package. I'm wondering what SoupX does to the CellRanger-generated matrix files, because the sizes of the SoupX outputs are different from CellRanger's. Moreover, scVelo finds that `adata.X` was normalized by SoupX and is no longer raw data, so it cannot be used for RNA velocity analysis. Does SoupX normalize the matrix X?

Thanks!
Best, YJ

The sizes of SoupX outputs are much bigger than CellRanger's outputs.

![image](https://user-images.githubusercontent.com/75048821/162795255-ba87e4c9-b453-4f15-8afe-ea1913b41bef.png)
`adata.X` was normalized by SoupX and is no longer raw data. First, let's check the CellRanger raw data. Then, let's check the SoupX outputs.