Closed vkartha closed 1 year ago
I passed in the 10X clusters (found in a subdirectory of the 10X outputs, like .../count/analysis/clustering/gene_expression_graphclust/clusters.csv), since the documentation said the choice of clustering doesn't seem to change the outcome much. This saves me from doing any processing in R to produce clusters, so I can still do the normal processing / filtering / clustering etc. after SoupX. I also felt a little uncertain about this step and about using the 10X clusters, so I'm curious to hear anything more on this topic.
The default workflow of load10X and autoEstCont does exactly as derrik-gratz suggests. In theory, you will get better results if your clusters correspond to well-annotated cell types. In practice, though, any graph-based clustering, such as the one cellranger performs, is good enough.
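For anyone landing here, a minimal sketch of that default workflow in R might look like the following. The directory path is a placeholder, and whether load10X picks up the clusters on its own depends on your Cell Ranger output layout and SoupX version, so the manual setClusters step is shown as a fallback. The Barcode and Cluster column names are assumed from the usual Cell Ranger clusters.csv format.

```r
library(SoupX)

# Placeholder path: the Cell Ranger output directory that holds the
# raw and filtered matrices plus the analysis/ folder.
data_dir <- "path/to/cellranger/output"

# Load raw (droplet) and filtered (cell) counts. When load10X finds the
# Cell Ranger graph-based clusters, it attaches them automatically.
sc <- load10X(data_dir)

# Fallback (assumption: needed only if the clusters were not picked up):
# read clusters.csv and attach it as a named vector, barcode -> cluster.
clusters <- read.csv(file.path(
  data_dir, "analysis", "clustering",
  "gene_expression_graphclust", "clusters.csv"
))
sc <- setClusters(sc, setNames(clusters$Cluster, clusters$Barcode))

# Estimate the contamination fraction using the clusters.
sc <- autoEstCont(sc)

# Produce the corrected count matrix for downstream QC / clustering.
corrected <- adjustCounts(sc)
```

The corrected matrix can then go through the usual filtering, normalization and clustering, so no annotation is needed before running SoupX.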
Hi, I'm interested in testing SoupX for ambient RNA detection / correction. In most cases we start from gene x "cell" counts, prior to QC filtering / normalization / clustering / cell annotation. I see that your vignettes provide cluster annotations from the start to help SoupX estimate ambient RNA contamination rates. In most cases these aren't available yet, since it's the unprocessed data in which we want to flag ambient RNA to begin with. Am I missing something?