satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Reopening: Perform SCT and then merge while keeping variable features #5205

Closed: ms-gx closed this issue 3 years ago

ms-gx commented 3 years ago

Reopening, since my problem is not solved completely.

See for context: #5135

Your solution works in principle. However, it seems that the merge step keeps only the features which are present in all samples for the scale.data slot, right?

Now I have some genes that are highly variable but are absent from half of the samples (viral genes; I use a custom reference genome that includes the virus). With the above approach they are ignored for dimensionality reduction, since they are not in scale.data.

How about setting return.only.var.genes = FALSE in SCTransform? But then I cannot use the approach you suggested.

What would happen if I just declared all features as variable and then ran dimensionality reduction and UMAP/clustering? Is this problematic from an algorithmic point of view, or is it just more computationally demanding?

Originally posted by @ms-gx in https://github.com/satijalab/seurat/issues/5135#issuecomment-947379690

saketkc commented 3 years ago

You can run SCTransform(return.only.var.genes = FALSE) and get the list of variable features for each of your objects by running VariableFeatures(object) on it. Next, to select variable features, you can in principle use the strategy used in SelectIntegrationFeatures to pick the most informative features (the ones that would be used for integration if you were to proceed with that step):

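# pancreas.list is a list of Seurat objects, one per sample
pancreas.list <- lapply(X = pancreas.list, FUN = SCTransform, return.only.var.genes = FALSE)
features <- SelectIntegrationFeatures(pancreas.list)
# Merge the samples, then set the selected features as variable
merged_object <- merge(pancreas.list[[1]], y = pancreas.list[-1])
VariableFeatures(merged_object) <- features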
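
To make the selection strategy more concrete, here is a simplified base-R sketch of the ranking idea documented for SelectIntegrationFeatures: features are ranked by the number of samples in which they are variable, with ties broken by the median rank across samples. It is not Seurat's actual implementation. The gene names, the var.features list, and the rank_features helper are all made up for illustration.

```r
# Hypothetical per-sample variable-feature lists, ordered from most to
# least variable. "ViralX" stands in for a gene absent from one sample.
var.features <- list(
  sample1 = c("GeneA", "ViralX", "GeneB"),
  sample2 = c("GeneB", "GeneA", "GeneC"),
  sample3 = c("ViralX", "GeneA", "GeneD")
)

# Illustrative helper (not part of Seurat): rank features by the number of
# samples they are variable in, breaking ties by median rank.
rank_features <- function(var.features, nfeatures = 3) {
  all.features <- unique(unlist(var.features))
  # Number of samples in which each feature is variable
  n.samples <- sapply(all.features, function(f) {
    sum(sapply(var.features, function(v) f %in% v))
  })
  # Median rank over the samples where the feature is variable
  median.rank <- sapply(all.features, function(f) {
    ranks <- unlist(lapply(var.features, function(v) which(v == f)))
    median(ranks)
  })
  # More samples first; lower median rank wins ties
  ord <- order(-n.samples, median.rank)
  head(all.features[ord], nfeatures)
}

top.features <- rank_features(var.features, nfeatures = 3)
print(top.features)
```

In this toy input, ViralX is variable in only two of the three samples but still makes the top three, behind GeneA (variable in all three) and ahead of GeneB on median rank. A feature missing from some samples is therefore ranked lower under this scheme, not excluded outright.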