You can run SCTransform(return.only.var.genes = FALSE) and get the list of variable features for each of your objects by running VariableFeatures(object) on it. Next, to select variable features for the merged object, you can in principle use the strategy used in SelectIntegrationFeatures to pick the most informative features (the ones that would be used for integration if you were to proceed with that step):
pancreas.list <- lapply(X = pancreas.list, FUN = SCTransform)
features <- SelectIntegrationFeatures(pancreas.list)
VariableFeatures(merged_object) <- features
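To make the selection strategy more concrete, here is a minimal base-R sketch of the general idea: rank features by the number of samples in which they are variable, and break ties by their median rank within the per-sample lists. This is a simplified illustration, not Seurat's actual implementation. The helper rank_shared_features and the toy gene names are hypothetical, and as far as I understand the real function additionally drops features that are not present in every object, which this sketch leaves out.

```r
# Toy per-sample variable-feature lists, ordered from most to least variable.
# The gene names (including "virusA") are made up for illustration.
vf.list <- list(
  s1 = c("INS", "GCG", "virusA", "SST"),
  s2 = c("GCG", "INS", "SST", "PPY"),
  s3 = c("INS", "virusA", "GCG", "PPY")
)

# Hypothetical helper (not part of Seurat): rank features by how many samples
# list them as variable, breaking ties by the median rank across those samples.
rank_shared_features <- function(vf.list, nfeatures = 3) {
  all.features <- unique(unlist(vf.list))
  # Number of samples in which each feature is variable
  n.samples <- sapply(all.features, function(f) {
    sum(sapply(vf.list, function(vf) f %in% vf))
  })
  # Median position of each feature in the lists that contain it
  median.rank <- sapply(all.features, function(f) {
    median(unlist(lapply(vf.list, function(vf) which(vf == f))))
  })
  # More samples first, then lower (better) median rank
  ord <- order(-n.samples, median.rank)
  head(all.features[ord], nfeatures)
}

features <- rank_shared_features(vf.list, nfeatures = 3)
print(features)
```

Note that under this kind of ranking, a feature that is variable in only some of the samples competes at a disadvantage against features that are variable everywhere, so whether it survives depends on nfeatures.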
Reopening, since my problem is not solved completely.
See for context: #5135
Your solution works in principle. However, it seems that the merge step keeps only the features that are present in all samples for the scale.data slot, right?
Now I have some genes that are highly variable but do not occur in half of the samples (viral genes; I have a custom reference genome that includes a virus). With the above approach they will be ignored for dimensionality reduction, since they are not in scale.data.
How about return.only.var.genes = FALSE in SCTransform? But then I cannot use the approach you suggested.
What would happen if I just declared all features as variable and then ran dimensionality reduction and UMAP/clustering? Is this problematic from an algorithmic point of view, or is it just computationally more demanding?
Originally posted by @ms-gx in https://github.com/satijalab/seurat/issues/5135#issuecomment-947379690
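The concern about features missing from some samples can be illustrated with a toy base-R example. It assumes, as suspected in the comment above, that combining the scaled matrices keeps only the intersection of their row names; this is a sketch of that assumption, not a statement about what Seurat's merge does internally. All object and gene names here are made up.

```r
set.seed(1)

# Toy scaled matrices (features x cells); "virusA" is present only in sample 2.
scale1 <- matrix(rnorm(6), nrow = 3,
                 dimnames = list(c("INS", "GCG", "SST"), c("c1", "c2")))
scale2 <- matrix(rnorm(8), nrow = 4,
                 dimnames = list(c("INS", "GCG", "SST", "virusA"), c("c3", "c4")))
scale.list <- list(s1 = scale1, s2 = scale2)

# Features shared by every sample versus features seen in at least one sample
shared <- Reduce(intersect, lapply(scale.list, rownames))
all.feats <- Reduce(union, lapply(scale.list, rownames))

# Features that would be lost if only the intersection were kept
dropped <- setdiff(all.feats, shared)
print(dropped)

# Combined matrix restricted to the shared features (the assumed behaviour)
merged.scale <- do.call(cbind, lapply(scale.list, function(m) m[shared, , drop = FALSE]))
print(dim(merged.scale))
```

Running the same kind of setdiff between the intended variable features and the row names of the merged scale.data would show which genes silently drop out before dimensionality reduction.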