Question about scaling - GitHub Issues

Hey! Great that this repo is being used before we have started the outline for the community edition of best practices :).

I think the opinions are split on scaling at the moment. Scanpy's initial tutorial followed Seurat's tutorial and thus performed scaling. There was no separate evaluation of what should be done on the side of Scanpy. I would gather that the arguments for and against scaling are:

For: All genes contribute equally to PCA or other dimensionality reduction methods.
Against: The expression level of a gene is indicative of its relative importance.
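To make the "equal contribution" argument concrete, here is a minimal NumPy sketch on made-up data. The four per-gene standard deviations are assumptions chosen purely for illustration, and the PCA is done by hand via SVD rather than through Scanpy or Seurat.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy expression matrix: 500 cells x 4 independent genes.
# The per-gene standard deviations are invented for illustration only.
gene_sd = np.array([5.0, 1.0, 0.5, 0.1])
X = rng.normal(size=(500, 4)) * gene_sd


def variance_share(matrix):
    """Fraction of the total variance that each gene brings into PCA."""
    var = matrix.var(axis=0)
    return var / var.sum()


def pc1_loadings(matrix):
    """Absolute loadings of the first principal component."""
    centered = matrix - matrix.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return np.abs(vt[0])


# Scaling here means z-scoring each gene: zero mean, unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

share_unscaled = variance_share(X)
share_scaled = variance_share(X_scaled)
loadings_unscaled = pc1_loadings(X)

print("variance share, unscaled:", share_unscaled.round(3))
print("variance share, scaled:  ", share_scaled.round(3))
print("PC1 loadings, unscaled:  ", loadings_unscaled.round(3))
```

Without scaling, the first gene carries almost all of the variance and PC1 is essentially that gene. After scaling, every gene enters PCA with the same variance, which is exactly the property the "for" side wants and the "against" side objects to.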

I'm not sure whether scaling improves the signal-to-noise ratio or not. As far as I am aware, this has yet to be shown. In the tutorial I didn't perform scaling, as I felt that weighting all genes equally hides some biological signal. Other tutorials, such as the one for Slingshot, also don't perform scaling.

Basically, there is no best-practice recommendation for or against scaling yet, so it is optional at the moment. If you are keen, you could start a test on this and maybe we could come up with a recommendation?
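As a starting point, such a test could look roughly like the sketch below. Everything in it is an assumption: the helper names (`pca_scores`, `separation`, `compare_scaling`) are hypothetical, the separation score is just one simple choice, and the toy data only checks that the comparison runs. A real evaluation would need annotated datasets and more than one metric.

```python
import numpy as np


def pca_scores(matrix, n_comps=2):
    """Project cells onto the first n_comps principal components."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_comps] * s[:n_comps]


def separation(scores, labels):
    """Centroid distance between two groups divided by the pooled within-group spread."""
    a, b = scores[labels == 0], scores[labels == 1]
    between = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))
    within = np.sqrt((a.var(axis=0).sum() + b.var(axis=0).sum()) / 2)
    return between / within


def compare_scaling(X, labels, n_comps=2):
    """Separation score in PCA space without and with per-gene z-scoring."""
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0  # leave constant genes unchanged instead of dividing by zero
    X_scaled = (X - X.mean(axis=0)) / sd
    return {
        "unscaled": separation(pca_scores(X, n_comps), labels),
        "scaled": separation(pca_scores(X_scaled, n_comps), labels),
    }


# Toy data (made up): 200 cells, 20 genes, two groups differing in the first 3 genes.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 20)) * rng.uniform(0.2, 3.0, size=20)
X[labels == 1, :3] += 2.0

result = compare_scaling(X, labels)
print(result)
```

Which variant scores higher on simulated data depends entirely on how the simulation is set up, so the numbers from this toy example say nothing about real datasets.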

theislab / sc-best-practices-ce

Question about scaling #3