theislab / scib-reproducibility

Additional code and analysis from the single-cell integration benchmarking project
https://theislab.github.io/scib-reproducibility/
MIT License

What does scaling refer to? #19

Closed (flde closed 2 years ago)

flde commented 2 years ago

Many thanks for the great paper and for making the source code available. I have a rather basic question that I hope you can help me with: what do you mean when you say the data are scaled?

The count data are normalized by scran pooled size factors and log10+1 transformed. That is what I find in the paper and the sample pre-processing notebooks. But to me, scaling refers to zero mean and unit variance. Were the samples eventually scaled (z-scored) prior to integration? Or does scaling refer to scaling by the size factor?

Sorry for the confusion and many thanks!

LuckyMD commented 2 years ago

Hi @flde,

Scaling means that we applied sc.pp.scale() per batch prior to integration, i.e. each gene is standardized to zero mean and unit variance within each batch. The datasets were prepared unscaled, as shown in the notebooks. Scaling is a variable that is tested in the benchmarking Snakemake pipeline in scib-pipeline.
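For readers who want to see what "scaled per batch" amounts to numerically, here is a minimal NumPy sketch. It is not the pipeline's code: the function name `scale_per_batch`, the toy matrix, and the batch labels are made up for illustration, and the choice of the sample standard deviation (`ddof=1`) and the handling of constant genes are assumptions. In practice one would call sc.pp.scale() on each batch subset of the AnnData object.

```python
import numpy as np


def scale_per_batch(X, batches):
    """Z-score every gene (column) separately within each batch.

    Hypothetical helper for illustration only; X is a dense
    cells-by-genes matrix and `batches` holds one label per cell.
    """
    X = np.asarray(X, dtype=float)
    batches = np.asarray(batches)
    out = np.empty_like(X)
    for b in np.unique(batches):
        mask = batches == b
        sub = X[mask]
        mean = sub.mean(axis=0)
        # Assumption: sample standard deviation (ddof=1).
        std = sub.std(axis=0, ddof=1)
        # Genes that are constant within a batch would divide by zero;
        # using 1.0 here leaves them at 0 after centering.
        std[~(std > 0)] = 1.0
        out[mask] = (sub - mean) / std
    return out


# Toy log-normalized expression: 4 cells x 2 genes, two batches.
X = np.array([[1.0, 2.0],
              [3.0, 2.0],
              [10.0, 0.0],
              [14.0, 4.0]])
batches = np.array(["a", "a", "b", "b"])

scaled = scale_per_batch(X, batches)
print(scaled)
```

After this step each gene has zero mean within every batch, so a batch-specific shift in expression level (here, gene 0 being much higher in batch "b") is removed before integration. This differs from dividing by a size factor, which rescales each cell rather than each gene.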

I hope that helps!

flde commented 2 years ago

Hello @LuckyMD, many thanks for the clarification. I get a bit paranoid about terminology sometimes.
All the best!