Open yuGithuuub opened 4 years ago
Hi,
First of all, we benchmarked various batch-effect correction methods. Some of them operate on the raw genes x cells matrix, while others only work in a reduced feature space, so we decided to use the PCA embedding matrix as the common input. Cells in PCA space are also less noisy than cells in gene expression space: assuming the leading PCs capture most of the main characteristics of the cells, it is easy to see why the statistical test gives better results in the reduced space.
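To make concrete what the test operates on, here is a minimal, simplified sketch of a kBET-style check in plain Python. It is not the kBET package's implementation: the function name `rejection_rate`, the toy data, the fixed neighbourhood size and the hard-coded two-batch critical value are all assumptions made for illustration. Each row of the input is one cell, and its columns can be PCA coordinates or gene values. The check only needs distances between rows plus the batch labels.

```python
import math
from collections import Counter

# Chi-squared critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_DF1_ALPHA05 = 3.841


def rejection_rate(cells, batches, k):
    """Fraction of cells whose k-nearest-neighbour batch mix differs from the global mix.

    cells   -- list of coordinate tuples, one per cell (PCA coordinates or gene values)
    batches -- list of batch labels, one per cell (exactly two batches assumed here)
    """
    n = len(cells)
    global_freq = {b: c / n for b, c in Counter(batches).items()}
    assert len(global_freq) == 2, "this sketch hard-codes the df = 1 critical value"

    rejected = 0
    for cell in cells:
        # Neighbourhood: the k cells closest to this cell (the cell itself included).
        nearest = sorted(range(n), key=lambda j: math.dist(cell, cells[j]))[:k]
        observed = Counter(batches[j] for j in nearest)
        # Pearson chi-squared statistic against the expected counts k * global frequency.
        stat = sum((observed.get(b, 0) - k * f) ** 2 / (k * f)
                   for b, f in global_freq.items())
        if stat > CHI2_CRIT_DF1_ALPHA05:
            rejected += 1
    return rejected / n


# Toy data (hypothetical): 20 cells on a line, batches alternating, so well mixed.
mixed_cells = [(float(i), 0.0) for i in range(20)]
mixed_batches = ["A", "B"] * 10
mixed_rate = rejection_rate(mixed_cells, mixed_batches, k=6)

# Toy data (hypothetical): the two batches sit in two distant clusters.
split_cells = [(float(i), 0.0) for i in range(10)] + [(100.0 + i, 0.0) for i in range(10)]
split_batches = ["A"] * 10 + ["B"] * 10
split_rate = rejection_rate(split_cells, split_batches, k=6)

print(mixed_rate, split_rate)  # 0.0 for the mixed case, 1.0 for the separated case
```

A low rejection rate means the local batch composition matches the global one, i.e. the batches are well mixed. Because the result depends entirely on which cells end up as neighbours, noise in the input space directly affects the outcome.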
Hoa
Hello, thanks for your wonderful work! I have a question about the input to the kBET algorithm. I noticed that the input to kBET is the PCA embedding matrix of the integrated object rather than the cell_feature matrix, so I tested the following 3 inputs.
cell_feature matrix of integrated data: seurat_V3_直接用细胞.png.pdf
PCA embedding matrix of integrated data: seurat_V3_intergrated_PCA.pdf
PCA embedding matrix of raw data: serat_v3_sct.pdf
The results look better when the PCA embedding is used as the input. Why is this?