JinmiaoChenLab / FastIntegration

FastIntegrate integrates thousands of scRNA-seq datasets and outputs batch-corrected values for downstream analysis

after integration, data slot contains negative values #7

Open Liuzekai666 opened 1 year ago

Liuzekai666 commented 1 year ago

Hello, author of FastIntegration. I have heard of your DISCO work and found this package well suited to my need for fast integration of my own data. However, even though I followed your tutorial step by step, the values in my FeaturePlots always look like scaled data, unlike what you show on the DISCO website, where all values are above zero. I don't know what causes this difference and hope to get your reply. Thank you!

lmw123 commented 1 year ago

Hello Zekai,

I appreciate your interest in FastIntegration. I wanted to clarify that the gene expression data we provide in DISCO is still in its raw, normalized form. If you have any further questions or need additional information, please don't hesitate to ask.

Best, Mengwei

Liuzekai666 commented 1 year ago

If I understand your reply correctly, you mean that even after integrating data from different platforms and technologies with FastIntegration, the expression values shown in all plots are still the normalized counts from the RNA assay of the individual samples?

zhengrongbin commented 7 months ago

Dear DISCO developers,

Congratulations on the fantastic database. I have a similar question: is there any way to download the batch-effect-corrected (integrated) gene expression matrix? There appear to be three ways to download your data: (1) the DISCO website, under the "Download" tab; (2) the DISCO website, under the repository tab; (3) through the FastIntegration package. It turns out that the h5ad and rds files under the "Download" tab only include read counts. I can find normalized (float) values in the rds files downloaded via (2) and (3). Could you please clarify whether the normalized data are log10-normalized read counts or CCA-normalized data, i.e. corrected for batch effects? I appreciate your comments and look forward to hearing from you. Thanks!
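
For anyone trying to tell these cases apart on a downloaded file, here is a rough heuristic sketch, assuming the expression matrix has already been exported to a NumPy array. The function name `describe_expression_values` and the toy matrices are made up for illustration; they are not part of FastIntegration or DISCO. The idea is only that raw counts are non-negative integers, log-normalized values are non-negative floats, and negative values can only come from scaling or a correction step.

```python
import numpy as np

def describe_expression_values(matrix):
    """Heuristically label an expression matrix by the range of its values."""
    values = np.asarray(matrix, dtype=float)
    if values.size == 0:
        return "empty"
    if (values < 0).any():
        # Log-normalized counts can never be negative, so negatives point to
        # scaling (z-scores) or a batch-correction step.
        return "scaled or batch-corrected (contains negative values)"
    if np.all(values == np.round(values)):
        return "raw counts (non-negative integers)"
    return "normalized (non-negative, non-integer values)"

# Toy matrices, genes x cells (made up for illustration, not DISCO data)
counts = np.array([[0, 3, 1], [2, 0, 5]])
# Per-cell library-size normalization followed by natural log1p
lognorm = np.log1p(counts / counts.sum(axis=0) * 1e4)
# Centering each gene stands in for any step that can produce negatives
corrected = lognorm - lognorm.mean(axis=1, keepdims=True)

for name, m in [("counts", counts), ("lognorm", lognorm), ("corrected", corrected)]:
    print(name, "->", describe_expression_values(m))
```

This check cannot say which normalization or correction method was used, and a corrected matrix that happens to have no negative values would be labeled as normalized, so it is a first sanity check rather than an answer to the question above.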