Open byqmed opened 6 months ago
Please excuse my late response. It looks to me as though there are two batches within B3. I suggest you investigate whether there is any biological or experimental reason why B3 is split in two. This split likely confused the batch correction algorithm.
Let me know if this didn't help.
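In case it is useful, here is a minimal sketch of one way to check whether the B3 samples really fall into two groups. It uses only base R and a simulated genes-by-samples matrix; `expr`, `batch`, `subgroup` and `batch2` are hypothetical names, and the simulated shift is only there so the example has a split to find. Replace the toy data with your own expression table and batch labels.

```r
# Toy stand-in for an integrated expression table (genes x samples).
# `expr` and `batch` are hypothetical; substitute your own objects.
set.seed(1)
n_genes <- 200
expr <- matrix(rnorm(n_genes * 30), nrow = n_genes)
colnames(expr) <- paste0("S", 1:30)
batch <- rep(c("B1", "B2", "B3"), each = 10)

# Simulate a hidden split inside B3 by shifting half of its samples.
expr[, 26:30] <- expr[, 26:30] + 3

# Keep the B3 samples only.
b3 <- expr[, batch == "B3"]

# Hierarchical clustering of the B3 samples, cut into two groups.
hc <- hclust(dist(t(b3)))
subgroup <- cutree(hc, k = 2)

# List which samples land in each candidate sub-batch.
print(split(colnames(b3), subgroup))

# PCA on B3 alone (prcomp expects samples in rows); compare the
# mean PC1 score of the two groups to see how far apart they sit.
pca <- prcomp(t(b3), center = TRUE, scale. = FALSE)
print(tapply(pca$x[, "PC1"], subgroup, mean))

# If the split looks real, one option is to relabel B3 as two batches.
batch2 <- batch
batch2[batch == "B3"] <- paste0("B3_", subgroup)
print(table(batch2))
```

Compare the sample names in each group against your sample sheet (processing date, platform version, lab, tissue source) to see what explains the split. If it turns out to be technical, a relabeled vector like `batch2` could be passed in place of `pheno$batch`. If it is biological, treating it as a batch would remove real signal, so it is worth checking first.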
Hello, I followed all your steps for the batch correction with the following code:
dataFolders <- c("RNAseq", "Affymetrix", "Agilent")
sources <- c("RNAseq", "affymetrix", "agilent")
PATH_TO_DATA_FOLDERS <- "C:/Users/RStudio/GEDI"
datasets <- ReadGE(dataFolders, sources, path=PATH_TO_DATA_FOLDERS)
hsapiens_attr <- BM_attributes(species="hsapiens")
attr <- c("ensembl_gene_id", "affy_hg_u133_plus_2", NA)
dat <- GEDI(datasets, attributes=attr, BioMart=TRUE, species="hsapiens", path=PATH_TO_DATA_FOLDERS)
pheno <- read.csv("pheno.csv", header=TRUE, row.names=1)
summary(as.factor(pheno$batch))
summary(as.factor(pheno$status))
cData <- BatchCorrection(dat, pheno$batch, pheno$status, visualize=TRUE)
res <- VerifyGEDI(X=cData, y=pheno$status, batch=pheno$batch, model="logistic")
However, after BatchCorrection, my PCA and RLE plots still look like this:
Please help, thank you in advance!