Oshlack / missMethyl

Bioconductor package for analysis of methylation data from Illumina's Infinium HumanMethylation arrays.
10 stars 4 forks source link

RUVfit&RUVad usage in 4 samples data set (2 control/2 treatment) #11

Closed Haowen24 closed 1 year ago

Haowen24 commented 2 years ago

Hi, Oshlack, Thanks for developing this wonderful tool for methylation array GO analysis. I have encountered some issues when using the package for analysis of 4 samples (two control, two treatments). Could you please give me some suggestions on how could I work out the code? Thanks a lot!

Here is the code:

----swan---------------------------------------------------------------------

mSetSw <- SWAN(mSet,verbose=TRUE) [SWAN] Preparing normalization subset EPIC [SWAN] Normalizing methylated channel [SWAN] Normalizing array 1 of 4 [SWAN] Normalizing array 2 of 4 [SWAN] Normalizing array 3 of 4 [SWAN] Normalizing array 4 of 4 [SWAN] Normalizing unmethylated channel [SWAN] Normalizing array 1 of 4 [SWAN] Normalizing array 2 of 4 [SWAN] Normalizing array 3 of 4 [SWAN] Normalizing array 4 of 4

----betasByType, fig.cap = "Beta value dustributions. Density distributions of beta values before and after using SWAN.", echo = TRUE, fig.width=10, fig.height=5----

par(mfrow=c(1,2), cex=1.25) densityByProbeType(mSet[,1], main = "Raw") densityByProbeType(mSetSw[,1], main = "SWAN")

----filtering----------------------------------------------------------------

detP <- detectionP(rgSet) keep <- rowSums(detP < 0.01) == ncol(rgSet) mSetSw <- mSetSw[keep,]

----extraction---------------------------------------------------------------

set.seed(10) mset_reduced <- mSetSw[sample(1:nrow(mSetSw), 20000),] meth <- getMeth(mset_reduced) unmeth <- getUnmeth(mset_reduced) Mval <- log2((meth + 100)/(unmeth + 100)) beta <- getBeta(mset_reduced) dim(Mval) [1] 20000 4

----mdsplot, fig.cap = "MDS plot. A multi-dimensional scaling (MDS) plot of cancer and normal samples.", echo = TRUE, fig.small=TRUE----

par(mfrow=c(1,1)) plotMDS(Mval, labels=targets$Sample_Name, col=as.integer(factor(targets$Sample_Group))) legend("topleft",legend=c("Control","NEN"),pch=16,cex=1.2,col=1:2)

----design-------------------------------------------------------------------

group <- factor(targets$Sample_Group,levels=c("Control","NEN")) design <- model.matrix(~group) design (Intercept) groupNEN 1 1 0 2 1 0 3 1 1 4 1 1 attr(,"assign") [1] 0 1 attr(,"contrasts") attr(,"contrasts")$group [1] "contr.treatment"

----diffmeth-----------------------------------------------------------------

fit.reduced <- lmFit(Mval,design) fit.reduced <- eBayes(fit.reduced, robust=TRUE)

----diffmeth-results---------------------------------------------------------

summary(decideTests(fit.reduced)) (Intercept) groupNEN Down 8269 0 NotSig 739 20000 Up 10992 0 top<-topTable(fit.reduced,coef=2) top logFC AveExpr t P.Value adj.P.Val B cg01070657 1.0242676 5.053866 10.473365 0.0001473203 0.2775566 -0.01251911 cg22108713 0.9440847 -4.789453 9.405889 0.0002450956 0.2775566 -0.18369957 cg03883941 0.8267712 -4.624774 8.473417 0.0003999418 0.2775566 -0.37276517 cg16764494 0.7690931 -4.092296 8.129353 0.0004850494 0.2775566 -0.45417999 cg10779718 0.7621927 -1.398119 8.079786 0.0004990157 0.2775566 -0.46649762 cg25433188 0.9236696 -3.156027 8.185205 0.0004896679 0.2775566 -0.46995099 cg15559217 0.7335377 2.949935 7.897167 0.0005547747 0.2775566 -0.51323517 cg25324047 0.7888124 4.819406 7.711857 0.0006190966 0.2775566 -0.56293487 ch.4.172823451R 0.7862063 -3.022995 7.664116 0.0006370797 0.2775566 -0.57612587 cg20569048 -0.8841992 -4.686046 -7.613306 0.0006915296 0.2775566 -0.62726209

----top4, fig.cap = "Top DM CpGs. The beta values for the top 4 differentially methylated CpGs.", echo = TRUE, fig.width=10,fig.height=9----

cpgs <- rownames(top) par(mfrow=c(2,2)) for(i in 1:4){

  • stripchart(beta[rownames(beta)==cpgs[i],]~design[,2],method="jitter",
  • group.names=c("Control","NEN"),pch=16,cex=1.5,col=c(4,2),ylab="Beta values",
  • vertical=TRUE,cex.axis=1.5,cex.lab=1.5)
  • title(cpgs[i],cex.main=1.5)
  • }

----diffmeth2----------------------------------------------------------------

get M-values for ALL probes

meth <- getMeth(mSet) unmeth <- getUnmeth(mSet) M <- log2((meth + 100)/(unmeth + 100))

setup the factor of interest

grp <- factor(targets$Sample_Group, labels=c(0,1))

extract Illumina negative control data

INCs <- getINCs(rgSet) head(INCs) 205378030109_R01C01 205378030109_R02C01 205378030109_R05C01 205378030109_R06C01 10609447 -1.4482044 -0.5213464 -0.62403207 -0.61487808 10627500 -0.9267510 -1.0202724 -0.61585651 0.10236172 10676356 0.3069778 -0.6602506 0.01207283 0.57835977 10731326 1.6951454 0.7548875 2.03747471 1.34958444 10732387 -0.6615838 0.3434842 0.53981046 0.04466905 10740401 -1.5849625 -0.1530116 -1.81249823 -2.61470984

add negative control data to M-values

Mc <- rbind(M,INCs)

create vector marking negative controls in data matrix

ctl1 <- rownames(Mc) %in% rownames(INCs) table(ctl1) ctl1 FALSE TRUE 866238 411 rfit1 <- RUVfit(Y = Mc, X = grp, ctl = ctl1) # Stage 1 analysis Error in invvar(Y, ctl, XZ = cbind(X, Z), invsvd = invsvd) : In function invvar: You are running the inverse algorithm on fewer than four degrees of freedom. This is not currently supported, and very likely inadvisable as well. A possible work-around, if you are using RUVinv and you are sure you want to do this, is to set randomization=TRUE.

########### if I added randomization=TRUE to the RUVfit, it worked. But I still could not get rfit2 work.

rfit1 <- RUVfit(Y = Mc, X = grp, ctl = ctl1, randomization=TRUE) # Stage 1 analysis rfit2 <- RUVadj(Y = Mc, fit = rfit1) Error in if (temp[[4]] > 0) df = df + temp[[4]] else df = Inf : missing value where TRUE/FALSE needed

Could you please let me know how should i modify the code? ############## I am a fresh man in bioinformatic analysis, any help is appreciated.

Best, Haowen

JovMaksimovic commented 1 year ago

Hi Haowen, I know this reply is a bit late, but since you have so few samples (2 vs. 2), I recommend trying "ruv4" instead of the default "ruvinv" for RUVfit. I suggest you also use k=1. See the example in the vignette in section 7.2: https://bioconductor.org/packages/release/bioc/vignettes/missMethyl/inst/doc/missMethyl.html#removing-unwanted-variation-when-testing-for-differential-methylation

Haowen24 commented 1 year ago

Thank you very much.

Get Outlook for iOShttps://aka.ms/o0ukef


From: JovMaksimovic @.> Sent: Tuesday, March 21, 2023 8:32:53 PM To: Oshlack/missMethyl @.> Cc: Haowen24 @.>; Author @.> Subject: Re: [Oshlack/missMethyl] RUVfit&RUVad usage in 4 samples data set (2 control/2 treatment) (Issue #11)

Hi Haowen, I know this reply is a bit late, but since you have so few samples (2 vs. 2), I recommend trying "ruv4" instead of the default "ruvinv" for RUVfit. I suggest you also use k=1. See the example in the vignette in section 7.2: https://bioconductor.org/packages/release/bioc/vignettes/missMethyl/inst/doc/missMethyl.html#removing-unwanted-variation-when-testing-for-differential-methylationhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbioconductor.org%2Fpackages%2Frelease%2Fbioc%2Fvignettes%2FmissMethyl%2Finst%2Fdoc%2FmissMethyl.html%23removing-unwanted-variation-when-testing-for-differential-methylation&data=05%7C01%7C%7Ca015586da1d74e8abdaa08db2a861eff%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638150527756194785%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=RKZXYI01sz%2B0OiPK7z2gyOiiTesdsLl0dNGn9PFj2rw%3D&reserved=0

— Reply to this email directly, view it on GitHubhttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FOshlack%2FmissMethyl%2Fissues%2F11%23issuecomment-1478874311&data=05%7C01%7C%7Ca015586da1d74e8abdaa08db2a861eff%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638150527756204726%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Jrbk5IxHQt9ULxobuV27dnmnqHaU8dS%2Fiw%2FEETzkf7s%3D&reserved=0, or unsubscribehttps://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAWU6P3DDJASQXKBG6IQU47TW5JXGLANCNFSM5IYBULWQ&data=05%7C01%7C%7Ca015586da1d74e8abdaa08db2a861eff%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638150527756204726%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=YXpPLK43eRRBTJf74mmEOMxppDoTyG1YIkSE7sZR30A%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.***>