Open SirKuikka opened 2 years ago
Hi,
Sure. Please see the following example code for a simulation.
“
library(nebula)
ng=50
fse=100
sig2=0.05
sig3=0.2
lambda_s = -2
count = c()
fs = rpois(ng,fse)
fs[which(fs==0)] = 1
n = sum(fs)
df = data.frame(num=1:n)
df$x = rnorm(n)
zt = rbinom(ng,1,0.5)
df$z = as.numeric(unlist(sapply(1:ng,function(x) rep(zt[x],fs[x]))))
df$id = as.numeric(unlist(sapply(1:ng,function(x) rep(x,fs[x]))))
df$cell = c(1:nrow(df))
indre = rgamma(ng,shape=1/(exp(sig2)-1),rate=1/(exp(sig2/2)*(exp(sig2)-1)))
df$ref = as.numeric(unlist(sapply(1:ng,function(x) rep(indre[x],fs[x]))))
df$wincell = rgamma(n,shape=(1/sig3),rate=(1/sig3))
df$offset = 1
effcell = 0 ## logFC of the cell-level variable
effsub = 0 ## logFC of the subject-level variable
df$y = rpois(n,exp(lambda_s+effcelldf$x+effsubdf$z+log(df$ref)+log(df$wincell)+log(df$offset)))
pred = cbind(1,df$x,df$z)
re=nebula(df$y, df$id,pred=pred,offset= df$offset,model='NBGMM',method='LN',kappa = 800, cpc=1e-10,verbose=FALSE)
“
Best regards,
Liang
From: SirKuikka @.> Sent: Monday, September 6, 2021 8:00 AM To: lhe17/nebula @.> Cc: Subscribed @.***> Subject: [lhe17/nebula] Data simulation in the original paper (#3)
Hi,
I would very much appreciate if you could show me how you simulated the data in the original paper. The manuscript explains some details, but seeing the R script would make repeating the simulation much easier.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/lhe17/nebula/issues/3 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AGDISUR2XG43KGZUYEMSHOLUASUL5ANCNFSM5DQJTIWA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub . https://github.com/notifications/beacon/AGDISUSTJ24BXPX3ZD7FWSTUASUL5A5CNFSM5DQJTIWKYY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4OXUNRYA.gif
Hi,
Thanks, this helps. I was wondering how exactly did you estimate the type I error for this simulation? The manuscript says: "We then assessed the impact of such biases on controlling the type I error rate by testing the fixed-effects predictors under a wide range of settings." Did you randomly sort the subject or cells into some groups/predictors (e.g. case vs control), and then estimated the error (proportion of false positives) under the assumption that there should be no positive findings in such comparisons? Is there some way to estimate type II error or other metrics for the simulation?
Hi,
To get the empirical type I error rate, we run the example code a large number of times (e.g., K times) under the null hypothesis effsub=0 or effcell=0, depending on which variable you want to test. We then count how many p-values of a variable are smaller than e.g., 0.05. This number divided by K will give you the estimated empirical Type I error rate under the 0.05 level. You can tune the model parameters (ng, fse, size, sig2, …, etc) to simulate under different situations.
Regarding your general statistical questions, please refer to other statistical books or materials. I have no sufficient expertise to answer those questions.
Best regards,
Liang
From: SirKuikka @.> Sent: Monday, September 6, 2021 12:57 PM To: lhe17/nebula @.> Cc: lhe17 @.>; Comment @.> Subject: Re: [lhe17/nebula] Data simulation in the original paper (#3)
Hi,
Thanks, this helps. I was wondering how exactly did you estimate the type I error for this simulation? The manuscript says: "We then assessed the impact of such biases on controlling the type I error rate by testing the fixed-effects predictors under a wide range of settings." Did you randomly sort the subject or cells into some groups/predictors (e.g. case vs control), and then estimated the error (proportion of false positives) under the assumption that there should be no positive findings in such comparisons? Is there some way to estimate type II error or other metrics for the simulation?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/lhe17/nebula/issues/3#issuecomment-913771084 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AGDISUSZNETBQA7WOKPRAHDUATXFXANCNFSM5DQJTIWA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub . https://github.com/notifications/beacon/AGDISUROGUQ45A6I6VEIG6LUATXFXA5CNFSM5DQJTIWKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOGZ3QUTA.gif
Hi,
So basically effsub=0 means the gene is not DE and if I want DE genes, then I set effsub!=0. That's what I was getting at. Thanks for the help.
Yes, that’s true.
Best regards,
Liang
From: SirKuikka @.> Sent: Wednesday, September 8, 2021 6:53 AM To: lhe17/nebula @.> Cc: lhe17 @.>; Comment @.> Subject: Re: [lhe17/nebula] Data simulation in the original paper (#3)
Hi,
So basically effsub=0 means the gene is not DE and if I want DE genes, then I set effsub!=0. That's what I was getting at. Thanks for the help.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/lhe17/nebula/issues/3#issuecomment-915131069 , or unsubscribe https://github.com/notifications/unsubscribe-auth/AGDISUUROCGLHMK5R35QTTLUA46BZANCNFSM5DQJTIWA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub . https://github.com/notifications/beacon/AGDISUUZSS6UVI7BBUA4KBTUA46BZA5CNFSM5DQJTIWKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOG2F4VPI.gif
Hi,
I would very much appreciate if you could show me how you simulated the data in the original paper. The manuscript explains some details, but seeing the R script would make repeating the simulation much easier.