Closed jmodlis closed 6 years ago
Dear Jen,
Thanks for your message.
What are the dimensions of the methyl.shared.trans and rna.shared.trans datasets? Some things might go wrong when datasets have too few rows to partition into folds.
Does
fit <- o2m(methyl.shared.trans, rna.shared.trans, 1, 1, 1)
produce any errors/strange results? Could you print the output of summary(fit)?
Could you also run
loocv_combi(methyl.shared.trans, rna.shared.trans, 1, 1, 1, app_err=F, func=o2m, kcv = 2, stripped = TRUE)
and report the output?
Thanks in advance for your reply!
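To illustrate why few rows can be a problem when partitioning into folds, here is a minimal base-R sketch. The helper `fold_sizes` is hypothetical and not part of OmicsPLS; it assumes rows are dealt to folds round-robin, which may differ from how `loocv_combi` actually splits the data. The value 23 is the row count reported later in this thread.

```r
# Hypothetical helper (not an OmicsPLS function): count how many rows
# land in each of k folds when n rows are assigned round-robin.
fold_sizes <- function(n, k) {
  fold_id <- rep_len(seq_len(k), n)  # 1, 2, ..., k, 1, 2, ...
  as.integer(table(factor(fold_id, levels = seq_len(k))))
}

sizes_ok  <- fold_sizes(23, 2)  # 23 rows, 2 folds: both folds are populated
sizes_bad <- fold_sizes(3, 5)   # fewer rows than folds: some folds are empty

print(sizes_ok)
print(sizes_bad)
```

With 23 rows and 2 folds each fold still holds about half of the samples, so fold size alone is unlikely to be the issue in that case.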
Hi Said,
Thanks so much for your response!
The dimensions of methyl.shared.trans are:
> dim(methyl.shared.trans)
[1] 23 242714
The dimensions of rna.shared.trans are:
> dim(rna.shared.trans)
[1] 23 15846
When I run the fit command, it seems to be ok...
summary(fit)
Summary of the O2PLS fit
-- Call: o2m(X = methyl.shared.trans, Y = rna.shared.trans, n = 1, nx = 1, ny = 1)
-- Modeled variation
-- Total variation:
in X: 5582422
in Y: 364458
-- Joint, Orthogonal and Noise as proportions:
           data X data Y
Joint       0.046  0.257
Orthogonal  0.046  0.073
Noise       0.908  0.670
-- Predictable variation in Y-joint part by X-joint part:
Variation in Yhat relative to U: 0.979
-- Predictable variation in X-joint part by Y-joint part:
Variation in Xhat relative to T: 0.979
(cutoff the rest)
Here is the output of loocv_combi(methyl.shared.trans, rna.shared.trans, 1, 1, 1, app_err=F, func=o2m, kcv = 2, stripped = TRUE):
loocv_combi(methyl.shared.trans, rna.shared.trans, 1, 1, 1, app_err=F, func=o2m, kcv = 2, stripped = TRUE)
Data is not centered, proceeding...
Using Power Method with tolerance 1e-10 and max iterations 100
Power Method (comp 1) stopped after 37 iterations.
Power Method (comp 2) stopped after 26 iterations.
Power Method (comp 1) stopped after 39 iterations.
Data is not centered, proceeding...
Using Power Method with tolerance 1e-10 and max iterations 100
Power Method (comp 1) stopped after 31 iterations.
Power Method (comp 2) stopped after 27 iterations.
Power Method (comp 1) stopped after 31 iterations.
$CVerr
[1] 2.010063
$Fiterr
[1] NA
I did run scale2 on these datasets beforehand to center them and scale the variance, so I'm not sure why it says the data is not centered. I am very new to data integration methods, so I wonder if I am missing something simple!
Thanks again, Jen
Hi Jen,
Sorry to keep you waiting!
OK, I think I found the bug: it was in the fitting function for high-dimensional data. Can you update the package using
devtools::install_github('selbouhaddani/OmicsPLS')
and run your original code again?
I'll send the new version to CRAN tomorrow.
Best, Said
PS: In cross-validation, a subset of the data is taken. This subset does not have to have a mean of exactly zero, so the "Data is not centered" message can appear even after scaling the full dataset, but that's no problem.
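A small base-R sketch of the point in the PS, using simulated toy data. Base `scale()` is used here as a stand-in for `scale2`, and the split into rows 13 to 23 is an arbitrary choice for illustration, not the split that the package makes.

```r
set.seed(1)

# Toy matrix with 23 rows, mimicking the sample size in this thread.
X <- matrix(rnorm(23 * 5), nrow = 23)

# Center and scale every column on the FULL data set.
Xc <- scale(X, center = TRUE, scale = TRUE)

# Column means of the full, centered data: zero up to rounding error.
full_means <- colMeans(Xc)

# Take a training subset, as one cross-validation fold would.
train <- Xc[13:23, , drop = FALSE]

# Column means of the subset: generally not zero any more.
subset_means <- colMeans(train)

print(max(abs(full_means)))
print(max(abs(subset_means)))
```

The full data has column means of zero, while the subset does not, which is consistent with the message showing up inside cross-validation only.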
Hi Said,
No worries, you have been very helpful and quick!
When I run the code again, I get values for MSE instead of NA.
crossval_o2m_adjR2(methyl.shared.trans, rna.shared.trans, 1:3, 0:3, 0:3, nr_folds = 2, nr_cores = 4)
minimum is at n = 2
Elapsed time: 528.67 sec
       MSE n nx ny
1 2.012393 1  0  3
2 2.009583 2  0  2
3 2.044003 3  0  3
Thanks again for your help! Jen
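For readers who want to pick the best row out of such a table themselves, here is a minimal base-R sketch. The data frame `cv` is typed in by hand from the output in this thread; it is not the object returned by `crossval_o2m_adjR2`, whose exact structure is not shown here. The all-NA variant only illustrates base-R behaviour and is not a statement about the package's internals.

```r
# MSE table copied by hand from the cross-validation output in this thread.
cv <- data.frame(
  MSE = c(2.012393, 2.009583, 2.044003),
  n   = 1:3,
  nx  = c(0, 0, 0),
  ny  = c(3, 2, 3)
)

# Row with the smallest cross-validated MSE.
best <- cv[which.min(cv$MSE), ]
print(best)

# If every MSE is NA, which.min() returns integer(0), so no row is selected.
cv_na <- cv
cv_na$MSE <- NA_real_
best_na <- cv_na[which.min(cv_na$MSE), ]
print(nrow(best_na))
```

In base R an all-NA column yields no minimum at all, so any code that selects the best n this way would have nothing to report.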
Great to hear! Glad that I can close this thread. If you have any more questions/remarks, please let me know.
Hi,
I'm trying to run OmicsPLS on an RNA-Seq and Methyl-array dataset. When I run crossval_o2m_adjR2, I get an MSE of "NA" for every row. These results do not look valid, especially since there is no n value given. Do you have any insight into what is going on?
Thanks! Jen
Command/output:
crossval_o2m_adjR2(methyl.shared.trans, rna.shared.trans, 1:3, 0:3, 0:3, nr_folds = 2, nr_cores = 4)
minimum is at n =
Elapsed time: 570.87 sec
  MSE n nx ny
1  NA 1  0  3
2  NA 2  0  2
3  NA 3  0  3