selbouhaddani / OmicsPLS

R package for High dimensional data analysis and integration with O2PLS!
https://doi.org/10.1186/s12859-018-2371-3

On the model statistics #14

Closed krassowski closed 5 years ago

krassowski commented 5 years ago

R2Xcorr is currently computed as:

https://github.com/selbouhaddani/OmicsPLS/blob/69086e54f6730a33d80d7bfa759a2228112a1636/R/OmicsPLS_o2m.R#L712-L713

I wonder why it is not R2Xcorr <- ssq(Tt %*% t(W)) / ssq(X_true) as it would be suggested by the Table 2 of Evaluation of O2PLS in Omics data integration. I understand that something elsewhere in the code may make the two expressions equivalent, but I could not find it in the codebase. I would be very grateful for a hint.

Also, I wanted to thank you for sharing your work and to apologize for opening so many issues on GitHub; I can help fix the minor typos I found if you are willing to accept PRs. To my knowledge, this is not only the only open-source package offering O2PLS, but also a well-designed and well-documented one, and I hope I can help make it more robust and use it again in the future!

Edit: I think that some other statistics may require more attention.

selbouhaddani commented 5 years ago

> I wonder why it is not R2Xcorr <- ssq(Tt %*% t(W)) / ssq(X_true) as it would be suggested by the Table 2 of Evaluation of O2PLS in Omics data integration.

You can verify that ssq(Tt %*% t(W)) == ssq(Tt), since W and C have orthonormal columns (t(W) %*% W and t(C) %*% C are identity matrices). However, this does not hold for P_Yosc and P_Xosc. I corrected that in the latest version.
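The identity can be checked numerically with a minimal sketch like the one below. It is not taken from the package: the matrices are random stand-ins rather than output of a fitted model, the dimensions are arbitrary, and `ssq` is redefined locally under the assumption that it returns the sum of squared entries. The check works because `ssq(A)` equals the trace of `t(A) %*% A`, so a loading matrix with orthonormal columns drops out of the product.

```r
# Local stand-in for ssq(); assumed to be the sum of squared entries.
ssq <- function(A) sum(A^2)

set.seed(1)
n <- 20; p <- 6; r <- 2  # arbitrary sizes, chosen for illustration only

# Hypothetical joint scores (n x r) and loadings (p x r).
Tt <- matrix(rnorm(n * r), n, r)
W  <- qr.Q(qr(matrix(rnorm(p * r), p, r)))  # columns are orthonormal

# Orthonormal columns mean t(W) %*% W is the r x r identity ...
stopifnot(isTRUE(all.equal(crossprod(W), diag(r))))

# ... so multiplying the scores by t(W) leaves the sum of squares unchanged.
same_norm <- isTRUE(all.equal(ssq(Tt %*% t(W)), ssq(Tt)))

# Hypothetical loadings whose columns are NOT unit length, standing in for
# matrices such as P_Yosc or P_Xosc: the shortcut no longer applies.
P <- 2 * W
ratio <- ssq(Tt %*% t(P)) / ssq(Tt)

print(same_norm)
print(ratio)
```

With loadings that are not orthonormal, the full product (here `Tt %*% t(P)`) has to be formed before taking the sum of squares, which is why the statistics based on P_Yosc and P_Xosc needed a separate correction.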

I really appreciate your time and effort to improve the package. For me, this is the main reason to make an open-source package for O2PLS: to have others benefit from the software and to gain understanding myself via issues found by others.