Open ekopylova opened 3 years ago
Hi Jenya,
Your assumptions are correct: if correctBatch has no files, then there are no potential batches. You can check this using the CI plot from the FindBatch folder.
Hari (cc'd) can help you further.
Can you share a snapshot of FindBatch results for Hari to help?
Thanks for your interest,
Anguraj
On Thu, Feb 25, 2021 at 11:02 AM Evguenia Kopylova notifications@github.com wrote:
Hello!
SPLS-DA and PLS-DA models built on our data, with a possible known batch effect as the response, can accurately classify samples into one of two batches (pR2Y = 0.05 and pQ2 = 0.05, based on 20 random permutations of the response labels to estimate the significance of R2Y and Q2Y). However, expBATCH finished after findBATCH, and I assume this is because the findBATCH function reported the batch as not significant. Could you comment on whether we should still remove the batch, even though findBATCH does not classify it as significant?
Thanks! Jenya
Thanks @gnyamundanda for the quick response!
Running PLS-DA using the batch as the response (through the ropls package), I obtain the following results:
PLS-DA
60 samples x 700 variables and 1 response
standard scaling of predictors and response(s)
2 excluded variables (near zero variance)
       R2X(cum)  R2Y(cum)  Q2(cum)  RMSEE  pre  ort  pR2Y  pQ2
Total    0.0858     0.948    0.565  0.102    2    0  0.05  0.05
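A note on reading pR2Y = 0.05 and pQ2 = 0.05: with only 20 permutations, an empirical p-value cannot go much below 0.05, so these values are the floor of the test rather than a precise estimate. The sketch below uses the common (k + 1) / (n + 1) estimator. Whether ropls computes its p-values with exactly this formula is an assumption, and the permuted Q2 values are made up for illustration.

```python
def permutation_p_value(observed, permuted):
    """Empirical p-value: share of statistics (observed one included) >= observed."""
    exceed = sum(1 for value in permuted if value >= observed)
    return (exceed + 1) / (len(permuted) + 1)


# Observed Q2(cum) taken from the PLS-DA summary above.
observed_q2 = 0.565

# Hypothetical Q2 values from 20 label permutations, all below the observed one.
permuted_q2 = [-0.20 + 0.01 * i for i in range(20)]

p = permutation_p_value(observed_q2, permuted_q2)
print(f"smallest attainable p-value with 20 permutations: {p:.4f}")
```

Raising the number of permutations (the `permI` argument of `ropls::opls`) would lower this floor and give a finer estimate of significance.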
However, when I run the same data through expBATCH, there appears to be no significant effect:
batchEffect.txt
         "Effect"            "LowerCI"          "UpperCI"
"pPC-1"  -0.076615559930736  -1.09629147519899  0.943060355337517
"pPC-2"  -3.14523049034484   -7.1707748451274   0.880313864437718
BioEFFECT.txt
         "Effect"            "LowerCI"            "UpperCI"
"pPC-1"  -3.30722887887888   -5.19626149735946    -1.41819626039831
"pPC-2"  0.961300863108799   -0.0199433882563451  1.94254511447394
To add, does it matter if the feature table we are working with is sparse? This is microbiome shotgun data, rather than gene expression, and includes many 0's. Thanks.
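For reference, the intervals in batchEffect.txt and BioEFFECT.txt quoted above can be checked mechanically. The sketch below flags a component when its interval excludes zero. It assumes that "interval excludes zero" is the criterion for a significant effect, which this thread does not confirm as the exact rule findBATCH applies. The numbers are copied from the two files.

```python
# (effect, lower CI, upper CI) per probabilistic principal component,
# copied from batchEffect.txt and BioEFFECT.txt as posted in this thread.
batch_effect = {
    "pPC-1": (-0.076615559930736, -1.09629147519899, 0.943060355337517),
    "pPC-2": (-3.14523049034484, -7.1707748451274, 0.880313864437718),
}
bio_effect = {
    "pPC-1": (-3.30722887887888, -5.19626149735946, -1.41819626039831),
    "pPC-2": (0.961300863108799, -0.0199433882563451, 1.94254511447394),
}


def excludes_zero(lower, upper):
    """True when the whole interval lies strictly on one side of zero."""
    return lower > 0 or upper < 0


def flag_components(table):
    """Map each component to whether its interval excludes zero."""
    return {pc: excludes_zero(lower, upper) for pc, (_, lower, upper) in table.items()}


batch_flags = flag_components(batch_effect)
bio_flags = flag_components(bio_effect)

for name, flags in (("batch", batch_flags), ("bio", bio_flags)):
    for pc, significant in flags.items():
        print(f"{name} {pc}: interval excludes zero = {significant}")
```

Under that assumption, neither batch interval excludes zero, while the biological effect on pPC-1 does. This is consistent with expBATCH stopping after findBATCH.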