Closed MarkCBitter closed 5 years ago
poke @mblumuga
Hi @MarkCBitter, that is a good point: we do not have an argument to indicate that there are biological replicates. However, when looking at the scree plot of the eigenvalues, you should find that the appropriate number of PCs, K, is smaller than the number of treatments * number of replicates. K should be equal to the number of treatments - 1, whatever the number of replicates per treatment. Is that what you have found?
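To make the rule of thumb above concrete, here is a minimal sketch in plain Python. It is not part of pcadapt: the function name `expected_k` and the treatment labels are hypothetical, chosen only for illustration. It contrasts the upper bound on the number of PCs (number of pools - 1 for centered data) with the K suggested by the number of distinct treatments.

```python
def expected_k(treatment_labels):
    """Rule of thumb from the reply above: K = number of distinct treatments - 1.

    `treatment_labels` holds one label per pool (column); replicates of the
    same treatment share a label. Hypothetical helper, not a pcadapt function.
    """
    return len(set(treatment_labels)) - 1


# Hypothetical design: two treatments, three replicate pools each.
labels = ["A", "A", "A", "B", "B", "B"]

# With n pools, centered data can have at most n - 1 non-trivial PCs.
max_pcs = len(labels) - 1

k = expected_k(labels)
print(max_pcs, k)
```

The gap between the two numbers is the point: replicates add pools (and therefore possible PCs) but no new treatment contrasts.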
Hi @mblumuga,
Thank you very much for the quick reply on this. If I understand you correctly, this is not what I have found. When looking at the scree plot, K is simply equal to (number of treatments * number of replicates) - 1. For example, in a case with two treatments, each with 3 replicates, K is 5 (though from your response, K should be 1 in this case).
Please let me know if attaching any PC or scree plots would be helpful.
Indeed, these plots would be really helpful.
Below is one example with three "treatments" (to my understanding, K should be 2 in this case). The replicates are color-coded, and the "Embryo" sample has no replicate. It is clear that the replicates (D6High_Small and D6Low_Small) are not very tightly coupled, which I now suspect may be driving the incorrect K value.
In the above example, PC1 explains 28% of the variation and PC2 explains 25%.
K=2 is the right choice for you. PC1 corresponds to the differentiation between blue and the rest. PC2 corresponds to the differentiation between green and red.
I would also recommend running an environmental association analysis (e.g. with the software LFMM), using an environmental variable with 3 levels that correspond to the 3 colors of your graph. You can then combine the 2 scans using a meta-analysis (Fisher's method) or a Venn diagram.
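For reference, a minimal sketch of Fisher's method for combining two p-values per SNP, in plain Python. The function name and the example p-values are hypothetical and do not come from pcadapt or LFMM. The sketch relies on the fact that, under the null, -2 (ln p1 + ln p2) follows a chi-square distribution with 4 degrees of freedom, whose tail probability has a closed form. Fisher's method assumes the two p-values are independent, which may not hold exactly for two scans run on the same data, so the combined values are probably best treated as a ranking aid.

```python
import math


def fisher_combine_two(p1, p2):
    """Combine two p-values with Fisher's method (hypothetical helper).

    The statistic x = -2 * (ln p1 + ln p2) is chi-square with 4 degrees of
    freedom under the null; for 4 df the survival function is
    exp(-x / 2) * (1 + x / 2).
    """
    if not (0.0 < p1 <= 1.0 and 0.0 < p2 <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * (math.log(p1) + math.log(p2))
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)


# Made-up p-values for one SNP from the two scans.
p_scan_a = 0.05
p_scan_b = 0.05
combined = fisher_combine_two(p_scan_a, p_scan_b)
print(round(combined, 4))
```

The Venn-diagram alternative would instead keep the SNPs flagged by both scans after each has been thresholded separately.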
I will try this. Thank you @mblumuga!
Hello,
I am using pcadapt to analyze pooled sequencing data from a selection experiment. I am currently running analyses with one biological replicate for each treatment, as I cannot find anything in the documentation on how to tell pcadapt that some of the columns in the dataframe are actually biological replicates.
Is there any way to incorporate this information? Thank you very much.
Best, Mark