question regarding the tutorial...

In R_block_11, why is perb.rare@matrix the variable plotted on the y axis? I would think that, if the expected rho value is to be plotted, rho.rare@matrix should be used instead.

Also, I was a bit concerned about the large difference between the point estimate of rho and the expected value of rho when examining the correlation of the same OTUs used in R_block_10. Please see the attachment: while the point estimates of rho exceeded 0.95 in many instances, the estimated values reached a maximum of ~0.8. I was expecting differences, but not such notable ones.

I also tried to investigate the range of correlations obtained by each approach (see attached). As is also seen in Figure 8, correlation coefficients from E(rho) and Pearson tend to be smaller than those obtained by the point estimate (see the attached histogram of the sd for the whole dataset). I saw similar behavior when using in-house datasets. I cannot really formulate a reason for this observation, or even understand which should be considered the "gold standard". dist_rho.pdf correlations_prop.txt

You are correct, I should have plotted perb.rare@matrix instead. Thank you for pointing out the error. I have updated the figure and the code. Note, in reference to your second point, that the E(rho) for 11 vs 37246 is now even lower.

The point estimate includes much more random noise than the expected value does. Thus, the E(rho) value is more robust. One way to think about it is that E(rho) is the value you would observe if you could average the values across multiple technical replicates. When collecting associations, the investigator is interested in those that are not affected by random sampling, but that would have been observed upon replication.
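To illustrate the averaging idea, here is a minimal base-R sketch. It is not the tutorial's code: the toy count table, the helper names (`clr`, `rho_pair`, `dirichlet_instance`), the prior of 0.5 and the 128 Monte Carlo instances are all assumptions made for illustration. It computes rho once from a single clr transform of the counts (the point estimate), then again on many Dirichlet Monte Carlo instances of the same table, and averages those values to approximate E(rho).

```r
# Minimal sketch: point-estimate rho vs. an approximate E(rho), base R only.
set.seed(1)

# Hypothetical toy count table: 12 samples (rows) x 6 features (columns).
n_samples <- 12
n_feat <- 6
counts <- matrix(rpois(n_samples * n_feat, lambda = 20),
                 nrow = n_samples, ncol = n_feat)

# Centred log-ratio transform applied to each sample (row).
clr <- function(m) {
  lg <- log2(m)
  lg - rowMeans(lg)
}

# Proportionality rho for two clr-transformed feature vectors.
rho_pair <- function(a, b) 1 - var(a - b) / (var(a) + var(b))

# Point estimate: one clr of the counts plus an assumed uniform prior of 0.5.
clr_point <- clr(counts + 0.5)
rho_point <- rho_pair(clr_point[, 1], clr_point[, 2])

# One Dirichlet draw per sample, built from normalised gamma variates.
dirichlet_instance <- function(m, prior = 0.5) {
  g <- matrix(rgamma(length(m), shape = m + prior), nrow = nrow(m))
  g / rowSums(g)
}

# Approximate expected value: mean rho over many Monte Carlo instances.
n_mc <- 128
rho_mc <- replicate(n_mc, {
  clr_inst <- clr(dirichlet_instance(counts))
  rho_pair(clr_inst[, 1], clr_inst[, 2])
})
rho_expected <- mean(rho_mc)

c(point = rho_point, expected = rho_expected, mc_sd = sd(rho_mc))
```

The spread of `rho_mc` gives a rough sense of how much a single point estimate can move under random sampling alone. The toy data are random, so the numbers will not match those in the tutorial.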

Hi Erfan

Could you supply your R script as well? I do not understand where the issue is. Thanks

Greg Gloor, PhD Professor and Chair, Department of Biochemistry Schulich School of Medicine & Dentistry University of Western Ontario email: @.*** twitter: @gbgloor

On Oct 11, 2021, at 3:19 AM, erfanshek @.***> wrote:

Dear Professor Gloor,

Thank you for your review and clarifying the statistical dimension of Amplicon analysis. Although I'm following through with most concepts, when it comes to the adaptation of the R code, I'm having some problems.

Simply running the R code with the sample data does not result in reproducible graphs. There are many errors that I've encountered throughout. I'm presuming that this is due to the updated arrangements of R and the affiliated packages.

I would love to implement the novel approaches mentioned in your review paper in my new research paper, however, I'm finding it quite challenging to implement since the code will not even work for your provided dataset, let alone my input that might be slightly different.

The ALDEx2 graphs seem to be working just fine and are reproducible when it comes to the Review Paper. However, problems arise when

• Running f.t <- aldex.ttest(f.x, conds)
• Doing PCA Plots.

Would very much appreciate if an updated R script could be implemented for the newest versions of R.

I've attached my RStudio console errors, objects, and graphs (https://drive.google.com/drive/folders/1Ugl3v9XP0p5AhdTgoSprcXIMbsOPDFzu?usp=sharing). My versions are:

• R 4.1.1
• RStudio 1.4.1717
• ALDEx2 1.24.0
• car 3.0-11
• carData 3.0-4
• CoDaSeq 0.99.6
• zCompositions 1.3.4
• igraph 1.2.6
• grDevices 4.1.1
• propr 4.2.6
• vegan 2.5-7

Best,
Erfan


Dear Prof. Gloor,

I've also added my R script to the folder at the link above. The call aldex.ttest(f.x, conds) gives the following error:

Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'which': object 'f.t' not found

Therefore, I've added the argument paired.test = FALSE to work around the problem temporarily. When running the sample data for the PCA plots, I only see 3 points represented. I went over the different tables and saw that in the code line...

E.clr <- t(apply(exp, 2, function(x) log2(x) - mean(log2(x)) ))

... all values turn into 0 counts, which might be the cause of the problem. I'm still not sure, because the applied formula is correct. Overall, running the unmodified code from the review paper does not reproduce the PCA biplots and plots, so I'm unsure of the correct input format I must have for my own data.
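One way to narrow this down is to run the same clr expression on a small matrix whose result can be checked by hand. The sketch below is only a diagnostic aid under stated assumptions: `toy` and `toy_zero` are hypothetical matrices, and features are assumed to be in rows with samples in columns, which may not match the tutorial's `exp` object.

```r
# Toy check of the clr line quoted above, on a hypothetical matrix
# with features in rows and samples in columns (an assumption).
toy <- matrix(c(10, 20, 40,
                 5, 50, 500,
                 8,  8,  8),
              nrow = 3,
              dimnames = list(paste0("feat", 1:3), paste0("s", 1:3)))

# Same expression as in the thread, with `toy` in place of `exp`.
toy_clr <- t(apply(toy, 2, function(x) log2(x) - mean(log2(x))))

dim(toy_clr)        # samples are in rows after t()
rowSums(toy_clr)    # each clr-transformed sample sums to (numerically) zero
toy_clr["s3", ]     # a sample whose features are all equal gives all zeros

# A zero count makes log2() return -Inf, so the clr is no longer finite.
toy_zero <- toy
toy_zero[1, 1] <- 0
bad <- apply(toy_zero, 2, function(x) log2(x) - mean(log2(x)))
any(!is.finite(bad))
```

If a real table gives all zeros after this step, it is worth checking whether every value within each column is identical, whether the table is oriented as the code expects, and whether `exp` actually refers to the data object rather than to the base R function of the same name.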

Thank you for your prompt response. If you have any more questions, feel free to ask.

Best,

Erfan

ggloor / Frontiers_2017

question regarding the tutorial... #1