ChiLiubio / microeco

An R package for data analysis in microbial community ecology
GNU General Public License v3.0

NST, betaNTI and RCbray #326

Closed DpennaS closed 7 months ago

DpennaS commented 7 months ago

Hello!

I have a few doubts about how to use these analyses on my dataset. I am trying to understand its community assembly.

First, I ran an NST analysis:

x <- trans_nullmodel$new(ps_bromeliad_microeco)

x$cal_NST(method = "tNST", group = "Local", dist.method = "bray", abundance.weighted = TRUE, output.rand = TRUE, SES = TRUE)

x$cal_NST_test(method = "nst.boot")

stoc_bray <- x$res_NST$index.grp

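library(dplyr); library(ggplot2)  # provide %>%, mutate() and the plotting functions
nst_bray <- stoc_bray %>%
  mutate(NST.i.bray = NST.i.bray * 100) %>%  # express NST as a percentage
  ggplot(aes(x = group, y = NST.i.bray)) +
  geom_point(aes(color = group), size = 10)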

Almost every site (Local) in the dataset had NST > 50%.

Then I ran betaNTI and RCbray to quantify the processes and cross-check the stochasticity results I had:

tmp <- "./nti"; dir.create(tmp)

x$cal_ses_betamntd(runs = 999, abundance.weighted = TRUE, use_iCAMP = TRUE, iCAMP_tempdir = tmp)

x$cal_rcbray(runs = 999)

x$cal_process(use_betamntd = TRUE)

x$res_process

I got these results:

process                  percentage
variable selection       2.5
homogeneous selection    58.19
dispersal limitation     0
homogeneous dispersal    0
drift                    39.26

So, according to these results, the major processes are deterministic, which seems to contradict the NST values above 50%.
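As a quick check of that reading, the reported percentages can be summed by category. This is only a sketch in base R: the numbers are copied from the output above, and the grouping (both selection processes counted as deterministic, everything else as stochastic) is an assumption for illustration, not something the package reports.

```r
# Percentages copied from the res_process output reported in this issue.
res <- data.frame(
  process = c("variable selection", "homogeneous selection",
              "dispersal limitation", "homogeneous dispersal", "drift"),
  percentage = c(2.5, 58.19, 0, 0, 39.26)
)

# Assumed grouping: selection = deterministic, all other processes = stochastic.
res$category <- ifelse(grepl("selection", res$process),
                       "deterministic", "stochastic")

# Total share of each category.
shares <- tapply(res$percentage, res$category, sum)
print(shares)
```

Under this grouping, selection accounts for roughly 61% of the pairwise comparisons and drift for roughly 39%.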

I also ran betaNTI and RCbray for each site separately, and the results diverged even more: the site with the highest NST value also had the highest homogeneous selection and the least drift.
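For reference, a per-site summary of this kind can be sketched in base R. The sketch assumes the commonly used thresholds of the betaNTI/RCbray framework (|betaNTI| > 2 for selection, |RCbray| > 0.95 for dispersal, drift otherwise). The helper `classify_process` and the toy values are hypothetical, invented for illustration. They are not the package's code or data from this issue.

```r
# Hypothetical helper: assign one process to each pairwise comparison,
# assuming the usual thresholds (|betaNTI| > 2, |RCbray| > 0.95).
classify_process <- function(beta_nti, rc_bray) {
  ifelse(beta_nti > 2, "variable selection",
  ifelse(beta_nti < -2, "homogeneous selection",
  ifelse(rc_bray > 0.95, "dispersal limitation",
  ifelse(rc_bray < -0.95, "homogeneous dispersal", "drift"))))
}

# Toy within-site pairwise values (invented, not from the issue).
pairs <- data.frame(
  site     = c("A", "A", "A", "B", "B", "B"),
  beta_nti = c(-2.8, -3.1, 0.4, 2.5, -0.3, 1.1),
  rc_bray  = c(0.10, -0.20, 0.30, 0.50, 0.99, -0.97)
)
pairs$process <- classify_process(pairs$beta_nti, pairs$rc_bray)

# Percentage of pairwise comparisons assigned to each process, per site.
pct <- round(100 * prop.table(table(pairs$site, pairs$process), margin = 1), 2)
print(pct)
```

Because each percentage is a share of pairwise comparisons, a site with few samples has few pairs, and its percentages can swing strongly when a single comparison crosses a threshold.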

I am opening this issue to find out whether there is an error in my code, an error in the package code, or a deeper ecological question to answer. I know these models are constructed in different ways, but I have seen people compare their results, and they were fairly similar.

Thanks for the attention, Penna

ChiLiubio commented 7 months ago

Hi. The code you are running is correct, and the package source code is unit tested. At present, I cannot say which result is more reliable. Based on my limited knowledge and experience, though, I believe the most important conclusion to draw from a null model is how and why different groups change. Concluding from a single method that the community in a system is deterministic or stochastic is not meaningful, because these processes are inferred by the method and inherently carry a high degree of uncertainty.

The choice of method has a large effect. The effects of null model selection were known long before phylogenetic information was incorporated into null models, and many studies have specifically examined how different algorithms control the randomization of communities. I also think it is unreasonable to compare tNST with betaNTI directly, because they belong to different categories of methods, each with its own algorithm for controlling the null model.

Personally, I think that when benchmarking is difficult, one should choose a reasonably appropriate method for the ecological question at hand and avoid hasty conclusions. Instead, one should use the method to infer possible changes and the reasons for them. Null model results can also be partially validated with other categories of methods, such as constrained ordination, community function, and differential taxa analysis. It is more meaningful to use these methods to determine why communities are under different selection: which environmental factors, community functions, and changes in important taxa lead to the change in selection. I think this point is very important.

Comparing null model methods is difficult. The situation is somewhat similar to network methods: different network constructions rest on different assumptions, and the underlying algorithms differ so much that networks generated from real data are hard to compare directly. These are just my personal views, for your reference.

DpennaS commented 7 months ago

Hi there! Thanks for the awesome answer and your personal thoughts. I agree that it is unreasonable to compare the methods; I did it out of curiosity, to see whether the results matched. I will try to better understand the algorithms behind each method to work out why the results are so different. I am working with natural environments without controlling environmental variables, so there are slight changes in some parameters, but not drastic ones. I am therefore trying to understand and characterize the assembly of those natural environments, since no one has done it yet for their microbiota.

Thanks for the attention! Regards