MoBiodiv / mobr

Tools for analyzing changes in diversity across scales

towards more consistent computation of alpha, gamma, and beta diversity #267

Closed dmcglinn closed 6 months ago

dmcglinn commented 2 years ago

I'm trying to shore up the calculation of beta diversity across the various metrics we have proposed. To keep methods consistent across metrics, and to keep practitioners from dwelling on beta diversity in isolation, I was thinking of sunsetting the beta_C function and replacing it with a function calc_S_C, which will calculate richness for a given coverage level or, if no coverage is supplied, find an ideal coverage and then compute S. This function will then be used in the same standard flow that the other metrics go through for calculating beta diversity.

To calculate beta_C, what I'm proposing is that a user would call calc_comm_div(my_comm, index = 'S_C', scales = 'beta'). That function will call calc_div, which will make two calls (gamma and alpha) to calc_S_C, which will ultimately use rarefaction to compute the S_n values at the appropriate coverage.

The default behavior of calc_comm_div will be calc_comm_div(my_comm, index = 'S_C'), which would return S_C at the alpha and gamma scales and beta_S_C as their ratio.
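The call flow described above can be sketched in a few lines of base R. This is only an illustration of the wrapper pattern (one index function applied at the alpha and gamma scales, with beta as gamma / alpha), not mobr's code: sketch_comm_div and richness are hypothetical names, and plain species richness stands in for S_C to keep the example short.

```r
# Illustrative stand-in for the proposed wrapper; NOT mobr's implementation.
# `sketch_comm_div` and `richness` are hypothetical names.
sketch_comm_div <- function(comm, index_fun,
                            scales = c("alpha", "gamma", "beta")) {
  alpha <- apply(comm, 1, index_fun)   # one value per site (row)
  gamma <- index_fun(colSums(comm))    # one value for the pooled sites
  out <- list(alpha = alpha, gamma = gamma, beta = gamma / alpha)
  out[scales]                          # keep only the requested scales
}

# Placeholder index: number of species with non-zero abundance.
richness <- function(x) sum(x > 0)

# Toy site-by-species matrix (rows are sites, columns are species).
comm <- rbind(site1 = c(5, 3, 0, 0),
              site2 = c(0, 2, 4, 1))

res <- sketch_comm_div(comm, richness)
print(res)
```

Passing scales = "beta" returns only the beta element, which mirrors the idea of requesting a single scale from the wrapper.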

The calc_comm_div function will also be able to call the confidence interval functions that I've been working on so it's the wrapper that pulls it all together for all functions. 

My questions for folks (pinging @T-Engel) are: 

1) Does that sound reasonable?
2) Are you OK with the name calc_S_C? I thought it was slightly better than calc_SC or calc_Sc.
3) Does the target coverage function that you developed specifically for beta_C still make sense when thinking about comparing raw S values? My thought is yes, because it's the largest coverage you can get across all samples.

dmcglinn commented 2 years ago

Hey @T-Engel I'm picking this task back up. Any thoughts here?

Also I have some questions about two edge cases for the function invChat.

Specifically, when invChat calculates the sample size for the target coverage, an error will result if there are no singletons in the community. See this line: https://github.com/MoBiodiv/mobr/blob/dev/R/beta_C.R#L98 If f1 == 0 then mm becomes mathematically undefined. The Chao extrapolation method requires singletons; without them you are assumed to be at the asymptote and coverage is 1. So when f1 == 0 I think we should just return the number of individuals in the sample.

From that same line of code you can see that mm will also be undefined if C == 1 (coverage is 1). Any suggestions for a better value of mm in this case? Thanks!
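To make the two edge cases concrete, here is a minimal sketch of the extrapolation step. It assumes the linked line has the usual form for singleton-based coverage extrapolation, mm = (log(n / f1) + log(1 - C)) / log(A) - 1, where A is computed from the singleton and doubleton counts; the linked source is not reproduced here, so treat both the formula and the choice of A as assumptions. sketch_extrap_n is a hypothetical name, and only the f1 == 0 guard proposed above is included; the C == 1 case is deliberately left unguarded to show what happens.

```r
# Sketch of the extrapolation branch only; NOT mobr's invChat.
# `sketch_extrap_n` is a hypothetical name.
# x: abundance vector; C_target: target coverage above the observed coverage.
sketch_extrap_n <- function(x, C_target) {
  x  <- x[x > 0]
  n  <- sum(x)
  f1 <- sum(x == 1)   # number of singletons
  f2 <- sum(x == 2)   # number of doubletons

  # Proposed guard: without singletons the sample is treated as complete,
  # so return the observed number of individuals.
  if (f1 == 0) return(n)

  # Assumed form of A; A == 1 makes log(A) == 0, so mm is not finite
  # in that branch either.
  if (f2 > 0) {
    A <- (n - 1) * f1 / ((n - 1) * f1 + 2 * f2)
  } else if (f1 > 1) {
    A <- (n - 1) * (f1 - 1) / ((n - 1) * (f1 - 1) + 2)
  } else {
    A <- 1
  }

  # log(1 - C_target) is -Inf when C_target == 1, so the result is Inf.
  mm <- (log(n / f1) + log(1 - C_target)) / log(A) - 1
  n + mm
}

x <- c(10, 5, 2, 2, 1, 1)
sketch_extrap_n(x, 0.99)               # finite and larger than sum(x)
sketch_extrap_n(c(10, 5, 2, 2), 0.99)  # no singletons: returns sum(x)
sketch_extrap_n(x, 1)                  # Inf: the open C == 1 case
```

A coverage of exactly 1 cannot be reached at any finite sample size under this formula, so any finite value returned for C == 1 would have to be a convention rather than a derived quantity.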

T-Engel commented 2 years ago

Hey Dan, thanks for pushing this forward. Regarding your first comment from June, in principle, I'm all for making the code more consistent. So go ahead, if you think it's better to have one function that does all the scales. However, note that betac is different from the other flavors of beta in one important aspect.

T-Engel commented 2 years ago

Sorry, accidentally sent too early. What I wanted to say is that for beta_S, beta_S_n and beta_S_PIE, the beta scale is just the ratio of the respective gamma and alpha scales. But not for beta_C. So calc_S_C(scale = 'beta') should not be equal to calc_S_C(scale = 'gamma') / calc_S_C(scale = 'alpha'). Instead, the coverage-based rarefaction only happens at the gamma scale, whereas the alpha scale uses individual-based rarefaction (going straight down from the gamma IBR curve). Does that make sense to you? I also tried to clarify that in the manuscript recently. So from that perspective I find joining the three scales of S_C in a common function a bit misleading. But as long as the underlying calculation is correct, I don't mind.
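The distinction can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not mobr's beta_C: the function names are hypothetical, extrapolation is left out (the target sample size is capped at the smallest site total), coverage at a reduced sample size uses the standard expected-coverage formula for individual-based rarefaction, and the alpha value is taken as the mean across sites.

```r
# Hypothetical helpers; a sketch, NOT mobr's implementation.

# Expected richness in a random subsample of n individuals (IBR).
rarefy_S <- function(x, n) {
  x <- x[x > 0]
  N <- sum(x)
  sum(1 - exp(lchoose(N - x, n) - lchoose(N, n)))
}

# Expected sample coverage of a subsample of m < sum(x) individuals.
coverage_at <- function(x, m) {
  x <- x[x > 0]
  n <- sum(x)
  1 - sum(x / n * exp(lchoose(n - x, m) - lchoose(n - 1, m)))
}

sketch_beta_C <- function(comm, C) {
  pooled <- colSums(comm)
  m_grid <- seq_len(min(rowSums(comm)))  # no extrapolation in this sketch
  cov <- vapply(m_grid, function(m) coverage_at(pooled, m), numeric(1))
  if (max(cov) < C) stop("target coverage not reachable without extrapolation")

  # Coverage-based step, gamma scale only: smallest n reaching coverage C.
  n_C <- m_grid[which(cov >= C)[1]]

  # Both scales are then rarefied to the SAME number of individuals n_C.
  S_gamma <- rarefy_S(pooled, n_C)
  S_alpha <- mean(apply(comm, 1, rarefy_S, n = n_C))
  c(n = n_C, gamma = S_gamma, alpha = S_alpha, beta = S_gamma / S_alpha)
}

comm <- rbind(site1 = c(10, 5, 3, 2, 0, 0),
              site2 = c(0, 0, 4, 6, 8, 2))
res <- sketch_beta_C(comm, C = 0.9)
print(res)
```

Note that coverage is only ever evaluated on the pooled abundances; the sites are never standardized to coverage C themselves, which is why the result differs from a ratio of two independently coverage-standardized richness values.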

sablowes commented 2 years ago

This is super helpful, Thore! The details of this calculation definitely need to come through more clearly in the ms than in the initial draft. The greater conceptual similarity to beta_S_n (visually at least) simplifies things tremendously for me. And I think it also helps clarify what the coverage standardisation at the gamma scale is doing and its importance for comparing two metacommunities. It apparently takes me > 1 paper to understand things completely ;) Thanks!!

dmcglinn commented 6 months ago

I implemented the proposed changes in the dev branch.