Closed antagomir closed 3 years ago
I am not sure about this, since I don't fully grasp the concept. Could you implement it and at the same time add a section to MiaBook?
The overall motivation is that there has been some recent interest in microbiome research in carrying out specific analyses on rare taxa, as these are often overlooked by standard analyses. So, the concept is to focus on a particular subspace of the microbial community.
Yes, we should be able to implement this as time allows.
Is this solved?
No. But it is not urgent or very critical either. There are some interesting rarity indices (log_modulo_skewness) that could be migrated from the microbiome package and that could complement the alpha diversity indices. But this is not necessary for the Bioc submission. Is there another way to list non-urgent development ideas, rather than through issues?
I added this to the Bioc 3.14 milestone. Maybe this can help plan the implementation.
@antagomir Would this be solved by the linked PR?
Log modulo skewness is one kind of diversity measure, with a focus on rarity; the rarity function is solved by adding log_modulo_skewness as an option in estimateDiversity in #102.
The need for rare and rare_members is not solved by that.
Hi,
about this rare and rare_members thing.
Do we already have an equivalent of microbiome::rare_members? I think getRareTaxa does the job. It returns the taxa whose abundance is under a specific threshold. It is the complement of getPrevalentTaxa/microbiome::core_members.
What we don't have are equivalents of microbiome::rare and microbiome::core, i.e. functions that return a subset. So, I think those could also be created in getPrevalence.R. I think they could be done as follows (subsetRareTaxa, subsetPrevalentTaxa?):
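x <- agglomerateByRank(x)  # agglomerate to a taxonomic rank first
a <- getRareTaxa(x)        # names of the rare taxa
x[a, ]                     # keep only the rows for those taxa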
So I looked up how rare is implemented in microbiome.
The subsetRareTaxa name is a bit unclear for my taste. Are the taxonomic values being subset, or is the taxonomic information used for subsetting?
subsetToRareTaxa and subsetToPrevalentTaxa might be a bit clearer. I would also put them in getPrevalence.R.
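To make the proposal concrete, here is a minimal base R sketch of the "identify rare taxa, then subset" idea on a plain count matrix. It does not use mia or microbiome. The function names rare_taxa_sketch and subset_to_rare_sketch, the toy data, and the default thresholds are all hypothetical, and rarity is assumed here to mean "not prevalent", i.e. the complement of the prevalent set described above.

```r
# Toy count matrix: taxa in rows, samples in columns (made-up numbers).
counts <- matrix(
  c(120, 0, 3,
     80, 1, 0,
     95, 0, 0,
    110, 2, 4),
  nrow = 3,
  dimnames = list(c("TaxonA", "TaxonB", "TaxonC"), paste0("S", 1:4))
)

# Hypothetical helper: names of taxa that are NOT prevalent.
# A taxon is "detected" in a sample if its relative abundance exceeds
# `detection`; its prevalence is the fraction of samples where it is detected.
# Taxa with prevalence below `prevalence` are treated as rare (an assumption).
rare_taxa_sketch <- function(counts, detection = 0.01, prevalence = 0.75) {
  rel <- sweep(counts, 2, colSums(counts), "/")  # per-sample relative abundance
  prev <- rowMeans(rel > detection)              # prevalence of each taxon
  rownames(counts)[prev < prevalence]
}

# Hypothetical helper: subset the data to the rare taxa only.
subset_to_rare_sketch <- function(counts, ...) {
  counts[rare_taxa_sketch(counts, ...), , drop = FALSE]
}

rare_names <- rare_taxa_sketch(counts)
rare_subset <- subset_to_rare_sketch(counts)
```

With this toy data, TaxonA is detected in every sample and is therefore excluded, while the other two taxa are kept. A version for mia would presumably operate on the object's assay and reuse the existing prevalence arguments rather than these made-up defaults.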
I am ok with these.
The microbiome pkg has a set of functions to quantify "rarity" and to subset the data to rare groups. The higher the rarity, the higher the diversity. These could therefore be added to estimateDiversity, but conceptually they focus on rarity and are typically not included among standard diversity indices; in this sense they may deserve their own function.
The set of functions in the microbiome pkg is as follows:
Function to identify rare taxa (complement to core taxa):
rare_members.R
Function to take subset of the phyloseq object that only includes rare taxa (complement to microbiome::core):
rare.R
Function to calculate rarity index (could be estimateRarity in mia?):
rarity.R
The helper functions for specific indices:
log_modulo_skewness.R
low_abundance.R
rare_abundance.R
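As a rough illustration of what a per-sample rarity index could look like, the base R sketch below sums the relative abundance of a given set of rare taxa in each sample. This is a simplified, hypothetical index written for this thread. It is not the implementation in rare_abundance.R or any other microbiome function, and the toy data and the name rare_abundance_sketch are assumptions.

```r
# Toy count matrix: taxa in rows, samples in columns (made-up numbers).
counts <- matrix(
  c(120, 0, 3,
     80, 1, 0,
     95, 0, 0,
    110, 2, 4),
  nrow = 3,
  dimnames = list(c("TaxonA", "TaxonB", "TaxonC"), paste0("S", 1:4))
)

# Hypothetical index: share of each sample made up by the rare taxa.
# `rare` is a character vector of taxon names assumed to be identified already.
rare_abundance_sketch <- function(counts, rare) {
  rel <- sweep(counts, 2, colSums(counts), "/")  # per-sample relative abundance
  colSums(rel[rare, , drop = FALSE])             # one value per sample
}

rarity <- rare_abundance_sketch(counts, rare = c("TaxonB", "TaxonC"))
```

The result is a named numeric vector with one value per sample, which is the same shape as the output of a typical alpha diversity index, so a function like the suggested estimateRarity could store it alongside the other per-sample estimates.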