microbiome / mia

Microbiome analysis
https://microbiome.github.io/mia/
Artistic License 2.0

perSampleDominantFeatures works only with taxonomyRanks #415

Open antagomir opened 1 year ago

antagomir commented 1 year ago

The perSampleDominantFeatures and addPerSampleDominantFeatures functions work only with values from taxonomyRanks().

This works:

library(mia)
data(GlobalPatterns, package="mia")
tse <- GlobalPatterns
rowData(tse)$group <- as.character(sample.int(5, nrow(tse), replace = TRUE))
perSampleDominantFeatures(tse, rank="Class")

This gives an error:

perSampleDominantFeatures(tse, rank="group")

Error: 'rank' must be a value from 'taxonomyRanks()'

We recently implemented the mergeFeatures functions so that they also accept rowData fields other than taxonomy ranks. Could we enable the same here?
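To make the request concrete, here is a minimal base R sketch of the intended behavior on a toy matrix. It does not use or reflect mia internals; `dominant_by_group` is a hypothetical helper name, and the toy counts and grouping are made up for illustration. It sums counts over an arbitrary grouping vector (such as a custom rowData column) and reports, for each sample, every group that reaches the maximum.

```r
# Hypothetical helper (not part of mia): dominant group per sample for an
# arbitrary row-level grouping, e.g. a custom rowData column.
dominant_by_group <- function(counts, group) {
  # Sum feature counts within each group; result is groups x samples
  agg <- rowsum(counts, group = group)
  # For each sample, keep every group that reaches the column maximum,
  # so ties yield more than one name
  dom <- lapply(seq_len(ncol(agg)), function(j) {
    rownames(agg)[agg[, j] == max(agg[, j])]
  })
  names(dom) <- colnames(agg)
  dom
}

# Toy data: 4 features, 2 samples, 2 groups
counts <- matrix(
  c(5, 0,
    1, 2,
    0, 7,
    3, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("f", 1:4), c("s1", "s2"))
)
group <- c("a", "b", "b", "a")

dominant_by_group(counts, group)
```

With the example above, something like `dominant_by_group(assay(tse, "counts"), rowData(tse)$group)` should give the equivalent result, assuming the assay is named "counts".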

antagomir commented 1 year ago

A related issue for addPerSampleDominantFeatures is that the new field it adds to colData carries the sample names. The sample names are unnecessary (the context already makes clear which sample each value belongs to) and may disturb downstream analyses.

However, this can be justified when a sample has multiple features that are all dominant. In that case the result is a list, where each (sample) element may contain multiple dominant features.

Consider adding an argument that picks a single dominant feature at random in such cases. Not necessarily a good idea, but worth discussing.
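As a starting point for that discussion, a rough base R sketch of the random tie-breaking is below. `pick_one_dominant` is a hypothetical name, not an existing mia function or argument, and the input list is made up to mimic a per-sample result with one tie.

```r
# Hypothetical helper (not part of mia): reduce a per-sample list of
# dominant features to exactly one feature per sample.
pick_one_dominant <- function(dominant) {
  vapply(dominant, function(x) {
    # Index with sample.int so a single candidate is returned unchanged
    # and ties are broken uniformly at random
    x[sample.int(length(x), 1)]
  }, character(1))
}

# Toy input: sample s1 has two tied dominant features, s2 has one
dominant <- list(s1 = c("Bacilli", "Clostridia"), s2 = "Bacteroidia")

set.seed(1)  # only to make the random pick reproducible
pick_one_dominant(dominant)
```

The result is a plain named character vector, which would fit into colData as an ordinary column. The downside is that repeated runs can give different answers unless a seed is set.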

antagomir commented 1 month ago

@Daenarys8 could you check if this is ready?