Open mikemc opened 4 years ago
It looks like the reason that only the first case results in dummy sample names is that phyloseq checks whether the row names equal as.character(1:n), and if so decides that sample names are missing and sets the names to "sa1", "sa2", etc.
This also pops up when subsetting, e.g.
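library(phyloseq)
library(magrittr)  # provides %>%
sam <- tibble::tibble(
  sample_id = c(letters[1:3], 1:3),
  var = c(rep("a", 3), rep("b", 3))
) %>%
  tibble::column_to_rownames("sample_id") %>%  # sample ids become the row names shown in the output
  sample_data()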
sam
#> var
#> a a
#> b a
#> c a
#> 1 b
#> 2 b
#> 3 b
sam %>% subset_samples(var == "b")
#> var
#> sa1 b
#> sa2 b
#> sa3 b
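To make the suspected mechanism concrete, here is a minimal base R sketch of that kind of check. The helper dummy_names_if_default() is hypothetical and only mimics the behavior described above; it is not phyloseq's code, and the real check may differ in detail.

```r
# Hypothetical helper: treats row names equal to "1", "2", ..., "n" as
# "missing" and replaces them with "sa1", "sa2", ..., "san".
dummy_names_if_default <- function(df) {
  n <- nrow(df)
  if (identical(rownames(df), as.character(seq_len(n)))) {
    rownames(df) <- paste0("sa", seq_len(n))
  }
  df
}

# Plain data frame with the same sample names as in the example
sam_df <- data.frame(
  var = c("a", "a", "a", "b", "b", "b"),
  row.names = c("a", "b", "c", "1", "2", "3")
)

# All six names together do not look like defaults, so they are kept
full_names <- rownames(dummy_names_if_default(sam_df))
full_names
#> [1] "a" "b" "c" "1" "2" "3"

# After subsetting, the remaining names are exactly "1", "2", "3",
# which the check cannot tell apart from default row names
sub_df <- sam_df[sam_df$var == "b", , drop = FALSE]
sub_names <- rownames(dummy_names_if_default(sub_df))
sub_names
#> [1] "sa1" "sa2" "sa3"
```

If this is indeed what happens, any subset whose surviving sample names happen to be "1" through "n" in order would be renamed, while other numeric names (say "2", "3") would be left alone.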
There seems to be inconsistent handling of sample names by various phyloseq methods.
In addition, some phyloseq functions cause numerical sample names to be prepended with an "X", as would be done by make.names(). This happens in the results of diversity().
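For reference, this is what base R's make.names() does to names like these. It is only an illustration of the "X" prefix; I have not confirmed which call inside phyloseq is responsible.

```r
# make.names() turns strings into syntactically valid R names;
# names starting with a digit get an "X" prepended.
ids <- c("a", "b", "c", "1", "2", "3")
fixed <- make.names(ids)
fixed
#> [1] "a"  "b"  "c"  "X1" "X2" "X3"

# The same conversion happens implicitly when a data frame is built
# with check.names = TRUE (the default for data.frame()).
m <- matrix(1:6, nrow = 1, dimnames = list(NULL, ids))
checked <- names(data.frame(m))
unchecked <- names(data.frame(m, check.names = FALSE))
checked
#> [1] "a"  "b"  "c"  "X1" "X2" "X3"
unchecked
#> [1] "a" "b" "c" "1" "2" "3"
```

So one plausible route is a data frame being constructed with sample names as column names somewhere along the way, with check.names left at its default.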