biocore / gneiss

compositional data analysis toolbox
https://biocore.github.io/gneiss/
BSD 3-Clause "New" or "Revised" License

BUG: match appears to be broken with biom tables #290

Closed mortonjt closed 3 years ago

mortonjt commented 3 years ago

Not sure exactly what is going on here, but it looks like the sample IDs don't align when `match` is given a biom table.

You can see this with the intermittent hypoxia dataset (https://qiita.ucsd.edu/study/description/10422). Below is an example that reproduces the behavior (the file paths need to be changed).

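import biom
import pandas as pd
from gneiss.util import match  # assuming `match` here is gneiss.util.match

# Load the sample metadata and the biom table (adjust the file paths).
t_md = pd.read_table('test_metadata.txt', index_col=0)
t_biom = biom.load_table('test.biom')
t_biom, t_md = match(t_biom, t_md)  # should return both with aligned sample IDs
print(t_biom.ids())
print(t_md.index)

Output:
['10422.16.F.3', '10422.17.F.12', '10422.28.F.1', '10422.31.F.9', ...
['10422.25.F.3', '10422.23.F.4', '10422.30.F.2', '10422.12.F.14', ...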

You can see that the IDs are not matched: the first sample IDs from the table differ from the first entries of the metadata index.
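To make the expected behavior concrete, here is a minimal sketch of what "matched" would mean for the two ID sequences in the report above. It is not gneiss code: the helper names `ids_aligned` and `shared_ids_in_table_order` are hypothetical, and the metadata IDs below are made-up reorderings of the table IDs printed above, used only for illustration.

```python
def ids_aligned(table_ids, metadata_ids):
    """Return True when both sequences hold the same IDs in the same order."""
    return list(table_ids) == list(metadata_ids)


def shared_ids_in_table_order(table_ids, metadata_ids):
    """Return the IDs present in both inputs, keeping the order of table_ids."""
    metadata_set = set(metadata_ids)
    return [sample_id for sample_id in table_ids if sample_id in metadata_set]


# Hypothetical IDs: the table IDs come from the report, the metadata order is invented.
table_ids = ['10422.16.F.3', '10422.17.F.12', '10422.28.F.1', '10422.31.F.9']
metadata_ids = ['10422.28.F.1', '10422.16.F.3', '10422.31.F.9', '10422.17.F.12']

# Same IDs, different order: this is the kind of misalignment described in the report.
print(ids_aligned(table_ids, metadata_ids))

# Reordering to the shared IDs in table order restores the alignment.
common = shared_ids_in_table_order(table_ids, metadata_ids)
print(ids_aligned(common, table_ids))
```

In a real session one would pass `t_biom.ids()` and `t_md.index` to `ids_aligned` as a quick check after calling `match`. If every ID in `common` is present in the metadata index, `t_md.loc[common]` would reorder the metadata to follow the table.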