biocore / gneiss

compositional data analysis toolbox
https://biocore.github.io/gneiss/
BSD 3-Clause "New" or "Revised" License

Sign change in ilr_transform results using new dependencies #270

Closed thermokarst closed 5 years ago

thermokarst commented 5 years ago

The test suite currently fails:

____________________________________________ TestILRTransform.test_ilr _____________________________________________
E       AssertionError: DataFrame.iloc[:, 1] are different
E
E       DataFrame.iloc[:, 1] values are different (33.33333 %)
E       [left]:  [-5.551115123125783e-17, 0.49012907173427367, 5.551115123125783e-17]
E       [right]: [-5.551115e-17, -0.4901291, 5.551115e-17]

with the following deps:

We are seeing a similar failure in the test suite for q2-gneiss.

Full test log

```bash
pytest gneiss master 2bed3e3
=============================================== test session starts ================================================
platform darwin -- Python 3.5.5, pytest-3.8.1, py-1.7.0, pluggy-0.8.0
rootdir: /Users/matthew/Desktop/gneiss, inifile:
collected 113 items

gneiss/cluster/tests/test_pba.py ..... [ 4%]
gneiss/composition/tests/test_composition.py F [ 5%]
gneiss/composition/tests/test_variance.py .. [ 7%]
gneiss/plot/tests/test_decompose.py ....... [ 13%]
gneiss/plot/tests/test_dendrogram.py ........ [ 20%]
gneiss/plot/tests/test_heatmap.py .... [ 23%]
gneiss/plot/tests/test_radial.py . [ 24%]
gneiss/plot/tests/test_regression_plot.py .. [ 26%]
gneiss/regression/tests/test_mixedlm.py ... [ 29%]
gneiss/regression/tests/test_model.py ..... [ 33%]
gneiss/regression/tests/test_ols.py ............. [ 45%]
gneiss/tests/test_balances.py ............. [ 56%]
gneiss/tests/test_model.py . [ 57%]
gneiss/tests/test_sort.py ................ [ 71%]
gneiss/tests/test_util.py ................................ [100%]

===================================================== FAILURES =====================================================
____________________________________________ TestILRTransform.test_ilr _____________________________________________

self =

    def test_ilr(self):
        np.random.seed(0)
        table = pd.DataFrame([[1, 1, 2, 2],
                              [1, 2, 2, 1],
                              [2, 2, 1, 1]],
                             index=[1, 2, 3],
                             columns=['a', 'b', 'c', 'd'])
        table = table.reindex(columns=np.random.permutation(table.columns))
        ph = pd.Series([1, 2, 3], index=table.index)
        tree = gradient_linkage(table, ph)
        res_balances = ilr_transform(table, tree)
        exp_balances = pd.DataFrame(
            [[0.693147, -5.551115e-17, 2.775558e-17],
             [0.000000, -4.901291e-01, -4.901291e-01],
             [-0.693147, 5.551115e-17, -2.775558e-17]],
            columns=['y0', 'y1', 'y2'],
            index=[1, 2, 3])
>       pdt.assert_frame_equal(res_balances, exp_balances)

gneiss/composition/tests/test_composition.py:35:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../.conda/envs/gneiss-debug/lib/python3.5/site-packages/pandas/util/testing.py:1365: in assert_frame_equal
    obj='DataFrame.iloc[:, {idx}]'.format(idx=i))
../../.conda/envs/gneiss-debug/lib/python3.5/site-packages/pandas/util/testing.py:1244: in assert_series_equal
    obj='{obj}'.format(obj=obj))
pandas/_libs/testing.pyx:59: in pandas._libs.testing.assert_almost_equal
    ???
pandas/_libs/testing.pyx:173: in pandas._libs.testing.assert_almost_equal
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

obj = 'DataFrame.iloc[:, 1]', message = 'DataFrame.iloc[:, 1] values are different (33.33333 %)'
left = '[-5.551115123125783e-17, 0.49012907173427367, 5.551115123125783e-17]'
right = '[-5.551115e-17, -0.4901291, 5.551115e-17]', diff = None

    def raise_assert_detail(obj, message, left, right, diff=None):
        if isinstance(left, np.ndarray):
            left = pprint_thing(left)
        elif is_categorical_dtype(left):
            left = repr(left)

        if PY2 and isinstance(left, string_types):
            # left needs to be printable in native text type in python2
            left = left.encode('utf-8')

        if isinstance(right, np.ndarray):
            right = pprint_thing(right)
        elif is_categorical_dtype(right):
            right = repr(right)

        if PY2 and isinstance(right, string_types):
            # right needs to be printable in native text type in python2
            right = right.encode('utf-8')

        msg = """{obj} are different

    {message}
    [left]:  {left}
    [right]: {right}""".format(obj=obj, message=message, left=left, right=right)

        if diff is not None:
            msg += "\n[diff]: {diff}".format(diff=diff)

>       raise AssertionError(msg)
E       AssertionError: DataFrame.iloc[:, 1] are different
E
E       DataFrame.iloc[:, 1] values are different (33.33333 %)
E       [left]:  [-5.551115123125783e-17, 0.49012907173427367, 5.551115123125783e-17]
E       [right]: [-5.551115e-17, -0.4901291, 5.551115e-17]

../../.conda/envs/gneiss-debug/lib/python3.5/site-packages/pandas/util/testing.py:1035: AssertionError
================================================= warnings summary =================================================
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/.conda/envs/gneiss-debug/lib/python3.5/site-packages/scipy/stats/stats.py:1713: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
/Users/matthew/.conda/envs/gneiss-debug/lib/python3.5/site-packages/scipy/stats/stats.py:1713: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
/Users/matthew/.conda/envs/gneiss-debug/lib/python3.5/site-packages/scipy/stats/stats.py:1713: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
/Users/matthew/.conda/envs/gneiss-debug/lib/python3.5/site-packages/scipy/stats/stats.py:1713: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
/Users/matthew/Desktop/gneiss/gneiss/plot/tests/test_dendrogram.py:36: DeprecationWarning: Please use assertEqual instead.
  self.assertEquals(t.leafcount, 4)
/Users/matthew/Desktop/gneiss/gneiss/plot/tests/test_dendrogram.py:37: DeprecationWarning: Please use assertEqual instead.
  self.assertEquals(t.children[0].leafcount, 2)
/Users/matthew/Desktop/gneiss/gneiss/plot/tests/test_dendrogram.py:38: DeprecationWarning: Please use assertEqual instead.
  self.assertEquals(t.children[1].leafcount, 2)
/Users/matthew/Desktop/gneiss/gneiss/plot/tests/test_dendrogram.py:39: DeprecationWarning: Please use assertEqual instead.
  self.assertEquals(t.children[0].children[0].leafcount, 1)
/Users/matthew/Desktop/gneiss/gneiss/plot/tests/test_dendrogram.py:40: DeprecationWarning: Please use assertEqual instead.
  self.assertEquals(t.children[0].children[1].leafcount, 1)
/Users/matthew/Desktop/gneiss/gneiss/plot/tests/test_dendrogram.py:41: DeprecationWarning: Please use assertEqual instead.
  self.assertEquals(t.children[1].children[0].leafcount, 1)
/Users/matthew/Desktop/gneiss/gneiss/plot/tests/test_dendrogram.py:42: DeprecationWarning: Please use assertEqual instead.
  self.assertEquals(t.children[1].children[1].leafcount, 1)
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/Desktop/gneiss/gneiss/tests/test_util.py:338: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  exp_df = exp_df.reindex_axis(sorted(exp_df.columns), axis=1)
/Users/matthew/Desktop/gneiss/gneiss/tests/test_util.py:339: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  res_df = res_df.reindex_axis(sorted(res_df.columns), axis=1)
/Users/matthew/Desktop/gneiss/gneiss/tests/test_util.py:343: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  exp_md = exp_md.reindex_axis(sorted(exp_md.index), axis=0)
/Users/matthew/Desktop/gneiss/gneiss/tests/test_util.py:344: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  res_md = res_md.reindex_axis(sorted(res_md.index), axis=0)
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/Desktop/gneiss/gneiss/util.py:245: FutureWarning: '.reindex_axis' is deprecated and will be removed in a future version. Use '.reindex' instead.
  _table = _table.reindex_axis(sorted_features, axis=1)
/Users/matthew/Desktop/gneiss/gneiss/util.py:339: UserWarning: Warning. Internal node (y2) has been replaced with (y2)
  "with (%s)" % (n.name, label), UserWarning)
/Users/matthew/Desktop/gneiss/gneiss/util.py:339: UserWarning: Warning. Internal node (y2) has been replaced with (y2)
  "with (%s)" % (n.name, label), UserWarning)
/Users/matthew/Desktop/gneiss/gneiss/util.py:339: UserWarning: Warning. Internal node (r) has been replaced with (r)
  "with (%s)" % (n.name, label), UserWarning)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
================================ 1 failed, 112 passed, 28 warnings in 11.77 seconds ================================
```

mortonjt commented 5 years ago

🤦‍♂️ That's not good. Thank you for raising this. The sign change is likely due to a change of basis. Will take a look tonight.
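To make the change-of-basis explanation concrete, here is a minimal sketch of why swapping the two children of a tree node flips the sign of the corresponding balance. It assumes the standard ILR balance formula (log ratio of geometric means, scaled by sqrt(rs / (r + s))). The `balance` helper is hypothetical and is not part of the gneiss API. The values 2 and 1 are taken from the second row of the test table.

```python
import numpy as np


def balance(numerator, denominator):
    """ILR balance between two groups of strictly positive parts.

    Hypothetical helper for illustration only, not gneiss code.
    """
    num = np.asarray(numerator, dtype=float)
    den = np.asarray(denominator, dtype=float)
    r, s = num.size, den.size
    # Difference of mean logs equals the log ratio of geometric means.
    log_ratio = np.log(num).mean() - np.log(den).mean()
    return np.sqrt(r * s / (r + s)) * log_ratio


# Same two parts, opposite child ordering at the node.
left_first = balance([2.0], [1.0])
right_first = balance([1.0], [2.0])

print(left_first)   # magnitude matches the 0.4901... value in the failing test
print(right_first)  # same magnitude, opposite sign
```

Under this assumption, both results describe the same partition of the parts, so a dependency update that changes child ordering during tree construction would change only the sign, not the magnitude.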

mortonjt commented 5 years ago

Thanks for catching this, @thermokarst. I just opened PR #272 to fix this test. I've simplified the test so that the tree is hard-coded in the test case rather than autogenerated. This guarantees that the expected values won't change in the future due to tree sorting.
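As a rough illustration of the fixed-tree idea (not the actual change in the PR), the sketch below spells out each balance's numerator and denominator columns explicitly, so no clustering or sorting step can reorder them. `FIXED_TREE`, `ilr_with_fixed_tree`, and the chosen orientation of each balance are assumptions made for this example. Only the table values come from the original test.

```python
import numpy as np

# Hypothetical hard-coded tree: balance name -> (numerator cols, denominator cols).
# The orientation is an arbitrary choice for this sketch, not gneiss's convention.
FIXED_TREE = {
    "y0": (["a", "b"], ["c", "d"]),
    "y1": (["a"], ["b"]),
    "y2": (["c"], ["d"]),
}
COLUMNS = ["a", "b", "c", "d"]
TABLE = np.array([[1, 1, 2, 2],
                  [1, 2, 2, 1],
                  [2, 2, 1, 1]], dtype=float)


def ilr_with_fixed_tree(table, columns, tree):
    """Compute one balance per tree node; columns ordered by node name."""
    log_table = np.log(table)
    col = {name: i for i, name in enumerate(columns)}
    balances = []
    for name in sorted(tree):
        num, den = tree[name]
        r, s = len(num), len(den)
        num_mean = log_table[:, [col[c] for c in num]].mean(axis=1)
        den_mean = log_table[:, [col[c] for c in den]].mean(axis=1)
        balances.append(np.sqrt(r * s / (r + s)) * (num_mean - den_mean))
    return np.column_stack(balances)


# Expected values derived by hand from the balance formula.
k = np.sqrt(0.5) * np.log(2)
expected = np.array([[-np.log(2), 0.0, 0.0],
                     [0.0, -k, k],
                     [np.log(2), 0.0, 0.0]])

result = ilr_with_fixed_tree(TABLE, COLUMNS, FIXED_TREE)
np.testing.assert_allclose(result, expected, atol=1e-12)
```

Because the partition is written out by hand, the expected signs depend only on the test itself and not on how a linkage routine happens to order children.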

thermokarst commented 5 years ago

Fixed in #273.