Closed tleonardi closed 5 years ago
The txComp test randomly fails with python 3.5 due to a p-value discrepancy:
============================= test session starts ============================== platform linux -- Python 3.5.6, pytest-4.4.0, py-1.7.0, pluggy-0.9.0 rootdir: /home/travis/build/tleonardi/nanocompore collected 45 items tests/test_Integration.py .sssssssssssssssssss. [ 46%] tests/test_SampCompDB.py .... [ 55%] tests/test_TxComp.py ...........F.... [ 91%] tests/test_Whitelist.py .... [100%] =================================== FAILURES =================================== ____________________________ test_txComp_GMM_anova _____________________________ test_ref_pos_list = ([{'data': {'KD': {'KD1': {'coverage': 100, 'dwell': array([121.70757477, 129.70100881, 111.67411923, 131.1979314 , ...328765287e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 8.482777798353687e-40, nan, 7.06503867181238e-40, ...]}) def test_txComp_GMM_anova(test_ref_pos_list): ml = mock.Mock() if sys.version_info < (3, 6): tol = 0.0002 else: tol=0.00000001 res = txCompare(test_ref_pos_list[0], methods=['GMM'], logit=False, sequence_context=2, min_coverage=3, logger=ml, allow_warnings=False, random_state=np.random.RandomState(seed=42)) GMM_pvalues = [pos['txComp']['GMM_anova_pvalue'] for pos in res ] > assert GMM_pvalues == [pytest.approx(i, abs=tol, nan_ok=True) for i in test_ref_pos_list[1]['GMM_anova']] E assert [0.0008574768...76768562, ...] == [0.00085747684... 2.0e-04, ...] E At index 3 diff: 0.0017335646468135102 != 0.0010906844025473576 ± 2.0e-04 E Use -v to get the full diff tests/test_TxComp.py:104: AssertionError ============== 1 failed, 25 passed, 19 skipped in 187.78 seconds =============== The command "pytest" exited with 1.
After inspecting the results of the GMM fitting, it looks like when the test fails there's a small cluster counts discrepancy:
root@3f23411890db:/nanocompore/tests# pytest test_TxComp.py -vs =================================================================================================================================== test session starts ==================================================================================================================================== platform linux -- Python 3.5.7, pytest-4.4.0, py-1.8.0, pluggy-0.9.0 -- /usr/local/bin/python cachedir: .pytest_cache rootdir: /nanocompore collected 16 items test_TxComp.py::test_combine_pvalues_hou[pvalues0] PASSED test_TxComp.py::test_combine_pvalues_hou[pvalues1] PASSED test_TxComp.py::test_combine_pvalues_hou[pvalues2] PASSED test_TxComp.py::test_combine_pvalues_hou[pvalues3] PASSED test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues0] PASSED test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues1] PASSED test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues2] PASSED test_TxComp.py::test_nonparametric_test[v10-v20-expected0] PASSED test_TxComp.py::test_sum_of_squares[x0-1243] PASSED test_TxComp.py::test_sum_of_squares[2-4] PASSED test_TxComp.py::test_sum_of_squares[x2-201] PASSED test_TxComp.py::test_txComp_GMM_anova {'GMM_anova_model': {'delta_logit': -5.154130973499999, 'table': F_onewayResult(statistic=915.3557790180453, pvalue=0.0010906844025473576), 'pvalue': 0.0010906844025473576, 'log_ratios': array([ 2.77258872, -2.60796674, -2.46385324, 2.46385324]) }, 'shift_stats': OrderedDict([('c1_mean_intensity', 101.25371018948026), ('c2_mean_intensity', 120.1690563433108), ('c1_median_intensity', 101.27532194257512), ('c2_median_intensity', 120.70768400991767), ('c1_sd_intensity', 9.976570066976405), ('c2_sd_intensity', 9.138834592954336) , ('c1_mean_dwell', 99.53689876631836), ('c2_mean_dwell', 121.33193242927064), ('c1_median_dwell', 99.33218082115548), ('c2_median_dwell', 121.12071127421946), ('c1_sd_dwell', 10.25957982066471), ('c2_sd_dwell', 9.296922575493252)]), 'GMM_model': {'cluster_counts': 'KD1:95/5__WT1:6/9 4__WT2:7/93__KD2:93/7', 'model': GaussianMixture(covariance_type='full', init_params='kmeans', max_iter=1000, means_init=None, n_components=2, n_init=1, precisions_init=None, random_state=<mtrand.RandomState object at 0x7f47adf82438>, reg_covar=1e-06, tol=0.001, verbose=0, verbose_interval=10, warm_start=False, weights_init=None)}, 'GMM_anova_pvalue_context_2': 4.706324386384477e-14, 'GMM_anova_pvalue': 0.0010906844025473576} PASSED test_TxComp.py::test_txComp_GMM_logit [1.2742453287653416e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 6.01195032712085e-40, nan, 7.06503867181238e-40, 1.8392720921153275e-40, 9.162002495725215e-32, 5.92288489163853e-34, 3.1972432623453856e-40] [1.274245328765287e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 8.482777798353687e-40, nan, 7.06503867181238e-40, 1.839272092115274e-40, 9.162002495725215e-32, 5.922884891638699e-34, 3.1972432623454785e-40] ['KD1:94/6__WT1:7/93__WT2:9/91__KD2:95/5', 'KD1:92/8__WT1:12/88__WT2:7/93__KD2:93/7', 'KD1:82/18__WT1:5/95__WT2:10/90__KD2:85/15', 'KD1:95/5__WT1:6/94__WT2:7/93__KD2:93/7', 'NC', 'KD1:9/91__WT1:96/4__WT2:92/8__KD2:8/92', 'KD1:8/92__WT1:95/5__WT2:90/10__KD2:8/92', 'KD1:15/85__WT1:96/4 __WT2:98/2__KD2:14/86', 'KD1:3/97__WT1:86/14__WT2:84/16__KD2:5/95', 'KD1:7/93__WT1:94/6__WT2:93/7__KD2:7/93'] PASSED test_TxComp.py::test_txComp_GMM_anova_0_var PASSED test_TxComp.py::test_txComp_GMM_dup_lab PASSED test_TxComp.py::test_txComp_lowCov PASSED ================================================================================================================================ 16 passed in 1.13 seconds ================================================================================================================================$
root@3f23411890db:/nanocompore/tests# pytest test_TxComp.py -vs =================================================================================================================================== test session starts ==================================================================================================================================== platform linux -- Python 3.5.7, pytest-4.4.0, py-1.8.0, pluggy-0.9.0 -- /usr/local/bin/python cachedir: .pytest_cache rootdir: /nanocompore collected 16 items test_TxComp.py::test_combine_pvalues_hou[pvalues0] PASSED test_TxComp.py::test_combine_pvalues_hou[pvalues1] PASSED test_TxComp.py::test_combine_pvalues_hou[pvalues2] PASSED test_TxComp.py::test_combine_pvalues_hou[pvalues3] PASSED test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues0] PASSED test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues1] PASSED test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues2] PASSED test_TxComp.py::test_nonparametric_test[v10-v20-expected0] PASSED test_TxComp.py::test_sum_of_squares[x0-1243] PASSED test_TxComp.py::test_sum_of_squares[2-4] PASSED test_TxComp.py::test_sum_of_squares[x2-201] PASSED test_TxComp.py::test_txComp_GMM_anova {'GMM_anova_pvalue': 0.0017335646468135102, 'GMM_anova_model': {'log_ratios': array([ 2.77258872, -2.46385324, 2.46385324, -2.77258872]), 'table': F_onewayResult(statistic=575.3465305297385, pvalue=0.0017335646468135102), 'pvalue': 0.00173356464 68135102, 'delta_logit': -5.236441963}, 'GMM_model': {'cluster_counts': 'KD1:95/5__WT2:7/93__KD2:93/7__WT1:5/95', 'model': GaussianMixture(covariance_type='full', init_params='kmeans', max_iter=1000, means_init=None, n_components=2, n_init=1, precisions_init=None, random_state=<mtrand.RandomState object at 0x7fd59152e708>, reg_covar=1e-06, tol=0.001, verbose=0, verbose_interval=10, warm_start=False, weights_init=None)}, 'GMM_anova_pvalue_context_2': 1.2760955364845727e-13, 'shift_stats': OrderedDict([('c1_mean_intensity', 101.25371018948026), ('c2_mean_intensity', 120.16905634331079), ('c1_median_intensity', 101.27532194257512), ('c2_median_intensity', 120.70768400991767), ('c1_sd_intensity', 9.976570066976405), ('c2_sd_intensity', 9.138834592954336), ('c1_mean_dwell', 99.53689876631836), ('c2_mean_dwell', 121.33193242927064), ('c1_median_dwell', 99.33218082115548), ('c2_median_dwell', 121.12071127421946), ('c1_sd_dwell', 10.259579 82066471), ('c2_sd_dwell', 9.296922575493252)])} FAILED test_TxComp.py::test_txComp_GMM_logit [1.2742453287653416e-39, 3.396865393821225e-40, 1.9321679678623975e-36, 8.482777798354296e-40, nan, 7.0650386718125795e-40, 1.8392720921153275e-40, 4.6826664356268694e-32, 5.922884891638699e-34, 3.197243262345706e-40] [1.274245328765287e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 8.482777798353687e-40, nan, 7.06503867181238e-40, 1.839272092115274e-40, 9.162002495725215e-32, 5.922884891638699e-34, 3.1972432623454785e-40] ['KD1:94/6__WT2:9/91__KD2:95/5__WT1:7/93', 'KD1:92/8__WT2:7/93__KD2:93/7__WT1:12/88', 'KD1:82/18__WT2:10/90__KD2:85/15__WT1:5/95', 'KD1:95/5__WT2:7/93__KD2:93/7__WT1:5/95', 'NC', 'KD1:9/91__WT2:92/8__KD2:8/92__WT1:96/4', 'KD1:8/92__WT2:90/10__KD2:8/92__WT1:95/5', 'KD1:14/86__WT2:98/2 __KD2:14/86__WT1:96/4', 'KD1:3/97__WT2:84/16__KD2:5/95__WT1:86/14', 'KD1:7/93__WT2:93/7__KD2:7/93__WT1:94/6'] PASSED test_TxComp.py::test_txComp_GMM_anova_0_var PASSED test_TxComp.py::test_txComp_GMM_dup_lab PASSED test_TxComp.py::test_txComp_lowCov PASSED ========================================================================================================================================= FAILURES ========================================================================================================================================= __________________________________________________________________________________________________________________________________ test_txComp_GMM_anova ___________________________________________________________________________________________________________________________________ test_ref_pos_list = ([{'data': {'KD': {'KD1': {'coverage': 100, 'dwell': array([121.70757477, 129.70100881, 111.67411923, 131.1979314 , ...328765287e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 8.482777798353687e-40, nan, 7.06503867181238e-40, ...]}) def test_txComp_GMM_anova(test_ref_pos_list): ml = mock.Mock() if sys.version_info < (3, 6): tol = 0.0002 else: tol=0.00000001 res = txCompare(test_ref_pos_list[0], methods=['GMM'], logit=False, sequence_context=2, min_coverage=3, logger=ml, allow_warnings=False, random_state=np.random.RandomState(seed=42)) GMM_pvalues = [pos['txComp']['GMM_anova_pvalue'] for pos in res ] print(res[3]['txComp']) > assert GMM_pvalues == [pytest.approx(i, abs=tol, nan_ok=True) for i in test_ref_pos_list[1]['GMM_anova']] E AssertionError: assert [0.0008574768...76768562, ...] == [0.00085747684... 2.0e-04, ...] E At index 3 diff: 0.0017335646468135102 != 0.0010906844025473576 ± 2.0e-04 E Full diff: E - [0.0008574768473501677, E + [0.0008574768473501677 ± 2.0e-04, E ? ++++++++++ E - 0.0036329291397528157, E + 0.0036329291397528157 ± 2.0e-04,... E E ...Full output truncated (24 lines hidden), use '-vv' to show test_TxComp.py:105: AssertionError =========================================================================================================================== 1 failed, 15 passed in 1.34 seconds ============================================================================================================================
As a temporary workaround I'm lowering the tol for python 3.5
At the moment it would be advisable to use Python3.6+ instead
The txComp test randomly fails with python 3.5 due to a p-value discrepancy:
After inspecting the results of the GMM fitting, it looks like when the test fails there's a small cluster counts discrepancy:
Passing
Failing