Closed by mondracek 1 month ago
Considering that the tests started failing without any change on our side, it looks like something changed in the GitHub test runner.
The tests fail at the second version of Python to be tested, no matter which version it happens to be.
If I recall, @yakutovicha was saying before that running multiple simulations in a single process can cause failures because the state of the parameters persists between simulations. Maybe the test runner changed so that the different Python versions somehow share resources or something?
Yeah, @yakutovicha, do you think this is related to issue #232?
I did some more testing, and it looks to me like it's actually specific versions of Python that are failing. 3.8, 3.9, and 3.10 seem to be affected. I was also able to replicate the segfault locally.
Not only different versions of Python, but different patch versions. For example, 3.9.18 works, but 3.9.19 does not, so I think this is why it started to fail suddenly.
This also has something to do with the parameter sharing between simulation runs, because the specific test that fails only does so when run after another test, not when run individually. This makes it annoying to debug.
Fixing #232 might fix the underlying problem, so probably the quickest way to deal with this for now would be to simply disable the one test that is failing. This makes the test pass: https://github.com/Probe-Particle/ppafm/actions/runs/9128561527/job/25101162232.
I can make the PR if this is okay?
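For anyone unfamiliar with this kind of failure, here is a minimal sketch of how module-level state can leak from one run into the next within a single process. It is only an illustration: the parameter store, its keys (`grid_size`, `tip_charge`) and the helper functions are hypothetical and are not taken from ppafm.

```python
import copy

# Hypothetical stand-in for a package-wide parameter store (not ppafm's real API).
DEFAULT_PARAMS = {"grid_size": [100, 100, 50], "tip_charge": 0.0}
params = copy.deepcopy(DEFAULT_PARAMS)


def run_simulation(overrides=None):
    """Pretend simulation: apply overrides to the shared store, return a snapshot."""
    if overrides:
        params.update(overrides)
    return copy.deepcopy(params)


def reset_params():
    """Restore the shared store to its defaults, e.g. from a test fixture."""
    params.clear()
    params.update(copy.deepcopy(DEFAULT_PARAMS))


first = run_simulation({"tip_charge": -0.1})
leaked = run_simulation()   # no overrides, yet it still sees tip_charge = -0.1
reset_params()
clean = run_simulation()    # back to the defaults after an explicit reset

print(leaked["tip_charge"], clean["tip_charge"])
```

A test that passes on its own can therefore behave differently when it runs after another test that modified the shared state, which matches the order dependence described above.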
@NikoOinonen, yes, please make the PR.
Reposted from #295:
It looks to me like a bug in GitHub or a tightening of its policy, since all tests died at the 4-minute mark. I did not change any part of the "active code", only the README and a commented-out part. Anyway, the test stopped with the following error for Python 3.7:
Run PPAFM_RECOMPILE=1 pytest tests examples/PTCDA_Hartree_dz2 -v --cov --cov-report json
============================= test session starts ==============================
platform linux -- Python 3.7.17, pytest-7.4.4, pluggy-1.2.0 -- /opt/hostedtoolcache/Python/3.7.17/x64/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/ppafm/ppafm
configfile: pyproject.toml
plugins: cov-4.1.0
collecting ... collected 12 items
tests/test_afmulator.py::test_afmulator_save_load PASSED [ 8%]
tests/test_atomicUtils.py::test_ZsToElems PASSED [ 16%]
tests/test_common.py::test_get_df_weight PASSED [ 25%]
tests/test_common.py::test_get_simple_df_weight PASSED [ 33%]
tests/test_common.py::test_sphere_tangent_space PASSED [ 41%]
tests/test_datagrid.py::test_power PASSED [ 50%]
tests/test_datagrid.py::test_tip_interp PASSED [ 58%]
tests/test_generator.py::test_GeneratorAFMtrainer PASSED [ 66%]
tests/test_io.py::test_xyz PASSED [ 75%]
tests/test_io.py::test_parse_comment_ase PASSED [ 83%]
tests/test_io.py::test_load_aims PASSED [ 91%]
Error: The operation was canceled.
The Python 3.11 run seems to be an interesting one:
Run PPAFM_RECOMPILE=1 pytest tests examples/PTCDA_Hartree_dz2 -v --cov --cov-report json
============================= test session starts ==============================
platform linux -- Python 3.11.9, pytest-8.3.2, pluggy-1.5.0 -- /opt/hostedtoolcache/Python/3.11.9/x64/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/ppafm/ppafm
configfile: pyproject.toml
plugins: cov-5.0.0
collecting ... collected 12 items
tests/test_afmulator.py::test_afmulator_save_load PASSED [ 8%]
tests/test_atomicUtils.py::test_ZsToElems PASSED [ 16%]
tests/test_common.py::test_get_df_weight PASSED [ 25%]
tests/test_common.py::test_get_simple_df_weight PASSED [ 33%]
tests/test_common.py::test_sphere_tangent_space PASSED [ 41%]
tests/test_datagrid.py::test_power PASSED [ 50%]
tests/test_datagrid.py::test_tip_interp PASSED [ 58%]
tests/test_generator.py::test_GeneratorAFMtrainer PASSED [ 66%]
tests/test_io.py::test_xyz PASSED [ 75%]
tests/test_io.py::test_parse_comment_ase PASSED [ 83%]
tests/test_io.py::test_load_aims PASSED [ 91%]
/home/runner/work/_temp/a2a60b1b-2468-44ab-93c7-89e20eb061a1.sh: line 1: 1839 Segmentation fault (core dumped) PPAFM_RECOMPILE=1 pytest tests examples/PTCDA_Hartree_dz2 -v --cov --cov-report json
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree
Error: Process completed with exit code 139.
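As a side note on reading that exit code: a shell reports a process killed by signal N as exit code 128 + N, so 139 corresponds to signal 11, i.e. the segmentation fault printed above. A small sketch for decoding such codes follows; the helper name `describe_exit` is made up for this example.

```python
import signal


def describe_exit(code):
    """Return the signal name for a signal-terminated process, else None."""
    if code < 0:        # Python's subprocess convention: -N means killed by signal N
        signum = -code
    elif code > 128:    # shell convention: 128 + N
        signum = code - 128
    else:
        return None     # ordinary exit status, not a signal
    try:
        return signal.Signals(signum).name
    except ValueError:  # number does not map to a known signal on this platform
        return None


print(describe_exit(139))  # SIGSEGV on Linux
```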
Similarly, Python 3.12 is having problems with the PTCDA_Hartree_dz2 example:
Run PPAFM_RECOMPILE=1 pytest tests examples/PTCDA_Hartree_dz2 -v --cov --cov-report json
============================= test session starts ==============================
platform linux -- Python 3.12.4, pytest-8.3.2, pluggy-1.5.0 -- /opt/hostedtoolcache/Python/3.12.4/x64/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/ppafm/ppafm
configfile: pyproject.toml
plugins: cov-5.0.0
collecting ... collected 12 items
tests/test_afmulator.py::test_afmulator_save_load PASSED [ 8%]
tests/test_atomicUtils.py::test_ZsToElems PASSED [ 16%]
tests/test_common.py::test_get_df_weight PASSED [ 25%]
tests/test_common.py::test_get_simple_df_weight PASSED [ 33%]
tests/test_common.py::test_sphere_tangent_space PASSED [ 41%]
tests/test_datagrid.py::test_power PASSED [ 50%]
tests/test_datagrid.py::test_tip_interp PASSED [ 58%]
tests/test_generator.py::test_GeneratorAFMtrainer PASSED [ 66%]
tests/test_io.py::test_xyz PASSED [ 75%]
tests/test_io.py::test_parse_comment_ase PASSED [ 83%]
tests/test_io.py::test_load_aims PASSED [ 91%]
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_reduce_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_reduce_mako'. (couldnt-parse)
coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_scan_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_scan_mako'. (couldnt-parse)
coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_transpose_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_transpose_mako'. (couldnt-parse)
coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_functions_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_functions_mako'. (couldnt-parse)
coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_kernel_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_kernel_mako'. (couldnt-parse)
coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_vsize_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_vsize_mako'. (couldnt-parse)
coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_fft_fft_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_fft_fft_mako'. (couldnt-parse)
coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_fft_fftshift_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_fft_fftshift_mako'. (couldnt-parse)
coverage._warn(msg, slug="couldnt-parse")
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree PASSED [100%]
=============================== warnings summary ===============================
../../../../../opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/pytools/persistent_dict.py:59
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/pytools/persistent_dict.py:59: UserWarning: Unable to import recommended hash 'siphash24.siphash13', falling back to 'hashlib.sha256'. Run 'python3 -m pip install siphash24' to install the recommended hash.
warn("Unable to import recommended hash 'siphash24.siphash13', "
ppafm/cli/plot_results.py:12
/home/runner/work/ppafm/ppafm/ppafm/cli/plot_results.py:12: MatplotlibDeprecationWarning: Auto-close()ing of figures upon backend switching is deprecated since 3.8 and will be removed in 3.10. To suppress this warning, explicitly call plt.close('all') first.
mpl.use("Agg")
tests/test_afmulator.py::test_afmulator_save_load
tests/test_afmulator.py::test_afmulator_save_load
tests/test_afmulator.py::test_afmulator_save_load
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree
/home/runner/work/ppafm/ppafm/ppafm/fieldFFT.py:232: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
return np.matrix(Lmat)
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree
/home/runner/work/ppafm/ppafm/ppafm/fieldFFT.py:18: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
return np.matrix(lvec[1:])
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/numpy/matrixlib/defmatrix.py:70: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
return matrix(data, dtype=dtype, copy=False)
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
---------- coverage: platform linux, python 3.12.4-final-0 -----------
Coverage JSON written to file coverage.json
================== 12 passed, 9 warnings in 203.48s (0:03:23) ==================
FileRead program: reading Q-0.10K0.50/OutFz.xsf file
XYZ dimensions are 62 202 202
Reading DONE
Error: The operation was canceled.
I was testing this a bit just now, and got the problem to show up consistently on my local machine.
The following seems to cause a segfault every time:
export MPLBACKEND=AGG
export PPAFM_RECOMPILE=1
pytest -v \
tests/test_generator.py \
tests/human_eye/test_TipForce.py \
examples/PTCDA_Hartree_dz2
It is this specific combination that is not working. Removing any one of those tests makes it run without problem.
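A rough sketch of how this kind of search for a crashing combination could be automated, in case it is useful for similar problems later. Each candidate subset is run in a fresh process so that a segfault only kills that run. The function names are made up for this example, and the environment variables are the ones from the command above.

```python
import itertools
import os
import subprocess
import sys


def crashes(paths):
    """Run pytest on `paths` in a fresh process; True if a signal killed it."""
    env = {**os.environ, "MPLBACKEND": "AGG", "PPAFM_RECOMPILE": "1"}
    result = subprocess.run([sys.executable, "-m", "pytest", "-q", *paths], env=env)
    return result.returncode < 0  # negative return code = terminated by a signal


def smallest_crashing_subsets(paths, crashes_fn=crashes):
    """Return the smallest order-preserving subsets of `paths` that crash."""
    for size in range(1, len(paths) + 1):
        hits = [
            list(combo)
            for combo in itertools.combinations(paths, size)
            if crashes_fn(combo)
        ]
        if hits:
            return hits  # stop at the first size that produces any crash
    return []
```

It could be called as `smallest_crashing_subsets(["tests/test_generator.py", "tests/human_eye/test_TipForce.py", "examples/PTCDA_Hartree_dz2"])`. The search is exhaustive, so it is only practical for a handful of candidate paths.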
I think I found the source of this issue. I made a separate issue for the specific problem: #308.
The issue should be fixed now. The CI tests seem to be passing every time now.
I stumbled over this issue when trying to submit PR #278. This is essentially a copy of what I've reported there:
If ci.yml is changed to test only a single version of Python, the tests pass.
In the Run pytest stage, the error message (at lines 31-33) is the following:
/home/runner/work/_temp/6ab19913-178e-4b8f-92bf-dae56f3545bf.sh: line 1: 1858 Segmentation fault (core dumped) PPAFM_RECOMPILE=1 pytest tests examples -v --cov --cov-report json
examples/PTCDA_single/example_ptcda.py::example_ptcda_single
Error: Process completed with exit code 139.
The same happens on the main branch, so it is not an effect of any changes I tried to implement.
I don't have much of an idea how to debug and fix these issues with the GitHub 'workflow' functionality, so I would appreciate any help from you guys who know more about it.