Probe-Particle / ppafm

Classical force field model for simulating atomic force microscopy images.
MIT License

Continuous-integration tests keep failing, blocking any new pull request #279

Open mondracek opened 3 months ago

mondracek commented 3 months ago

I stumbled over this issue when trying to submit PR #278. This is essentially a copy of what I've reported there:

I don't have much of an idea how to debug and fix these issues with the GitHub 'workflow' functionality, so I would appreciate any help from you guys who know more about it.

NikoOinonen commented 3 months ago

Considering that the tests started failing without any change on our side, it looks like something changed in the GitHub test runner.

The tests fail on the second Python version that gets tested, no matter which version that happens to be.

If I recall correctly, @yakutovicha was saying before that running multiple simulations in a single process can cause failures because the state of the parameters persists between simulations. Maybe the test runner changed so that the different Python versions somehow share resources or something?
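
A minimal sketch of the suspected mechanism, not ppafm code: `PARAMS`, `simulate`, and `simulate_isolated` are hypothetical stand-ins for module-level parameters that a simulation mutates. Within one process the second call sees the state left by the first, while a fresh interpreter started through `subprocess` always begins from the initial values.

```python
import subprocess
import sys

# Hypothetical stand-in for module-level simulation parameters (not ppafm's real ones).
PARAMS = {"runs": 0}


def simulate():
    """Mutate the shared parameters and return the number of runs seen so far."""
    PARAMS["runs"] += 1
    return PARAMS["runs"]


# The same toy simulation as source text, to be executed in a fresh interpreter.
CHILD_SOURCE = """
PARAMS = {"runs": 0}
PARAMS["runs"] += 1
print(PARAMS["runs"])
"""


def simulate_isolated():
    """Run the toy simulation in a new Python process so no state carries over."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

Calling `simulate()` twice returns a different value the second time, whereas repeated calls to `simulate_isolated()` return the same value, which is the kind of difference that would make a test depend on what ran before it.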

mondracek commented 3 months ago

Yeah, @yakutovicha, do you think this is related to issue #232?

NikoOinonen commented 3 months ago

I did some more testing, and it looks to me like it's actually specific versions of Python that are failing. 3.8, 3.9, and 3.10 seem to be affected. I was also able to replicate the segfault locally.

NikoOinonen commented 3 months ago

It is not only different minor versions of Python that behave differently, but also different patch versions. For example, 3.9.18 works, but 3.9.19 does not, so I think this is why it started to fail suddenly.

This also has something to do with the parameter sharing between simulation runs, because the specific test that fails does so only when it is run after another test, not when it is run on its own. This makes it annoying to debug.
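
One way to narrow down such an order-dependent failure is to run the suspect test after each other test in turn, each pair in its own pytest process. The helper below is only a sketch: `ordering_commands` is a hypothetical name, and it assumes pytest runs node IDs in the order they are given on the command line, which is its usual behaviour. It only builds the command lists; each one can then be passed to `subprocess.run`.

```python
import sys


def ordering_commands(suspect, others):
    """Build one pytest command per candidate predecessor.

    Each command runs a single other test first and the suspect test second,
    so a crash points at the predecessor that leaves bad state behind.
    """
    return [
        [sys.executable, "-m", "pytest", "-v", other, suspect]
        for other in others
        if other != suspect  # pairing the suspect with itself tells us nothing
    ]
```

Passing the node IDs printed by `pytest -v` as `others` gives one command per pair; a pair that crashes while the suspect passes on its own identifies the test that sets up the problematic state.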

NikoOinonen commented 3 months ago

Fixing #232 might fix the underlying problem, so the quickest way to deal with this for now is probably to simply disable the one test that is failing. With that test disabled, the run passes: https://github.com/Probe-Particle/ppafm/actions/runs/9128561527/job/25101162232.

I can make the PR if this is okay?

mondracek commented 3 months ago

@NikoOinonen, yes, please make the PR.

ondrejkrejci commented 1 week ago

Reposted from #295:

It looks to me like a bug in GitHub or a lowered limit in its policy, since all tests died at the 4-minute mark. I did not change any part of the "active code", only the README and a commented-out part. Anyway, the test stopped with the following error for Python 3.7:

Run PPAFM_RECOMPILE=1 pytest tests examples/PTCDA_Hartree_dz2 -v --cov --cov-report json
============================= test session starts ==============================
platform linux -- Python 3.7.17, pytest-7.4.4, pluggy-1.2.0 -- /opt/hostedtoolcache/Python/3.7.17/x64/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/ppafm/ppafm
configfile: pyproject.toml
plugins: cov-4.1.0
collecting ... collected 12 items

tests/test_afmulator.py::test_afmulator_save_load PASSED                 [  8%]
tests/test_atomicUtils.py::test_ZsToElems PASSED                         [ 16%]
tests/test_common.py::test_get_df_weight PASSED                          [ 25%]
tests/test_common.py::test_get_simple_df_weight PASSED                   [ 33%]
tests/test_common.py::test_sphere_tangent_space PASSED                   [ 41%]
tests/test_datagrid.py::test_power PASSED                                [ 50%]
tests/test_datagrid.py::test_tip_interp PASSED                           [ 58%]
tests/test_generator.py::test_GeneratorAFMtrainer PASSED                 [ 66%]
tests/test_io.py::test_xyz PASSED                                        [ 75%]
tests/test_io.py::test_parse_comment_ase PASSED                          [ 83%]
tests/test_io.py::test_load_aims PASSED                                  [ 91%]
Error: The operation was canceled.

The Python 3.11 run seems to be an interesting one:

Run PPAFM_RECOMPILE=1 pytest tests examples/PTCDA_Hartree_dz2 -v --cov --cov-report json
============================= test session starts ==============================
platform linux -- Python 3.11.9, pytest-8.3.2, pluggy-1.5.0 -- /opt/hostedtoolcache/Python/3.11.9/x64/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/ppafm/ppafm
configfile: pyproject.toml
plugins: cov-5.0.0
collecting ... collected 12 items

tests/test_afmulator.py::test_afmulator_save_load PASSED                 [  8%]
tests/test_atomicUtils.py::test_ZsToElems PASSED                         [ 16%]
tests/test_common.py::test_get_df_weight PASSED                          [ 25%]
tests/test_common.py::test_get_simple_df_weight PASSED                   [ 33%]
tests/test_common.py::test_sphere_tangent_space PASSED                   [ 41%]
tests/test_datagrid.py::test_power PASSED                                [ 50%]
tests/test_datagrid.py::test_tip_interp PASSED                           [ 58%]
tests/test_generator.py::test_GeneratorAFMtrainer PASSED                 [ 66%]
tests/test_io.py::test_xyz PASSED                                        [ 75%]
tests/test_io.py::test_parse_comment_ase PASSED                          [ 83%]
tests/test_io.py::test_load_aims PASSED                                  [ 91%]
/home/runner/work/_temp/a2a60b1b-2468-44ab-93c7-89e20eb061a1.sh: line 1:  1839 Segmentation fault      (core dumped) PPAFM_RECOMPILE=1 pytest tests examples/PTCDA_Hartree_dz2 -v --cov --cov-report json
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree 
Error: Process completed with exit code 139.
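
For reading the exit code above: shells conventionally report a process killed by a signal as 128 plus the signal number, so 139 corresponds to signal 11, which is SIGSEGV on Linux. The small helper below is an illustrative sketch (`describe_exit_code` is a hypothetical name) that decodes such a status with the standard `signal` module.

```python
import signal


def describe_exit_code(code):
    """Return the signal name for a shell exit status above 128, else the code as text."""
    if code > 128:
        try:
            # Shell convention: exit status = 128 + number of the terminating signal.
            return signal.Signals(code - 128).name
        except ValueError:
            pass  # no known signal with that number on this platform
    return str(code)
```

With this, `describe_exit_code(139)` names the same segmentation fault that the log line above reports.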

Similarly, Python 3.12 is having problems with the PTCDA_Hartree_dz2 example:

Run PPAFM_RECOMPILE=1 pytest tests examples/PTCDA_Hartree_dz2 -v --cov --cov-report json
============================= test session starts ==============================
platform linux -- Python 3.12.4, pytest-8.3.2, pluggy-1.5.0 -- /opt/hostedtoolcache/Python/3.12.4/x64/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/ppafm/ppafm
configfile: pyproject.toml
plugins: cov-5.0.0
collecting ... collected 12 items

tests/test_afmulator.py::test_afmulator_save_load PASSED                 [  8%]
tests/test_atomicUtils.py::test_ZsToElems PASSED                         [ 16%]
tests/test_common.py::test_get_df_weight PASSED                          [ 25%]
tests/test_common.py::test_get_simple_df_weight PASSED                   [ 33%]
tests/test_common.py::test_sphere_tangent_space PASSED                   [ 41%]
tests/test_datagrid.py::test_power PASSED                                [ 50%]
tests/test_datagrid.py::test_tip_interp PASSED                           [ 58%]
tests/test_generator.py::test_GeneratorAFMtrainer PASSED                 [ 66%]
tests/test_io.py::test_xyz PASSED                                        [ 75%]
tests/test_io.py::test_parse_comment_ase PASSED                          [ 83%]
tests/test_io.py::test_load_aims PASSED                                  [ 91%]
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_reduce_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_reduce_mako'. (couldnt-parse)
  coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_scan_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_scan_mako'. (couldnt-parse)
  coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_transpose_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_algorithms_transpose_mako'. (couldnt-parse)
  coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_functions_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_functions_mako'. (couldnt-parse)
  coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_kernel_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_kernel_mako'. (couldnt-parse)
  coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_vsize_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_cluda_vsize_mako'. (couldnt-parse)
  coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_fft_fft_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_fft_fft_mako'. (couldnt-parse)
  coverage._warn(msg, slug="couldnt-parse")
/opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/coverage/report_core.py:115: CoverageWarning: Couldn't parse '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_fft_fftshift_mako': No source for code: '/home/runner/work/ppafm/ppafm/_opt_hostedtoolcache_Python_3_12_4_x64_lib_python3_12_site_packages_reikna_fft_fftshift_mako'. (couldnt-parse)
  coverage._warn(msg, slug="couldnt-parse")
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree PASSED [100%]

=============================== warnings summary ===============================
../../../../../opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/pytools/persistent_dict.py:59
  /opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/pytools/persistent_dict.py:59: UserWarning: Unable to import recommended hash 'siphash24.siphash13', falling back to 'hashlib.sha256'. Run 'python3 -m pip install siphash24' to install the recommended hash.
    warn("Unable to import recommended hash 'siphash24.siphash13', "

ppafm/cli/plot_results.py:12
  /home/runner/work/ppafm/ppafm/ppafm/cli/plot_results.py:12: MatplotlibDeprecationWarning: Auto-close()ing of figures upon backend switching is deprecated since 3.8 and will be removed in 3.10.  To suppress this warning, explicitly call plt.close('all') first.
    mpl.use("Agg")

tests/test_afmulator.py::test_afmulator_save_load
tests/test_afmulator.py::test_afmulator_save_load
tests/test_afmulator.py::test_afmulator_save_load
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree
examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree
  /home/runner/work/ppafm/ppafm/ppafm/fieldFFT.py:232: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
    return np.matrix(Lmat)

examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree
  /home/runner/work/ppafm/ppafm/ppafm/fieldFFT.py:18: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
    return np.matrix(lvec[1:])

examples/PTCDA_Hartree_dz2/example_ptcda_hartree.py::example_ptcda_hartree
  /opt/hostedtoolcache/Python/3.12.4/x64/lib/python3.12/site-packages/numpy/matrixlib/defmatrix.py:70: PendingDeprecationWarning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.
    return matrix(data, dtype=dtype, copy=False)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

---------- coverage: platform linux, python 3.12.4-final-0 -----------
Coverage JSON written to file coverage.json

================== 12 passed, 9 warnings in 203.48s (0:03:23) ==================
FileRead program: reading Q-0.10K0.50/OutFz.xsf file
XYZ dimensions are 62 202 202
Reading DONE
Error: The operation was canceled.