GenericMappingTools / pygmt

A Python interface for the Generic Mapping Tools.
https://www.pygmt.org
BSD 3-Clause "New" or "Revised" License

How to resolve flaky tests resulting from using a single GMT session #1242

Open weiji14 opened 3 years ago

weiji14 commented 3 years ago

Description of the problem

There have been instances of flaky tests in PyGMT's test suite, as reported in https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-825561365. They likely stem from the fact that PyGMT uses a single GMT session (initiated during import pygmt) instead of a separate GMT session for each figure (see https://github.com/GenericMappingTools/pygmt/pull/327#issuecomment-541782890).

@meghanrjones asked in https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-827230998 whether we should stick with a single GMT session or use independent sessions per figure:

I understand the original logic behind a single GMT session for all tests in https://github.com/GenericMappingTools/pygmt/pull/327#issuecomment-541782890. Still, I don't expect that users will be attempting to use the entire PyGMT library in a single session, which is the goal of the test suite. So I think it would be worth revisiting this decision. Could it be possible to periodically test the examples/tutorials against baseline images to ensure that producing multiple plots in a single session is consistent and have the unit tests each use individual sessions?
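To make the "individual sessions per unit test" idea concrete, here is a minimal sketch. It does not use PyGMT's real API: FakeSession and isolated_session are hypothetical stand-ins for a GMT session and for whatever begin/end mechanism would create one per test.

```python
from contextlib import contextmanager


class FakeSession:
    """Hypothetical stand-in for a GMT session (not PyGMT's API)."""

    def __init__(self):
        self.state = {}      # settings that would otherwise leak between tests
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def isolated_session():
    """Create a fresh session for one test and always tear it down."""
    session = FakeSession()
    try:
        yield session
    finally:
        session.close()


def unit_test_a():
    # Modifies session state; the change dies with this session.
    with isolated_session() as session:
        session.state["cpt"] = "viridis"
        return dict(session.state)


def unit_test_b():
    # Starts from a clean session regardless of what ran before.
    with isolated_session() as session:
        return dict(session.state)


result_a = unit_test_a()
result_b = unit_test_b()
```

In a pytest suite the same pattern could be wrapped in a fixture, while the examples/tutorials would keep running in one shared session to check multi-plot consistency.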

Full code that generated the error

Flaky tests are hard to reproduce (that is practically their definition), but in PyGMT's case they show up, for example, when a test that passes under pytest pygmt/tests/test_somemodule.py fails when run using make test, or vice versa.

E.g. as reported by @meghanrjones in https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-825753479

edit: I have not yet been able to figure out a solution. The two makecpt tests fail if there is a docstring example that imports pygmt and instantiates a figure (e.g., extract_region() in pygmt/clib/session.py and pygmt/src/grdfilter.py) and is tested before pygmt/tests/test_makecpt.py.
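The order dependence described above can be illustrated with a small sketch. Everything here is hypothetical: SharedSession stands in for the single process-wide GMT session, and the two functions only mimic a docstring example and a makecpt-style test. It is not a reproduction of the actual PyGMT failure.

```python
class SharedSession:
    """Hypothetical stand-in for one process-wide session (not PyGMT's API)."""

    def __init__(self):
        self.current_cpt = None  # state that persists across calls


SESSION = SharedSession()


def docstring_example():
    """Mimics an earlier doctest that leaves state behind in the session."""
    SESSION.current_cpt = "leftover"


def check_makecpt_like():
    """Mimics a test that assumes it starts from a clean session."""
    return SESSION.current_cpt is None


def run(order):
    """Run the callables in order against one fresh shared session."""
    global SESSION
    SESSION = SharedSession()
    return {func.__name__: func() for func in order}


# Running the test alone versus after the docstring example.
alone = run([check_makecpt_like])
after_doctest = run([docstring_example, check_makecpt_like])

print(alone["check_makecpt_like"])          # True
print(after_doctest["check_makecpt_like"])  # False
```

The same check passes or fails depending only on what ran before it, which is why running a single test file and running the whole suite can disagree.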

Related issues affected by having a single GMT session:

System information

Please paste the output of python -c "import pygmt; pygmt.show_versions()":

PyGMT information:
  version: v0.3.2.dev117+g7466dc31
System information:
  python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37)  [GCC 9.3.0]
  executable: ~/username/miniconda3/envs/pygmt/bin/python
  machine: Linux-5.4.0-72-generic-x86_64-with-debian-bullseye-sid
Dependency information:
  numpy: 1.17.1
  pandas: 1.2.3
  xarray: 0.17.0
  netCDF4: 1.5.6
  packaging: 20.9
  ghostscript: 9.53.3
  gmt: 6.2.0rc1
GMT library information:
  binary dir: ~/username/miniconda3/envs/pygmt/bin
  cores: 6
  grid layout: rows
  library path: ~/username/miniconda3/envs/pygmt/lib/libgmt.so
  padding: 2
  plugin dir: ~/username/miniconda3/envs/pygmt/lib/gmt/plugins
  share dir: ~/username/miniconda3/envs/pygmt/share/gmt
  version: 6.2.0rc1

weiji14 commented 3 years ago

Ok, the flakiness appears to have been an upstream GMT issue that was fixed in https://github.com/GenericMappingTools/gmt/pull/3344. There are some tests that are wrong but currently passing (i.e. false positives) identified in https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-827847551 and https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-827847551 that need to be updated once we bump to GMT 6.2.0rc2.

maxrjones commented 3 years ago

The past few flaky tests revealing GMT bugs have convinced me of the usefulness of the current structure, even though it would be nice to have the option to run tests in parallel.

maxrjones commented 2 years ago

We seem to be semi-regularly getting failures on windows-latest - Python 3.7 / NumPy 1.18 with:

..\tests\test_sph2grd.py::test_sph2grd_outgrid FAILED [ 87%]
..\tests\test_sph2grd.py::test_sph2grd_no_outgrid FAILED [ 87%]

due to issues with the remote file.

weiji14 commented 2 years ago

We seem to be semi-regularly getting failures on windows-latest - Python 3.7 / NumPy 1.18 with:

..\tests\test_sph2grd.py::test_sph2grd_outgrid FAILED [ 87%]
..\tests\test_sph2grd.py::test_sph2grd_no_outgrid FAILED [ 87%]

due to issues with the remote file.

Yes, this has been popping up recently, but I don't think it is related to flakiness from a single GMT session, since the error is "Error: [ERROR]: Libcurl Error: Timeout was reached". Maybe open a separate issue for this.

seisman commented 1 week ago

https://forum.generic-mapping-tools.org/t/memory-temporary-storage-issues/5256 This post is a good example showing that using a single GMT session sometimes causes issues.