GenericMappingTools / pygmt

A Python interface for the Generic Mapping Tools.
https://www.pygmt.org
BSD 3-Clause "New" or "Revised" License
758 stars 220 forks source link

Add inline code examples in data processing module docstrings #1686

Open willschlitzer opened 2 years ago

willschlitzer commented 2 years ago

As first discussed in this comment, module docstrings should contain inline code examples to show a working example of the module with the correct syntax without needing to find an example page.

To prevent example functions from being run in tests, add _doctest_skip__ = [MODULE NAME] under the import statements.

Are you willing to help implement and maintain this feature? Yes

maxrjones commented 2 years ago

Regarding the CI issue raised in https://github.com/GenericMappingTools/pygmt/issues/1666#issuecomment-993962742, here are two options:

  1. Use the doctest skip directive to skip particular examples all the time.
  2. Add a new target to the Makefile to invoke pytest without doctest, such that the tests without docstring examples can be run on every PR and doctests could be run occasionally.
willschlitzer commented 2 years ago

@GenericMappingTools/pygmt-maintainers Should this only be for data processing modules that can't be as easily demonstrated as a plotting function, or should there be in-line examples for all of our modules?

My vote is for all of them.

maxrjones commented 2 years ago

Should this only be for data processing modules that can't be as easily demonstrated as a plotting function, or should there be in-line examples for all of our modules?

My vote is for all of them.

Sounds good to me.

weiji14 commented 2 years ago

Should this only be for data processing modules that can't be as easily demonstrated as a plotting function, or should there be in-line examples for all of our modules?

My vote is for all of them.

Sounds good to me.

Ok with doing inline examples for all, but maybe prioritize those that don't already have a gallery example first. E.g. don't do grdgradient until much later.

weiji14 commented 2 years ago
  1. Use the doctest skip directive to skip particular examples all the time.

Just to cross-reference from #1790, @seisman has found a smart way to skip doctests now using pytest-doctestplus and @meghanrjones suggested in https://github.com/GenericMappingTools/pygmt/pull/1790#pullrequestreview-902000046 to document the __doctest_skip__ and make fulltest methods in our contributing guides.

2. Add a new target to the Makefile to [invoke pytest without doctest](https://docs.pytest.org/en/6.2.x/usage.html#disabling-plugins), such that the tests without docstring examples can be run on every PR and doctests could be run occasionally.

Just on this, do we want to run make fulltest (that includes all of these inline example doctests) on a semi-regular basis? E.g. every week or every month?

maxrjones commented 2 years ago

Just on this, do we want to run make fulltest (that includes all of these inline example doctests) on a semi-regular basis? E.g. every week or every month?

I think periodic full tests (e.g., monthly) would be a good idea. A slash command for that workflow could also be helpful for reviewing PRs.

seisman commented 2 years ago

I've merged PR #1790 into the main branch. What we still need to do are:

willschlitzer commented 2 years ago

Looking through the examples in the GMT documentation and PyGMT tests, a few of them call remote datasets without using a PyGMT function (e.g. @Table_5_11_mean.xyz for surface). Should the example just call the remote dataset as is (with the "@" in front calling the file), or should load_sample_data() be updated to include the remote datasets for the in-line example?

maxrjones commented 2 years ago

Looking through the examples in the GMT documentation and PyGMT tests, a few of them call remote datasets without using a PyGMT function (e.g. @Table_5_11_mean.xyz for surface). Should the example just call the remote dataset as is (with the "@" in front calling the file), or should load_sample_data() be updated to include the remote datasets for the in-line example?

For tests, IMO it doesn't matter much. For user-facing examples I think it would be better to update load_sample_data.

willschlitzer commented 2 years ago

Are there any gridded datasets with holes in them to use for an example for grdfill? I can use the code from one of the tests, but it seems counterproductive to demo removing data from a grd file only to add it back in with grdfill.

maxrjones commented 2 years ago

Are there any gridded datasets with holes in them to use for an example for grdfill? I can use the code from one of the tests, but it seems counterproductive to demo removing data from a grd file only to add it back in with grdfill.

Looking through https://oceania.generic-mapping-tools.org/cache/, data_w_nans.nc seems like one option. Below is a picture from the gmt tests that uses that file.

image
willschlitzer commented 2 years ago

Are there any gridded datasets with holes in them to use for an example for grdfill? I can use the code from one of the tests, but it seems counterproductive to demo removing data from a grd file only to add it back in with grdfill.

Looking through https://oceania.generic-mapping-tools.org/cache/, data_w_nans.nc seems like one option. Below is a picture from the gmt tests that uses that file.

image

Thanks @meghanrjones!

maxrjones commented 2 years ago

Are there any gridded datasets with holes in them to use for an example for grdfill? I can use the code from one of the tests, but it seems counterproductive to demo removing data from a grd file only to add it back in with grdfill.

There's now a better file to use - earth_relief_20m_holes.grd, see https://github.com/GenericMappingTools/gmt/pull/6678 and https://github.com/GenericMappingTools/gmtserver-admin/pull/155 for background.